Windsurf: The Enterprise AI IDE

Chapters
0:00 Introductions & Catchup
3:52 Why they created Windsurf
5:52 Limitations of VS Code
10:12 Evaluation methods for Cascade and Windsurf
16:15 Listener questions about Windsurf launch
20:30 Remote execution and security concerns
25:18 Evolution of Codeium's strategy
28:29 Cascade and its capabilities
33:12 Multi-agent systems
37:02 Areas of improvement for Windsurf
39:12 Building an enterprise-first company
42:01 Copilot for X, AI UX, and Enterprise AI blog posts
00:00:07.480 |
- This is Alessio, partner and CTO at Decibel Partners. 00:00:16.000 |
I think the first podcast in the new Codeium office. 00:00:18.560 |
So thanks for having us and welcome Varun and Anshul. 00:00:27.960 |
- The story is that the office was previously, 00:00:34.200 |
And I think a lot of the people at the company previously, 00:00:39.960 |
And actually one thing if you notice about the company 00:00:42.160 |
is it's actually like a two minute walk from the Caltrain. 00:00:44.520 |
And I think we didn't want to move the office 00:00:50.240 |
piss off a lot of the people that lived in San Francisco, 00:00:54.820 |
So we were like scouting a lot of spaces in the nearby area. 00:01:04.000 |
And then immediately after that, Ghost Autonomy. 00:01:09.560 |
I guess one of the things that the landlord told us was 00:01:11.400 |
this was the place that they shot all the scenes 00:01:17.400 |
Trust me, that wasn't like the main reason why we did it. 00:01:39.720 |
Like what's been the progress over the last year or so 00:01:45.380 |
So I think the biggest things that have happened 00:01:47.040 |
are Codeium's extensions have continued to gain 00:01:52.280 |
You know, we have over 800,000 sort of developers 00:01:56.240 |
Lots of large enterprises also use the product. 00:02:01.720 |
which is usually not something a company gets, 00:02:03.880 |
you know, within a year of deploying an enterprise product. 00:02:06.440 |
And then large companies like Dell and stuff use the product. 00:02:18.560 |
one of the things that we've always thought about is 00:02:23.920 |
The reason why we started out with the extension system 00:02:25.800 |
was we felt that there were lots of developers 00:02:57.360 |
all the dependent systems on this workflow software. 00:02:59.600 |
It's much harder than even switching off of a database. 00:03:04.440 |
we could be better partners to our customers, 00:03:06.080 |
regardless of where they started their source code. 00:03:08.000 |
And then more specifically on the IDE category, 00:03:11.560 |
don't just write TypeScript and Python, right? 00:03:20.200 |
Very honestly, JetBrains has the best debugger for Java. 00:03:24.860 |
These are extremely complex pieces of software. 00:03:36.520 |
you know, we were running into the limitations 00:03:41.600 |
And I think we felt that there was an opportunity 00:03:44.620 |
for us to build a premier sort of experience. 00:03:47.440 |
And that was within the reach of the team, right? 00:03:51.660 |
to build the best possible experience, right? 00:04:07.200 |
We were like, hey, if we launch this agentic product 00:04:13.320 |
it's just gonna limit the value of the product 00:04:14.920 |
and we're just not gonna be able to do the best tool. 00:04:16.840 |
That's why we were super excited to launch Windsurf. 00:04:19.360 |
I do think it is the most powerful IDE system 00:04:25.440 |
I think we suspect that there's much, much more we can do, 00:04:28.720 |
more than just the auto-complete sort of side, right? 00:04:31.900 |
probably auto-complete was the only piece of functionality 00:04:37.560 |
These systems can now reason about large code bases 00:04:42.800 |
do you say like @NewYorkTimesPost, blah, blah, blah, 00:04:49.640 |
We want it to actually go out and execute code. 00:04:51.160 |
We think code execution is a really, really important piece. 00:04:55.800 |
you not only just kind of come up with an idea, 00:04:59.960 |
is software is originally this amorphous blob. 00:05:13.960 |
the AI just creates the mountain for you, right? 00:05:16.080 |
And that's why we don't believe in this sort of modality 00:05:24.200 |
And I'll let Anshul talk about that a little bit. 00:05:32.960 |
but I think more in the process of actually evolving code 00:05:48.200 |
You're killing ideas and creating ideas constantly. 00:05:50.880 |
And we think Windsurf is the right paradigm for that. 00:05:53.040 |
- Can you spell out what you couldn't do in VS Code? 00:05:57.000 |
Because I think when we did the Cursor episode, 00:06:05.560 |
Like, can you maybe just explain more of those limitations? 00:06:18.480 |
okay, what are the pieces that we actually need to give 00:06:20.640 |
the AI to get to that kind of emergent behavior 00:06:29.320 |
that we've been building for the enterprise all this time. 00:06:33.600 |
You know, we were talking about all the different tools 00:06:36.000 |
so they can go like do that kind of terminal execution, 00:06:42.480 |
where you're not out there writing out a PRD. 00:06:46.680 |
Is that if we're actually being able to understand 00:06:49.240 |
of what developers are doing within the editor, right? 00:06:55.440 |
this part of the directory and tried to view it, 00:06:58.280 |
And they tried to do like some kind of commands 00:07:00.760 |
And if we actually understand that trajectory, 00:07:02.800 |
then our ability for the AI to just immediately be like, 00:07:07.720 |
without you having to spell it all out for it. 00:07:12.360 |
I think that was kind of like that intuition. 00:07:19.840 |
what we actually need to be able to hook into 00:07:22.680 |
And I think it was that combination of those two 00:07:26.840 |
The editor was not necessarily, like, a new idea. 00:07:32.720 |
we just pulled it all together in the last couple of months, 00:07:35.080 |
but it was always something in the back of the mind. 00:07:38.320 |
okay, the models are not capable of doing this. 00:07:42.240 |
Like we have a really good context awareness system. 00:07:50.760 |
but it's like how you brought it all together. 00:07:52.840 |
It's like VS Code is kind of like a sandbox, so to speak. 00:07:55.240 |
- Yeah, let me maybe like even just to go one step deeper 00:07:58.240 |
on each of the aspects that Anshul talked about. 00:08:04.880 |
that I think is like very exciting about the product, right? 00:08:09.720 |
I think you can do it quickly and very powerfully. 00:08:14.440 |
wasn't actually being able to implement the feature. 00:08:17.320 |
Problem was actually even to show the feature, 00:08:19.000 |
VS Code would not expose an API for us to do this. 00:08:30.360 |
to actually go out and implement this, right? 00:08:32.360 |
And that wasn't because we were bad engineers. 00:08:34.200 |
No, our good engineering time was being spent 00:08:36.080 |
fighting against the system rather than building a good system. 00:08:41.760 |
The VS Code API would constantly keep breaking on us. 00:08:44.160 |
We'd constantly need to show a worse and worse experience. 00:08:50.080 |
we can come up with great work and great research. 00:08:53.840 |
the research on Cascade is not like a couple month thing. 00:09:01.640 |
Even the evals for this are a lot of effort, right? 00:09:04.400 |
A lot of actually systems work to actually go out and do it. 00:09:09.360 |
And I think, let's even go for Cascade, for example, 00:09:14.560 |
because that's the first time you brought it up? 00:09:17.600 |
that is the actual agentic part of the product, right? 00:09:23.480 |
from both these human trajectories and these AI trajectories, 00:09:27.880 |
to actually propose changes and actually execute code 00:09:30.800 |
to finally get you the final work output, right? 00:09:43.840 |
And we think that this is like a fundamental building block 00:09:45.960 |
for us to make the product materially better. 00:09:47.840 |
If people are not even willing to use the building block, 00:09:55.440 |
Interestingly, JetBrains is a much more configurable paradigm 00:10:01.960 |
on both the sort of directions that Anshul said, 00:10:05.840 |
"Hey, if we actually remove these limitations, 00:10:10.160 |
And we believe that this was a necessary step for us. 00:10:13.200 |
- I'm curious more about the evals side of it, 00:10:21.120 |
that is so multi-step and spanning so much context? 00:10:28.680 |
and this is like one of the beautiful things about code 00:10:31.840 |
We could go take a bunch of open source code, 00:10:35.880 |
And we can actually see if some of these commits 00:10:41.680 |
and the approach of stripping the commits is good 00:10:48.120 |
the goal is not the commit has already been written for you, 00:10:51.120 |
that where the entire thing has not been written. 00:10:52.920 |
And can we go out and actually retrieve the right snippets 00:11:07.960 |
And you can see on every single one of these axes 00:11:10.560 |
And if you do this across enough repositories, 00:11:16.880 |
versus make it not work into a continuous problem. 00:11:19.120 |
And now that's a hill you can actually climb, 00:11:21.080 |
and that's a way that you can actually apply research 00:11:23.080 |
where it's like, "Hey, my retrieval got way better. 00:11:27.040 |
And then notice how the way the eval works is 00:11:34.080 |
I'm more interested in the code is in an incomplete state, 00:11:36.920 |
and the commit message isn't even given to you 00:11:38.840 |
because that's another thing about developers. 00:11:43.080 |
That's the actual important piece of this problem. 00:11:47.360 |
completely pose the problem statement, right? 00:11:50.720 |
Because the problem statement lives in their head. 00:11:52.600 |
Conversations that you and I have had at the coffee area, 00:12:08.440 |
to actually finally propose sort of a solution there, 00:12:10.920 |
which is why we want to test the incomplete code. 00:12:13.120 |
What happens if the state is in an incomplete state 00:12:15.280 |
and am I actually able to make this pass without the commit? 00:12:23.120 |
where you want to guess both the high-level intent 00:12:28.120 |
And you can imagine if you build up all of these, 00:12:30.040 |
now you can see, hey, my systems are getting better. 00:12:35.920 |
And I guess that's one thing that we, honestly, 00:12:38.400 |
to be honest, we could have done a little faster. 00:12:41.720 |
and build these zero-to-one apps very quickly. 00:12:43.200 |
And I think people are using Windsurf to actually do that. 00:12:50.360 |
It's actually that you take a large code base, 00:13:00.440 |
that we are getting better on this dimension. 00:13:02.160 |
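Editor's sketch: the commit-based eval described in this answer can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not Codeium's actual harness: `TaskResult`, `pass_rate`, and `aggregate` are hypothetical names, and the two axes shown (retrieving the right snippets, passing the commit's unit tests) are just the ones mentioned in the conversation. The point it tries to show is how averaging per-task outcomes over many commits turns a pass/fail signal into a continuous score that can be tracked over time.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskResult:
    """Outcome of one eval task built from a commit (all fields are illustrative)."""
    repo: str
    retrieval_hit: bool  # did the system retrieve the snippets the real commit touched?
    tests_passed: int    # unit tests from the commit that pass after the AI's change
    tests_total: int     # unit tests that came with the commit


def pass_rate(result: TaskResult) -> float:
    """Fraction of the commit's unit tests that pass; 0.0 if there are none."""
    if result.tests_total == 0:
        return 0.0
    return result.tests_passed / result.tests_total


def aggregate(results: list[TaskResult]) -> dict[str, float]:
    """Average each axis over all tasks to get continuous scores in [0, 1]."""
    if not results:
        raise ValueError("need at least one task result")
    return {
        "retrieval_hit_rate": mean(1.0 if r.retrieval_hit else 0.0 for r in results),
        "test_pass_rate": mean(pass_rate(r) for r in results),
    }


# Hypothetical results from two repositories.
results = [
    TaskResult("repo-a", retrieval_hit=True, tests_passed=3, tests_total=4),
    TaskResult("repo-b", retrieval_hit=False, tests_passed=1, tests_total=1),
]
scores = aggregate(results)
print(scores)
```

With more repositories and more commits per repository, each score moves in small increments, which is what makes it usable as a hill to climb.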
- We mentioned the end-to-end evals that we have 00:13:04.240 |
for this system, which I think are super cool. 00:13:08.280 |
The ideas of, just take retrieval, for example, right? 00:13:11.240 |
Like, how can we make the eval for retrieval really good? 00:13:15.840 |
that's been true about us as a company is like, 00:13:17.560 |
most evals and benchmarks that exist out there 00:13:22.800 |
There's not really a better way of putting it. 00:13:26.840 |
No actual professional work looks like SWE-bench. 00:13:31.400 |
These things are just a little kind of broken. 00:13:33.080 |
So when you're trying to optimize against a metric 00:13:36.320 |
you end up making kind of suboptimal decisions. 00:13:38.120 |
So something that we're always very keen on is like, 00:13:40.120 |
okay, what is the actual metric that we want to test 00:13:45.000 |
A lot of the benchmarks for these embedding-based systems 00:13:49.280 |
Like, I want to find this one particular piece 00:13:51.880 |
of information out of all this potential context. 00:13:57.560 |
because code is a super distributed knowledge store. 00:14:13.120 |
And are you capturing all of the necessary pieces for that? 00:14:24.240 |
because those are semantically similar things 00:14:27.880 |
if you actually try to map out a code graph, right? 00:14:29.960 |
And so we can actually build these kind of golden sets. 00:14:32.000 |
We can do this evaluation even for sub-problems 00:14:40.400 |
that we're trying to build too is really, really strong 00:14:42.400 |
so that we have confidence in what we're pushing out. 00:14:52.280 |
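Editor's sketch: the contrast drawn here, between finding one particular item and capturing all of the necessary pieces, can be illustrated with two small metrics. This is a hedged illustration, not the team's eval code: `needle_hit` and `recall_at_k` are hypothetical names, and the file names and golden set are made up.

```python
def needle_hit(ranked: list[str], needle: str, k: int) -> bool:
    """Needle-in-a-haystack style check: is one specific item in the top k?"""
    return needle in ranked[:k]


def recall_at_k(ranked: list[str], golden: set[str], k: int) -> float:
    """Fraction of the golden set that appears in the top k results."""
    if not golden:
        raise ValueError("golden set must not be empty")
    return len(set(ranked[:k]) & golden) / len(golden)


# Hypothetical retrieval output and golden set for one change.
ranked = ["auth.py", "utils.py", "README.md", "session.py"]
golden = {"auth.py", "session.py", "tokens.py"}

print(needle_hit(ranked, "auth.py", k=3))
print(recall_at_k(ranked, golden, k=3))
```

A retriever can score perfectly on the first metric while missing most of what a change actually needs, which is why a golden set of all relevant pieces gives a more useful signal for code.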
Actually, I would prefer if there were benchmarks 00:14:54.080 |
versus, let's say, everything was just vibes, right? 00:14:56.640 |
But vibes are also very important, by the way, 00:14:58.560 |
because they showcase where the benchmark is not valuable 00:15:02.760 |
where critical issues exist in the benchmark. 00:15:08.600 |
It's like, make sure to run pytest every time X happens. 00:15:12.520 |
You can start prompting it in every single possible way. 00:15:15.760 |
And if you remove that, suddenly it doesn't get good at it. 00:15:19.920 |
What really matters here is across a broad set of tasks, 00:15:23.000 |
you're performing high-quality suggestions for people, 00:15:27.800 |
And I think, actually, the way these things work 00:15:33.920 |
But once it starts hitting the peak of these benchmarks, 00:15:36.200 |
getting that last 10% actually probably is counterintuitive 00:15:39.640 |
to the actual goal of what the benchmark was. 00:15:41.760 |
Like, you probably should find a new hill to climb 00:15:43.880 |
rather than sort of p-hacking or really optimizing 00:15:50.920 |
about their recent SWE-agent, SWE-bench results. 00:15:54.320 |
And we talked about HumanEval versus SWE-bench. 00:15:56.960 |
Or HumanEval is kind of like a greenfield benchmark. 00:16:02.200 |
But it sounds like, I mean, your eval creation 00:16:04.960 |
is similar to SWE-bench as far as using GitHub commits 00:16:09.240 |
But then it's more like masking at the commit level 00:16:18.560 |
And obviously, I also want to give you the chance 00:16:25.440 |
- Hey, let me tell you something very, very interesting. 00:16:28.200 |
I love Hacker News as much as the next person. 00:16:35.200 |
the first comment was, "This product is a virus." 00:16:38.560 |
- This was the original Codeium launch like two years ago. 00:16:41.360 |
- Like, "I am analyzing the binary as we speak. 00:16:47.720 |
And I was like, "Dude, like, it's not a virus." 00:16:51.480 |
- We just want to give autocomplete suggestions. 00:17:10.040 |
to them, Cascade already felt pretty agentic. 00:17:12.720 |
Like, is that something you want to do more of? 00:17:14.960 |
You know, obviously, since you just launched an IDE, 00:17:20.480 |
But maybe this is kind of like the Trojan horse 00:17:22.600 |
to just do more full-on, end-to-end, like, code creation. 00:17:27.640 |
- Yeah, I think it's, like, how do you get there 00:17:32.960 |
We have, obviously, enterprise asking us all the time, 00:17:35.040 |
like, "Oh, when's it going to be end-to-end work?" 00:17:39.720 |
can see your entire actions and get a lot of intent 00:17:41.920 |
that you can't actually get if you're not in the IDE, 00:17:44.320 |
if the agent there has to always get human involvement 00:17:49.000 |
to keep on fixing itself, it's probably not ready 00:17:51.600 |
to become a full end-to-end automated system, 00:17:53.520 |
'cause then we're just going to turn into a linter 00:18:11.080 |
and we'll learn what those tasks are pretty quickly, 00:18:22.440 |
The answer is this product should become fully agentic, 00:18:38.840 |
that have yet to be solved that we need to go out 00:18:42.200 |
Like, for instance, I think one of the most annoying parts 00:18:44.120 |
about the product is the fact that you need to accept 00:18:51.240 |
Unfortunately, me going out and running arbitrary binaries 00:19:03.120 |
I think this is solvable with, like, complex systems. 00:19:06.440 |
I think we love working on complex systems infrastructure. 00:19:10.400 |
Now, the simpler way to go about solving this is, 00:19:12.600 |
don't run it on the user's machine and run it somewhere else 00:19:17.440 |
Now, I think, though, maybe there's a little bit 00:19:19.840 |
of trade-off of, like, running it locally versus remotely, 00:19:21.760 |
and I think we might change our mind on this, 00:19:23.600 |
but I think the goal for this is not for this 00:19:26.440 |
I think the goal for this is, A, it's actually able to do 00:19:28.960 |
very complex tasks with limited human interaction, 00:19:30.960 |
but it needs to know when to actually go back to the human. 00:19:37.800 |
Right now, actually, I even feel like the product 00:19:47.560 |
So there is, like, systems work and probably modeling work 00:19:50.120 |
that needs to happen there to make the product even faster 00:19:52.520 |
on both the retrieval side and the generation side, right? 00:19:55.160 |
And then finally speaking, I think another key piece here 00:19:57.600 |
that's, like, really important is I actually think 00:20:01.840 |
is probably going to be more of an anti-pattern 00:20:08.040 |
So almost imagine, as the user is using the product, 00:20:10.880 |
that we're gonna suggest the remainder of the PR 00:20:13.040 |
without the user kind of, like, even asking us for it. 00:20:16.280 |
I think this is sort of the beginning for it, 00:20:21.480 |
I think this is, like, a big step up from what we had, 00:20:28.000 |
but the goal is for us to get better at this. 00:20:30.360 |
- I mean, the remote execution thing is interesting. 00:20:35.600 |
- And that's almost like, then we were kind of like, 00:20:40.400 |
But now it's like, okay, no, actually, I don't really care. 00:20:55.800 |
that it's possible that everything could run remotely. 00:20:59.320 |
- That's how it is at most, like, big companies, 00:21:01.600 |
like, Facebook, like, nobody runs things locally. 00:21:08.440 |
Maybe the one thing that I do think is kind of important 00:21:11.240 |
for these systems that is more than just running remotely 00:21:15.960 |
there's kind of, like, a rollout of a trajectory. 00:21:18.120 |
And I kind of want to roll this trajectory back, right? 00:21:20.280 |
In some ways, I want, like, a snapshot of the system 00:21:25.760 |
I might want to do multiple rollouts of this. 00:21:28.160 |
So basically, I think there needs to be a way 00:21:29.680 |
to almost, like, move the system forward and backward. 00:21:35.960 |
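Editor's sketch: the snapshot-and-rollback idea can be illustrated with a toy in-memory workspace. Everything here is an assumption made for illustration: `Workspace`, `snapshot`, and `rollback` are hypothetical names, and a real system would also have to capture processes, installed packages, and other machine state rather than a dictionary of file contents.

```python
import copy


class Workspace:
    """Toy stand-in for the state an agent modifies: a mapping of path to content."""

    def __init__(self) -> None:
        self.files: dict[str, str] = {}
        self._snapshots: list[dict[str, str]] = []

    def write(self, path: str, content: str) -> None:
        self.files[path] = content

    def snapshot(self) -> int:
        """Save a copy of the current state and return an id for it."""
        self._snapshots.append(copy.deepcopy(self.files))
        return len(self._snapshots) - 1

    def rollback(self, snapshot_id: int) -> None:
        """Restore the state saved under snapshot_id."""
        self.files = copy.deepcopy(self._snapshots[snapshot_id])


ws = Workspace()
ws.write("main.py", "print('v1')")
checkpoint = ws.snapshot()

# First rollout: the agent makes a change we decide not to keep.
ws.write("main.py", "print('broken')")
ws.rollback(checkpoint)

# Second rollout, starting again from the same checkpoint.
ws.write("main.py", "print('v2')")
print(ws.files)
```

Because every rollout starts from a saved checkpoint, several rollouts can be tried from the same starting point and the unwanted ones discarded.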
But every time, if you move the system forward, 00:21:39.640 |
It's probably going to be a hard system to kind of, 00:21:46.520 |
I think you still need to solve the problem of 00:21:48.440 |
this thing is not going to destroy your machine 00:21:53.360 |
There is a category of emerging infrastructure providers 00:21:58.880 |
- And if Varun's first episode on this podcast 00:22:01.120 |
was an indication, we like infrastructure problems. 00:22:10.160 |
actual model inference, optimization, all these things. 00:22:16.720 |
people are, like, forgetting about the model. 00:22:27.600 |
I think I would be lying if I said it hasn't. 00:22:30.400 |
The things, like, autocomplete and supercomplete 00:22:32.120 |
that run on every keystroke are entirely, like, 00:22:35.080 |
And by the way, that is still because properties, 00:22:44.000 |
Non-existent, they're not good, actually, at it. 00:22:49.360 |
- It's how you order the tokens, actually, in some ways. 00:22:53.120 |
if you look at what these products have sort of become, 00:23:03.320 |
So multi-turn, kind of, back-and-forth systems. 00:23:14.800 |
are doing multi-point, kind of, like, conversations, 00:23:19.600 |
is not, like, even a perfect science yet. 00:23:23.240 |
The second piece where we've actually, sort of, 00:23:25.440 |
trained our models is actually on the retrieval system. 00:23:28.880 |
but, like, actually being able to use high-powered LLMs 00:23:31.600 |
to be able to do much higher-quality retrieval 00:23:37.960 |
For a lot of the systems, we do believe embeddings work, 00:23:46.560 |
Like, imagine I have a question on a code base of, 00:23:49.520 |
find me all quadratic-time algorithms in this code base. 00:23:59.440 |
extremely poor precision and recall at this task. 00:24:01.640 |
So we need to apply something a little more high-powered 00:24:04.760 |
So we've actually built, like, large distributed systems 00:24:09.400 |
run custom models at scale across large code bases. 00:24:16.120 |
I think the Claudes and the OpenAIs have the best products. 00:24:22.200 |
It's very clear that they're willing to invest 00:24:28.960 |
I would be very happy if they got really good, 00:24:42.920 |
I was part of the preview, thanks for letting me in, 00:24:44.880 |
and I've been maining Windsurf for a long time. 00:24:52.060 |
so that, like, I feel like it has more differentiation. 00:24:55.520 |
Like, I only have exclusive access to your models 00:24:58.040 |
via your IDE than having the dropdown that says 00:25:05.120 |
is the high-level planning that is going on in the model 00:25:07.480 |
is actually getting done with products like Claude, 00:25:12.160 |
as well as the ability to, like, take the high-level plan 00:25:16.040 |
is proprietary systems that are running internally. 00:25:21.940 |
are you familiar with the concept of late interaction? 00:25:25.200 |
- Yeah, so this is ColBERT, or the guy, Omar Khattab, 00:25:28.680 |
from, I think, Stanford, has been promoting this a lot. 00:25:34.280 |
- Sort of embedding on retrieval rather than pre-embedding. 00:25:49.880 |
- Well, I mean, there might be something to learn 00:25:52.680 |
from contrasting the ideas and seeing where-- 00:25:59.840 |
Because vision models tend to just consume the whole image, 00:26:14.880 |
over a whole set of raw data rather than a materialized view 00:26:19.440 |
I think it's just like, what does that look like for LLMs? 00:26:21.620 |
- When I hear you say build large distributed systems, 00:26:30.080 |
Is it the same infra that serves everything? 00:26:34.680 |
And the only reason why for us the answer is yes. 00:26:37.440 |
And to be honest, our company is a lot more complex 00:26:39.800 |
than I think if we just wanted to serve the individual. 00:26:42.440 |
And I'll tell you that because we don't really pay 00:26:45.000 |
other providers to do things for our indexing. 00:26:47.600 |
We don't pay other providers to do our serving 00:26:52.960 |
And I think that's a core competency within our company 00:26:56.080 |
But that's also enabled us to go and make sure 00:27:00.160 |
in an environment that works for these large enterprises, 00:27:03.400 |
we need to build this custom system for you guys. 00:27:05.440 |
This is the same system that serves our entire user base. 00:27:08.480 |
So that is a very unique decision we've taken as a company. 00:27:11.680 |
And we admit that there are probably faster ways 00:27:15.400 |
- I was thinking, you know, when I was working with you 00:27:16.840 |
for your enterprise piece, I was thinking like 00:27:20.180 |
like build deliberately for the right level of abstraction 00:27:23.440 |
that can serve the market that you really are going after. 00:27:26.200 |
- Yeah, I mean, I would say like I was writing, 00:27:28.120 |
when writing that piece, you're like looking back 00:27:29.920 |
and reading it back, it sounds so like almost obvious 00:27:32.280 |
and not all of those are really conscious decisions we made, 00:27:48.040 |
well, we have hundreds of thousands of developers 00:27:51.680 |
Like, I think we'll be able to support you, right? 00:27:56.920 |
let's give it to individuals, let's see what people like 00:28:02.800 |
- And to recap, when you first came on the pod, 00:28:12.080 |
is building things on top of code completion. 00:28:17.360 |
on like short-term kind of like growth monetization 00:28:19.680 |
of like the individual developer and like build some of this 00:28:34.640 |
and unclear if the commercial instinct is right. 00:28:36.840 |
I think that right now, optimizing for making money 00:28:44.920 |
Largely because I think individual developers 00:28:46.600 |
can switch off of products like very quickly. 00:28:51.400 |
trying to optimize for making a lot of profit 00:29:00.360 |
And I'm going to say this very honestly, right? 00:29:10.120 |
as the products get better and better and deeper and deeper. 00:29:13.480 |
like there's a book in business called, like, 7 Powers. 00:29:16.240 |
And I think one of the powers that a business like ours 00:29:20.080 |
But like you first need something in the product 00:29:24.200 |
before you think about how do you make people switch off. 00:29:28.400 |
we believe that there's probably much more differentiation 00:29:51.040 |
where they're already spending billions of dollars 00:29:54.480 |
So you can actually solve maybe deeper problems 00:29:56.520 |
for them and you can actually kind of provide 00:30:03.160 |
could be churny as long as we don't have the best product. 00:30:09.600 |
And I don't think we will, for the foreseeable future, 00:30:12.600 |
try to be a company that tries to make a lot of money 00:30:44.120 |
We give a lot of things actually out for free. 00:30:50.480 |
and all these things, there is real COGS here. 00:31:04.080 |
So for Windsurf, it just ended up being the same thing. 00:31:07.000 |
And everyone who downloads Windsurf in the first, 00:31:13.200 |
let us know what they like, what they don't like. 00:31:16.360 |
- I've talked to a lot of CTOs in the Fortune 100 00:31:23.080 |
The problem is not that the developer costs 200K 00:31:28.120 |
It's like that developer should not be paid 200K. 00:31:33.280 |
But then you have developers getting paid 200K, 00:31:36.960 |
So it's almost like you're averaging out the price 00:31:40.160 |
because most people are actually not that productive anyway. 00:31:47.440 |
is it that the junior developer salary is like 50K? 00:31:56.840 |
- Yeah, maybe Alessio, one thing that I think about a lot, 00:31:58.960 |
because I do think about this, the per seat, anything, 00:32:01.520 |
all of this stuff, I think about a good deal. 00:32:11.680 |
from Office 365 is probably tens of thousands of dollars. 00:32:15.400 |
By the way, everyone, Google Docs, great product. 00:32:20.480 |
It made it so that the moment you review anything 00:32:22.560 |
in Microsoft Word, the only way you can review it 00:32:26.080 |
It's like this virus that penetrates everything. 00:32:28.160 |
And it not only penetrates it within the company, 00:32:31.800 |
The amount of value it's driving is way higher for him. 00:32:35.800 |
there's always going to be, for these kinds of products, 00:32:37.880 |
this variance between who gets value from these products. 00:32:41.120 |
And you're right, it's almost like a blended. 00:32:44.160 |
Probably this company should be paying that one developer 00:32:47.720 |
But in a weird way, software is this team activity enough 00:32:52.440 |
But hey, 20% of the four times, and there are four people, 00:33:02.960 |
this is about the future of the software engineer. 00:33:12.720 |
- I mean, business model does impact the product, 00:33:18.360 |
We are as concerned about the business of tech 00:33:23.360 |
- Speaking of which, there's other listener questions. 00:33:32.240 |
especially from like the Microsoft research point of view. 00:33:49.040 |
and you can kind of pick the most interesting one. 00:33:56.520 |
actually, is partially because we can't go out 00:33:58.520 |
and execute some random stuff in parallel in the meantime. 00:34:04.320 |
So there are some things that are a little bit 00:34:09.920 |
And then the other thing is, in the short term, 00:34:11.920 |
I think there is like also a latency component. 00:34:13.800 |
And I think all of these things can kind of be solved. 00:34:15.840 |
I actually believe all of these things are solvable from. 00:34:18.800 |
And if you want to run all of them in parallel, 00:34:20.640 |
you probably don't want N machines to go out and do it. 00:34:23.800 |
especially if most of them are I/O bound kind of operations 00:34:26.560 |
where all you're doing is reading a little bit of data 00:34:39.600 |
So for a certain class of concurrency, 00:34:42.080 |
you can actually just run it all in one machine. 00:34:44.480 |
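Editor's sketch: the claim that I/O-bound work can share one machine can be illustrated with Python's `asyncio`. This is only an illustration under stated assumptions: the conversation does not say which concurrency mechanism is used, `read_small_file` and `run_all` are hypothetical names, and the sleep stands in for waiting on disk or network.

```python
import asyncio


async def read_small_file(name: str) -> str:
    """Stand-in for an I/O-bound step; the sleep simulates waiting on disk or network."""
    await asyncio.sleep(0.01)
    return f"{name}: done"


async def run_all(names: list[str]) -> list[str]:
    """Run every task concurrently on one event loop, preserving input order."""
    return await asyncio.gather(*(read_small_file(n) for n in names))


outputs = asyncio.run(run_all(["agent-1", "agent-2", "agent-3"]))
print(outputs)
```

While one task waits on I/O the event loop advances the others, so the total wall-clock time stays close to that of a single task rather than growing with the number of tasks.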
Because if you look at the changes that are made, right? 00:34:48.160 |
it's writing out like what, a couple of thousand bytes? 00:34:50.320 |
Maybe like tens of thousands of bytes on every, 00:34:58.840 |
We did an internal poll and we were just like, 00:35:04.360 |
Or like what we're going to come out with in a month? 00:35:08.040 |
I think like, you know, there's some like obvious ones. 00:35:13.920 |
but I think you'd look at all the same axes of the system, 00:35:18.000 |
Like how can we improve the knowledge retrieval? 00:35:22.920 |
we even showed some of like the early explorations 00:35:24.840 |
we have about looking into other data sources. 00:35:28.280 |
to the individual developer building a zero to one app, 00:35:36.880 |
I think there's a whole lot more that we can do. 00:35:41.280 |
the terminal command, but actually executing them. 00:35:49.600 |
like how can we make that even more detailed? 00:35:55.840 |
like the idea of looking at future trajectories, 00:35:59.640 |
and like suggesting potential next actions to be taken. 00:36:12.160 |
We kind of joke that's like Clippy's coming back, 00:36:14.000 |
but like maybe now's the time for Clippy to really shine, 00:36:17.240 |
So I think there's a lot of ways that we can take this, 00:36:19.120 |
which I think is like the very exciting part. 00:36:21.000 |
We're calling each of our launches waves, I believe, 00:36:23.000 |
because we want to really double down on the aquatic themes. 00:36:25.800 |
- Oh yeah, does someone actually windsurf for the company? 00:36:30.440 |
- We're living out our dream of being cool enough 00:36:37.960 |
'cause I don't think any of us are windsurfers. 00:36:40.560 |
we have someone, like, using Windsurf on a windsurf. 00:36:44.720 |
- You saw that in the beginning of the video. 00:36:48.720 |
where there's like not enough wind to windsurf. 00:36:50.960 |
So we were trying to figure out how to do this like, 00:36:52.760 |
you know, launch video with Windsurf on the windsurf. 00:36:57.240 |
And there was like one crazy guy who was like, 00:37:14.160 |
that I think could be more polished about the product 00:37:25.200 |
Like, hey, like, if you have this environment, 00:37:44.880 |
It's like, yeah, like the virtual environment, 00:37:54.340 |
So we would love to hear like all the feedback 00:37:59.280 |
What kind of environments could it work way more in? 00:38:02.760 |
We, luckily, we're daily users of the product internally. 00:38:17.060 |
and probably a lot of improvements down the line. 00:38:31.120 |
- Your customers, everyone says it's 80, 90% are on Windows. 00:38:36.760 |
You will never not see something that missed. 00:38:41.840 |
part of the reason why we were hesitant to do that 00:38:50.880 |
of running the system on the user's local machine 00:39:02.600 |
if I can also make changes to the UI and stuff like that. 00:39:07.400 |
That's actually something that we need to add to the product. 00:39:09.400 |
That's how early it is that we have not actually added that. 00:39:16.360 |
You still have your core business of the enterprise Codeium. 00:39:31.480 |
are going to switch to Windsurf and this is the only, 00:39:34.600 |
- You just talked about your Java guys loving JetBrains. 00:39:40.560 |
There's still tons and tons of enterprise people on Eclipse. 00:39:48.640 |
But like, that's because those are our enterprise customers. 00:39:50.640 |
And the way that we always think about it is like, 00:40:13.220 |
it's like, if we can solve all the enterprise problems 00:40:18.100 |
that developers themselves just truly, truly love, 00:40:21.400 |
then we're solving the problem from both sides. 00:40:25.100 |
I think when we started working with the enterprise 00:40:26.900 |
and we started building like dev tools, right? 00:40:32.940 |
You really quickly understand and realize just 00:40:39.740 |
There's a lot of enterprise software that developers hate. 00:40:45.420 |
where they're doing their most important work. 00:40:48.340 |
And it's not like, we're like trying to convince, 00:40:52.300 |
also ask their developers a lot, do you love this? 00:40:54.340 |
Like, that is like, almost always a key aspect 00:41:00.060 |
I don't think we go from zero to 10 million ARR 00:41:04.420 |
if we don't have a product that developers love. 00:41:07.940 |
the IDE is more of a developer love kind of play. 00:41:10.500 |
It will eventually make it to the enterprise. 00:41:13.500 |
And again, we could be completely wrong about this, 00:41:16.020 |
but we hope we're solving the right problems. 00:41:18.940 |
before we started rolling, but like, it's the same team. 00:41:22.820 |
Like I, in any normal company or like, you know, 00:41:26.340 |
my normal mental model of company construction, 00:41:28.420 |
if you were to have like effectively two products like this, 00:41:32.620 |
serving two different needs, but it's the same team. 00:41:36.220 |
that's maybe unique about our company is like, 00:41:38.300 |
this has not been one company the whole time, right? 00:41:41.140 |
Like we were first, like this GPU virtualization company 00:41:45.020 |
And then after that, we're making some changes. 00:41:46.860 |
And like, I think there's like a versatility of the company 00:41:49.980 |
and like this ability to move where we think the instinct, 00:41:56.020 |
but if we smell something, we're going to move fast. 00:41:59.700 |
to I think the engineering team rather than any one of us. 00:42:11.300 |
Estimate inference to figure out latency quality. 00:42:14.060 |
Build first party instead of using third party as API. 00:42:17.020 |
Figure out real time because ChatGPT and DALL-E, 00:42:23.220 |
Optimize prompt because context window's limited, 00:42:38.540 |
Even like the context, like the one that you called out, 00:42:45.900 |
They have like tens of millions of lines of code. 00:42:53.460 |
to piece together this like distributed knowledge 00:43:11.220 |
that you can't just prompt engineer your way out of 00:43:13.500 |
or just maybe even like fine tune afterwards. 00:43:21.340 |
like Cascade and Windsurf would not have been possible 00:43:31.020 |
I'll say I passed, but yes, two hours, two years later. 00:43:48.900 |
Swyx and I were talking like maybe we can write this 00:43:54.100 |
- I specifically like the Copilot for X thing 00:44:04.900 |
I don't even think we were like necessarily thinking 00:44:06.620 |
of an enterprise product at that point, right? 00:44:08.420 |
So like all of the learnings that like, you know, 00:44:16.420 |
Some of those, I think we kind of like figured, 00:44:18.780 |
some of those we just honestly walked backwards into. 00:44:29.580 |
for a variety of reasons that we had to like learn from. 00:44:33.100 |
that there's no way I would have gotten that right in 2022. 00:44:39.900 |
but it's like true about our engineering team as a whole. 00:44:42.100 |
I don't think most of us got much value from ChatGPT. 00:44:47.020 |
and this is maybe a little bit of a different thing. 00:44:49.020 |
It's like a lot of the engineers at the company 00:44:51.020 |
who have been writing software for like over eight years. 00:45:00.980 |
Invested a lot in searching code base, right? 00:45:07.620 |
And they've spent like eight years mastering that skill. 00:45:12.300 |
that you need to provide a lot of context to, 00:45:16.060 |
like my co-founder just basically never used ChatGPT at all. 00:45:21.940 |
one of our incorrect sort of assumptions was probably that, 00:45:25.100 |
hey, like a lot of these passive systems need to get good 00:45:27.860 |
and these active systems are going to be behind. 00:45:32.140 |
Is there a company where everyone is now using it? 00:45:52.620 |
They will not form a cult of this is awesome. 00:45:56.500 |
They were not going to be the kind of people on Twitter 00:45:58.100 |
that are like, yeah, this changes everything. 00:46:02.380 |
No, there are people that are going to be incredibly honest. 00:46:04.700 |
And we know if we hit the bar that is good for them, 00:46:10.220 |
we probably had a lot of sentiment like that. 00:46:13.660 |
And I think it's actually important that you have 00:46:27.140 |
And there's no signal to kind of bring you down to reality. 00:46:31.900 |
And there are a lot of ideas we're going to come up with 00:46:36.340 |
Otherwise, like how does anything good come out? 00:46:41.740 |
They're the type of people that when they see 00:46:49.820 |
By the way, we will never launch with a waitlist. 00:46:59.420 |
- My joke is generative AI has gotten really good 00:47:06.420 |
both of us used to work in autonomous vehicles, 00:47:27.060 |
But you now are sitting on so much proprietary data 00:47:30.860 |
that it may be worth training on the trajectories 00:47:35.700 |
So maybe it's a pendulum back to first party. 00:47:42.220 |
Like, I think there's like, both like, you know, 00:47:46.100 |
- I mean, I kind of want, like, let me opt out. 00:47:47.420 |
- I think that there are signals that we do get 00:47:51.020 |
Like, there's a lot of preference information 00:48:03.700 |
but also getting the preference data from our users 00:48:06.220 |
of like, hey, given this set of trajectories, 00:48:10.180 |
And in fact, one of the really beautiful parts 00:48:15.220 |
we can not only see if the acceptance happened, 00:48:17.380 |
but if something more than the acceptance happened, 00:48:31.940 |
because we're in the ultimate work output of the developer. 00:48:41.660 |
then like, yeah, you get a lot, a lot of information there. 00:48:49.140 |
- The Windsurf just gives you more of the IDE. 00:48:51.460 |
So that means you can also start getting more information. 00:48:54.100 |
Like, for instance, the basic thing that Anshul said, 00:48:56.300 |
we can see if like, a file explorer was opened. 00:49:07.860 |
- Oh man, isn't that funny that we now created 00:49:12.180 |
I think that one, that one is pretty accurate. 00:49:16.260 |
I think like, we were doing that within the extensions. 00:49:20.700 |
Like, we got very, very creative with things. 00:49:22.940 |
Like, Varun mentioned the idea of just like, you know, 00:49:24.700 |
essentially rendering images to display things. 00:49:36.020 |
does make that experience as good as it possibly can there. 00:49:40.540 |
that we're able to build in like, in Windsurf. 00:49:46.380 |
'cause now we can do command in the terminal. 00:49:48.260 |
Like, you can not have to search for a bash command. 00:49:53.180 |
And like, it's like, it's not, it's not like, Cascade. 00:49:55.180 |
It's not like, an agentic system right in the lab. 00:49:56.460 |
But I'm like, that is just a very, very cool UX. 00:50:01.700 |
I've implemented a 60-line bash command called please. 00:50:11.100 |
because one of the things I think we believe in 00:50:13.420 |
is actually, I like products like autocomplete 00:50:23.140 |
to go in a different place, I actually like that too. 00:50:25.140 |
- Yeah, and I actually adopted Warp, the terminal Warp, 00:50:28.300 |
initially for that, 'cause they gave that away for free. 00:50:30.740 |
But now it's everywhere, so I can turn off Warp 00:50:48.660 |
But they basically had this thing where you can do 00:50:50.740 |
kind of like pound, and then write in natural language. 00:50:59.820 |
When you do the pound, it's only like it gives you 00:51:03.300 |
When you like talk to it, it generates a flow. 00:51:06.220 |
It's a bit confusing of a UX, but going back to your post, 00:51:23.580 |
oh, you have AI, that's cool, other people don't have it. 00:51:31.100 |
like the model doesn't even need to be that powerful, 00:51:33.420 |
like just having better experience is enough? 00:51:35.940 |
Or like do you think like really being able to do the whole, 00:51:41.620 |
when you generate a lot of value for the customer. 00:52:02.780 |
like from a UX perspective that make Cascade really good. 00:52:12.820 |
we're like allowing you to jump and open diffs and see it. 00:52:22.300 |
that together come to a really powerful and intuitive UX. 00:52:32.420 |
I think we're starting to see the glimpses of it. 00:52:37.580 |
First of all, it's just been really nice to work with you. 00:53:02.340 |
I don't know if this is just the nature of our company. 00:53:06.300 |
like there's all like the San Francisco AI companies 00:53:09.660 |
like on the tech and everything, which is like great. 00:53:12.100 |
We're here in Mountain View, beautiful office. 00:53:14.220 |
We just really care about like actually driving value 00:53:16.340 |
and making money, which is kind of like a core, 00:53:20.660 |
- I think maybe the selfish way of saying that, 00:53:23.260 |
or like a little more of the selfless way is like, 00:53:25.260 |
yeah, we can be kind of like this VC funded company forever, 00:53:30.700 |
if we actually want to transform the way software happens, 00:53:33.340 |
we need this part of the business that's cash generative 00:53:36.060 |
that enables us to actually invest tremendously 00:53:43.380 |
And we want to set ourselves up to be a company 00:53:46.180 |
that is durable and can actually solve these problems. 00:53:53.380 |
but for people who are listening to this for the first time, 00:54:08.580 |
I was, I think it was like writing part of that story, 00:54:17.060 |
So it's either building AI for the enterprise. 00:54:19.980 |
the most dangerous thing an AI startup can do 00:54:22.820 |
which I think you all, both of you will co-sign. 00:54:26.580 |
which I really liked was like, go slow to go fast. 00:54:29.100 |
Like here's the, if you actually build for like security, 00:54:33.020 |
compliance, personalization, usage analytics, 00:54:40.020 |
but eventually it's going to pay off in the long run. 00:54:46.300 |
Like if you build the easy thing first as an MVP, 00:55:10.340 |
we're going through FedRAMP accreditation right now 00:55:16.860 |
oh yeah, we already have a containerized system. 00:55:33.220 |
I think like it's just anyone who's like worked 00:55:34.780 |
like, you know, for like an extended period of time 00:55:44.660 |
you have to now like re-architect your whole system 00:55:53.380 |
they're emotionally invested in whatever it might be. 00:56:05.300 |
where we're going to see parts of our systems 00:56:06.620 |
where like, oh, we really need to re-architect that. 00:56:08.780 |
Actually, we've definitely hit that already, right? 00:56:10.660 |
And I think that's just like at like the project level, 00:56:12.940 |
the product level, or is that like your whole company, right? 00:56:17.700 |
to some degree, your company needs to have this DNA 00:56:21.180 |
And I think then you'll be able to go through those bumps 00:56:23.380 |
a lot more smoothly and be able to drive the value. 00:56:33.580 |
It's like, you know, there's this constant thing 00:56:38.060 |
a lot of the time the answer should be buy, right? 00:56:39.820 |
Like we're not going to go build our own sales tool. 00:56:45.340 |
And the reason why you go with buy instead of build is, 00:56:48.060 |
hey, like, look, the ROI of what exists out there is good. 00:57:02.660 |
you're losing a core competency inside the company. 00:57:04.860 |
And that's a core competency you can never get. 00:57:09.700 |
Let me just say, like, let's say as a company, 00:57:11.900 |
we did not invest in, I don't know, model inference. 00:57:15.060 |
Yeah, we have like a custom inference runtime. 00:57:20.260 |
It's going to be very hard to get it back, right? 00:57:32.700 |
But the point is, this is more a question of like, 00:57:37.180 |
Google's a great company, makes a lot of money. 00:57:38.700 |
What happens if they actually made the search index 00:57:40.500 |
of the product something that someone else bought for them? 00:57:43.220 |
Maybe someone else could have done a good job. 00:57:46.580 |
But like, particularly because Google is a search index, 00:57:49.180 |
but like, tough luck getting that core competency back. 00:57:53.020 |
And I think for us, it's more a question of like, 00:57:54.900 |
what core competencies do we need inside the business? 00:58:00.260 |
some of these core competencies are annoying. 00:58:01.540 |
Sometimes we'll be behind, behind what exists out there. 00:58:04.540 |
Right, and we need, just need to be very honest. 00:58:05.860 |
That's where the truth-seekingness of the company matters. 00:58:08.260 |
Like, are we really honest about this core competency? 00:58:20.060 |
make our company a better company in the long-term, 00:58:26.020 |
The race is won over the next, like, five, 10 years, right? 00:58:33.700 |
I think one of the unique parts of the company now is, 00:58:36.260 |
we have both this individual and enterprise side, 00:58:38.260 |
and usually companies stick to one or the other. 00:58:40.460 |
And I think that needs to be part of the DNA. 00:58:42.740 |
I think kind of early on in the company, as Anshul said, 00:58:45.180 |
I mean, there's stories of companies like Dropbox 00:58:48.660 |
And Dropbox is an amazing company, fantastic company, 00:58:51.220 |
that one of the fastest-growing consumer companies 00:58:53.460 |
of all time, consumer more on the software company 00:58:58.340 |
sort of product-oriented on the consumer side, 00:59:00.300 |
the enterprise is just, it's checking off a lot of boxes 00:59:03.660 |
that ultimately do not help the consumer at all, 00:59:08.220 |
And effectively, if the original group of people 00:59:10.260 |
didn't care, it's incredibly hard to get them 00:59:16.020 |
And you need to feel like, hey, this is like, 00:59:17.780 |
this is an important part for the company's viability. 00:59:20.700 |
So I think that there's a little bit of like, 00:59:22.820 |
the build versus buy part, and then also like, 00:59:27.620 |
And yeah, it's something we think about all the time. 00:59:32.940 |
I don't feel like, like, I think I know your work histories. 00:59:46.140 |
- Yeah, in fact, I think the only other sort of, 00:59:50.020 |
I guess like, when I look at my previous internships, 00:59:57.340 |
And to be honest, like, I was not that interested 01:00:01.540 |
That's not what drives me when I wake up at night. 01:00:05.900 |
in an autonomous vehicle company immediately after. 01:00:10.620 |
maybe a little bit of the unique aspect of the company, 01:00:20.740 |
There's a lot of things about being very honest 01:00:22.340 |
about what we're good at and what we're not good at. 01:00:24.220 |
Like, I think, surprisingly, enterprise sales 01:00:34.380 |
like Anshul and I helping partner with companies. 01:00:37.100 |
But very soon, we hired actually a VP of sales, 01:00:48.340 |
And I think one of the people that I think about a lot, 01:00:54.420 |
And he has figured out how to constantly change 01:01:11.420 |
about military contracts. - Yeah, now it's probably 01:01:15.020 |
Like, he's just gonna keep increasing the stakes. 01:01:17.420 |
And like, there's no playbook on how this really works. 01:01:22.340 |
solve a hard problem and work backwards from that, right? 01:01:27.260 |
Like, I don't think like, you think everything 01:01:29.220 |
from first principles to the best of our abilities, 01:01:31.540 |
but there's just so many variable unknowns that, 01:01:34.380 |
yeah, like, we don't know everything that's happening 01:01:37.940 |
and everyone knows how fast the AI space is moving. 01:01:49.020 |
we talk to pretty early-stage founders mostly. 01:01:50.940 |
They don't usually have a pretty built-out sales function. 01:01:54.060 |
Advice, what kind of sales works in this kind of field? 01:02:00.020 |
You know, anything that you can share with other founders? 01:02:04.860 |
and I really, like, Anshul can also attest, 01:02:07.420 |
like, we have an amazing VP of sales at the company. 01:02:13.180 |
their job is to like, talk like, really well, 01:02:16.900 |
I mean, very obvious if you hear like, me talk, 01:02:25.900 |
So actually, just checking based on the way they speak 01:02:30.260 |
I think like, you know, what matters in a space like ours 01:02:34.900 |
I think is like, intellectual curiosity is very important. 01:02:49.100 |
you're kind of making this factory twice, thrice, 01:02:57.780 |
And you actually, the process of building a factory 01:03:06.540 |
How do you actually make sure you have hundreds of people 01:03:10.060 |
Actually, Anshul works very closely also with sales 01:03:14.540 |
Make sure that they understand the technology. 01:03:16.020 |
Our technology is also changing very quickly. 01:03:18.020 |
Let's maybe take an example on how our company 01:03:20.060 |
is very different than a company like MongoDB. 01:03:30.780 |
solve the application problem I have at hand. 01:03:32.740 |
People are curious about how our technology works. 01:03:38.180 |
And imagine we had a sales team that is scaling 01:03:42.020 |
We're not gonna be great partners to our customers. 01:03:44.100 |
So how do you create almost this growing factory 01:03:47.180 |
that is able to actually distribute the software 01:03:52.580 |
taking on all the new parts of our product, right? 01:04:09.260 |
find out what good looks like potentially in your category 01:04:22.220 |
And then there's also the sales feeding into products 01:04:25.380 |
in a way that we're talking about here, right? 01:04:27.420 |
Where they basically tell you what they need. 01:04:47.020 |
neither of us had ever done a sale for Codeium in our lives 01:04:49.940 |
and we went and tried to find a sales leader, 01:04:51.500 |
we probably would have not hired the right person. 01:04:56.660 |
- We had done like hundreds and hundreds of deal cycles 01:05:07.060 |
And then I think we found like the right person, right? 01:05:22.060 |
or like engineering can't be involved, right? 01:05:25.300 |
like we hire plenty of deployed engineers, right? 01:05:28.660 |
I think like Palantir kind of made this really famous. 01:05:31.700 |
- Like deployed engineers like work very, very closely 01:05:33.980 |
with the sales team on very technical aspects 01:05:39.260 |
- As in they work at Codeium as deployed engineers? 01:05:42.380 |
- And then they partner with our account executives 01:05:50.260 |
And like that's information that we keep on collating. 01:05:52.460 |
And it's like, we will both jump into any deal cycle 01:05:56.300 |
because that's how we're going to just keep on building 01:06:00.020 |
It comes back to the same, like just care, I don't know.