back to indexMastering Engineering Flow with Windsurf - Eashan Sinha, Windsurf

Chapters
0:0 Intro
0:27 Agenda
0:38 About Windsurf
1:30 Iconic Duos
2:22 Agents
3:11 How do we fix this
7:19 Flow Awareness
11:38 Engineer Maxing
00:00:00.000 |
Hey everyone, I'm Ishan, I'm an engineer here at WindSurf, and today we'll be talking a little 00:00:21.520 |
bit about how to make the most out of your development experience, how to make the most 00:00:25.360 |
out of this agenda flow that we all now have exposure to with our IDEs. 00:00:32.560 |
And also talk a little bit about how these flows make us better developers. 00:00:38.560 |
For those who are not aware, WindSurf is in the dev tool space. 00:00:42.540 |
Our flagship product is the WindSurf editor, which is, oh, as you can see, okay, sorry, 00:00:52.600 |
I can't even, will it not share if I, okay, all right. 00:00:59.140 |
Our flagship product is the WindSurf editor, which is powered by our agent, Cascade, which 00:01:05.840 |
was actually the first agentic IDE in the space. 00:01:08.420 |
After we launched a lot of tools, a lot of similar products followed suit in that they realized 00:01:13.300 |
that this agent experience was the way to go with this IDE. 00:01:19.840 |
So, for those who aren't aware or haven't tried it out, highly suggest you try it out. 00:01:25.920 |
But before we actually get into how WindSurf does agents, how we do flows, let's take a step 00:01:31.680 |
Let's talk about some of the most iconic duos to date and those that have really won, 00:01:41.880 |
What are some of the characteristics of these duos and what has really led them to be so successful? 00:01:48.040 |
It wasn't that each individual or each one of these were the best at their position or they 00:01:53.380 |
were the best by far or considered the GOAT, although some may argue otherwise for some of 00:01:59.920 |
What made them so good was that together as a team, they worked better than everybody 00:02:09.140 |
One of the most iconic duos, pineapple and pizza, regardless of what you all say. 00:02:14.920 |
This is what makes duos successful is that they work together and they know each other. 00:02:24.600 |
What this reminds me of is how we think of our agents or how we think of our coding assistants. 00:02:36.500 |
Why is it producing all this error-prone code? 00:02:40.520 |
Why is it removing code that we don't want it to remove?" 00:02:47.420 |
We tell it that we're going to fine it one million dollars if it doesn't do what we ask 00:02:51.440 |
We also tell it that it's going to go to jail if it doesn't do what we want it to do. 00:02:58.080 |
We are supposed to treat our agents like we would treat our running mate, our teammate, 00:03:14.280 |
It's ours as the builders of these tools to provide that experience for you. 00:03:18.160 |
To make sure that you as developers, you're actually able to work with these agents. 00:03:29.860 |
Before I dive in depth into that, let's take a step back and look at how we've gotten to 00:03:36.180 |
this point of agents, the evolution of AI, how AI has kind of evolved into what we see today 00:03:44.900 |
Back in 2022, the ancient times, we had to do everything ourselves, right? 00:03:58.480 |
So, in 2022 or late 2022 and moving on into '23, co-pilots were introduced. 00:04:05.980 |
I consider ChatGPT barred at the time, GitHub co-pilot as this kind of chat interface, this 00:04:12.840 |
co-pilot kind of experience where you would give a prompt or an input, you'd get a response. 00:04:19.540 |
Really good with simple Q&A and autocomplete. 00:04:22.800 |
Late 2024, this is when we saw the first agents, or what people like to call agents. 00:04:31.620 |
A lot of times, people would confuse agents with workflows, but I think that was kind of 00:04:35.660 |
cleared up more recently in that agents now are able to take this autonomy, they take this 00:04:41.040 |
independence, they operate iteratively, they adjust their trajectories, and they can perform 00:04:47.160 |
these larger scope tasks and do things that simple or single response, single shot co-pilots 00:04:57.340 |
So another way to look at this is that co-pilots were a little bit more collaborative, right? 00:05:00.560 |
We had to interact at every single step of the process while we're working with co-pilots. 00:05:04.620 |
If we ever were coding with ChatGPT, we'd have to maybe paste in every file, we would have 00:05:09.500 |
to individually send in each response, and ChatGPT would have to work on things very step-by-step, 00:05:17.720 |
So we would just send an input or a prompt, maybe add some extra context, and after the 00:05:22.160 |
LLM ran its inference, we would get the response. 00:05:25.460 |
Agents then introduced this autonomous nature where we had these models, we also allowed it 00:05:33.640 |
But on top of that, we introduced tool calling and function calling, the ability for agents 00:05:37.400 |
to perform tasks, perform these functions, and actually be able to execute things that 00:05:43.500 |
co-pilots or chatbots wouldn't necessarily be able to do. 00:05:47.660 |
What Windsurf has done when we launched was we realized that agents really didn't solve the 00:05:53.120 |
problems that co-pilots couldn't solve, and co-pilots didn't do everything that we would 00:05:59.920 |
And so we took the best of both worlds, in that we combined these models, as well as 00:06:07.680 |
retrieving proper context, figuring out how to call tools properly, in addition with understanding 00:06:14.540 |
the user, tracking the user's actions, really understanding their intent as developers, what 00:06:22.340 |
So we essentially took the best of both worlds, right? 00:06:26.120 |
And so AI Flows is what Windsurf and Cascade introduced back in November. 00:06:30.840 |
And this is where a lot of other companies, a lot of other products that I'm sure you all 00:06:34.440 |
know today, followed suit in that they realized this simple chat interface, or even having an 00:06:38.960 |
agent and a chat just maybe didn't make sense. 00:06:41.800 |
It only made sense to provide one singular agentic interface that collaborated with you as a developer. 00:06:48.060 |
And so Windsurf, our editor, it took the best of both worlds, in that it combined this collaborative 00:06:53.980 |
power of a chatbot interface with some of these autonomous and tool calling capabilities of an 00:07:00.820 |
agent, and molded them together to work in perfect unison, in perfect sync. 00:07:06.280 |
And this created a very seamless, a very unified experience where developers and AI could actually 00:07:11.800 |
operate as one, could operate as a team, rather than like these guys. 00:07:21.120 |
The first thing we wanted to consider was this concept of flow awareness. 00:07:25.460 |
What this meant was that we would have really comprehensive reasoning and understanding of the 00:07:32.280 |
This is something that a lot of agents actually don't really consider right now, and even when 00:07:37.860 |
we're building this on the side, we don't actually think of what is really important to the user. 00:07:42.160 |
So our main emphasis when we were building Cascade was understanding the user, tracking their actions, 00:07:47.000 |
their edits, their commands, the terminal commands they've run, anything that's in their clipboard, 00:07:55.840 |
All these things are tracked by Cascade, so that Cascade develops an understanding of what 00:08:00.840 |
the user has been doing, and that is all inputted into this agent's trajectory. 00:08:05.820 |
So that the agent can then outline a much more relevant set of steps that aligns with what 00:08:11.620 |
the user would be doing, or what may be doing in the future. 00:08:15.300 |
And so when you use Cascade, when you use Windsurf Tab, it almost feels like Windsurf is predicting 00:08:23.880 |
In this case, Cascade is predicting or inferring what you would do next. 00:08:27.400 |
When you're using Tab, it feels like Cascade or Windsurf is reading your mind because so 00:08:33.280 |
much of this context of the user's actions as well as this understanding of the user over 00:08:39.580 |
time as the user interacts with Cascade, this is inputted into the context window. 00:08:44.380 |
And this is what makes this agent so powerful is that it understands the user much better than 00:08:53.480 |
On top of that, we have a state-of-the-art context engine. 00:08:57.900 |
But we approach context differently than other products in this space. 00:09:00.960 |
We don't just use a RAG or embedding-based search approach. 00:09:04.520 |
We leverage a combination of multiple tools in tandem to really figure out what works best 00:09:10.180 |
so that we can understand your code base, this explicit context, as strong and comprehensively 00:09:16.480 |
And what this helps us do as an agent is it helps us get more accurate results, more relevant 00:09:21.880 |
suggestions for your code base or for any of these code suggestions. 00:09:31.380 |
Because we understand the user, because we understand your code base very well, we are able to provide 00:09:36.000 |
a much more tailored experience, an experience that actually outlines what the user may want. 00:09:41.340 |
And it's grounded in this centralized source of truth, which is this code base that you're 00:09:48.280 |
On top of that, we equip Cascade with all of the best tools for this agent to really perform 00:09:53.840 |
that multi-step iterative kind of set of tasks where it's not just limited to that single 00:10:02.840 |
It can leverage workflows, something we recently introduced, where now we are bringing together, 00:10:07.800 |
bridging that gap between the unpredictability of an agent and more of that deterministic nature 00:10:14.840 |
Where workflows, we know exactly what's going to happen and when it's going to happen. 00:10:18.340 |
And so with Windsurf, again, our main priority is to help developers as much as possible. 00:10:23.560 |
We realize that with agents, it's very unpredictable what's going to happen and what they're going 00:10:28.220 |
And so with workflows now, this bridges that gap. 00:10:30.640 |
We can actually, as users, define a set of steps for the agent to follow as they're operating. 00:10:36.840 |
On top of that, we allow Cascade to look at rules, right? 00:10:40.580 |
These are rules that you can generate as users. 00:10:44.660 |
They can be rules that you always want the model to look at. 00:10:48.560 |
You maybe sometimes want the model to look at. 00:10:51.400 |
On top of that, Cascade can generate memories of you as a user, some of your preferences, memories 00:10:56.840 |
of your code base so that it doesn't have to constantly index, doesn't have to constantly 00:11:01.840 |
And on top of that, recently, there's a lot of talk about ATA, A2AA, and multiple agents. 00:11:07.080 |
Recently we introduced multiple simultaneous Cascades working in tandem. 00:11:11.040 |
And so what this brings to us is it allows Cascade now to have these multiple different trajectories 00:11:18.740 |
that understand each other, understand the user, and understand your code base, which allows 00:11:23.920 |
us to be able to complete tasks faster, more efficiently, and get things done in the way 00:11:31.800 |
So all these tools, right, there's a lot that I left out here, like the ability to search 00:11:38.340 |
They empower Cascade to be able to really give us this agentic feeling while it's operating, 00:11:44.540 |
And really, not just, again, an independent agent, but one that understands us as users. 00:11:54.660 |
Also known as engineer maxing, how do we really max out what we get out of this experience? 00:12:00.660 |
The way I like to approach it, I'm sure you all are somewhat familiar with this kind of 00:12:04.740 |
We like to first explore, discover our code base, scope out our tasks, then plan, then build, 00:12:10.540 |
code, and then test out our changes or anything that the agent did. 00:12:15.460 |
When discovering, we can leverage Cascade at every set of these steps, right? 00:12:20.260 |
We want to leverage Cascade to understand our code base, to scope out our tasks, determine 00:12:26.880 |
It determines the definition of done and outlines a set of steps. 00:12:30.180 |
How can we tell Cascade what that definition of done is and talk through some of these goals? 00:12:35.020 |
On top of that, if we wanted to give more direct context, we cannot mention different 00:12:41.160 |
It's also very important to plan with Cascade, right? 00:12:43.500 |
A lot of people just say, hey, I'm just going to put in this prompt and let Cascade or let 00:12:49.140 |
The best way to go about this is create, work with the agent to understand you. 00:12:53.380 |
And so what that means is create a planning file, outline a set of tasks with maybe check 00:12:58.800 |
boxes to tell Cascade, hey, you need to hit these set of tasks to accomplish this definition 00:13:05.020 |
Also outline rules to tell Cascade how to behave that align with your preferences. 00:13:12.380 |
Work with Cascade to actually execute this plan that you've set for yourself. 00:13:16.580 |
Edit files one by one or multiple files at a time and ensure Cascade is working with this 00:13:21.400 |
planning document and actually checking all the boxes. 00:13:25.860 |
And you'll see that Cascade actually asks you questions. 00:13:28.380 |
It will check in with you as a user, as a developer, right? 00:13:37.460 |
Or let's look at this set of steps next, right? 00:13:41.020 |
WinServTab then provides this more hands-on experience with autocomplete. 00:13:44.820 |
And on top of that, you can leverage these simultaneous Cascades and these different MCP servers that 00:13:50.020 |
allow you to pull additional context and empower Cascade with these additional tools to complete 00:13:57.820 |
Next, lastly, then you want to generate tests with Cascade, right? 00:14:01.620 |
And generate, run, fix your tests one by one as it's iterating. 00:14:06.180 |
You can leverage workflows here, which you can leverage to automate some of these tasks. 00:14:12.620 |
And at the end, determine if Cascade hit that DOD, that definition of done. 00:14:16.620 |
If it didn't, make sure Cascade understands what it did wrong, because Cascade is building this learning of you. 00:14:22.420 |
It's building this embedding representation of you as a developer, and it's learning from that. 00:14:27.220 |
So in the future, it knows where to not go wrong and where to improve. 00:14:31.420 |
And then lastly, then you want to commit to git, and make sure that you're pushing code that is functional. 00:14:38.820 |
So you and Cascade, right, don't think of you guys as separate entities. 00:14:43.620 |
We want to think of you all as a merge, as a team, right? 00:14:46.220 |
As a peer programmer, you guys are working together to accomplish tasks. 00:14:50.420 |
And so this is how you and Cascade should be. 00:14:52.820 |
This is how developers and Cascade should be. 00:14:54.620 |
It shouldn't be separate where you tell Cascade, "Hey, just accomplish this set of tasks. 00:14:58.420 |
I'll come back in 30 minutes, and we'll see what you did," right? 00:15:02.920 |
We're not at that level where LLMs can do that just yet. 00:15:08.920 |
And that is how you'll get the best development experience out of these agents.