Mastering Engineering Flow with Windsurf

Hey everyone, I'm Eashan, I'm an engineer here at Windsurf, and today we'll be talking a little bit about how to make the most out of your development experience, how to make the most out of this agentic flow that we all now have exposure to with our IDEs. And also talk a little bit about how these flows make us better developers.

For those who are not aware, Windsurf is in the dev tool space. Our flagship product is the Windsurf editor, powered by our agent, Cascade, and it was actually the first agentic IDE in the space.

After we launched, a lot of tools, a lot of similar products followed suit in that they realized that this agent experience was the way to go with the IDE. So, for those who aren't aware or haven't tried it out, I highly suggest you try it out. But before we actually get into how Windsurf does agents, how we do flows, let's take a step back.

Let's talk about some of the most iconic duos to date and those that have really won, that have been successful. What are some of the characteristics of these duos and what has really led them to be so successful? It wasn't that each individual or each one of these were the best at their position or they were the best by far or considered the GOAT, although some may argue otherwise for some of these people.

What made them so good was that together as a team, they worked better than everybody else, and that's what led to success. One of the most iconic duos: pineapple and pizza, regardless of what you all say. What makes duos successful is that they work together and they know each other.

They complement each other very well. Another one of the most iconic duos. What this reminds me of is how we think of our agents or how we think of our coding assistants. A lot of times we fight with them. We ask ourselves, "Why is it not getting us? Why is it producing all this error-prone code?

Why is it duplicating code? Why is it removing code that we don't want it to remove?" We also tend to abuse our agents sometimes. We tell it that we're going to fine it one million dollars if it doesn't do what we ask it to do. We also tell it that it's going to go to jail if it doesn't do what we want it to do.

But that's not how it's supposed to be. We are supposed to treat our agents like we would treat our running mate, our teammate, our friend. And this is what it's supposed to look like. Right? Not like what Tom and Jerry are. So, it's not your fault as developers. It's ours.

It's ours as the builders of these tools to provide that experience for you, to make sure that you as developers are actually able to work with these agents, and that these agents understand you. How do we fix this problem? How does Windsurf approach this? Before I dive in depth into that, let's take a step back and look at how we've gotten to this point of agents, the evolution of AI, how AI has evolved into what we see today with these coding agents.

Back in 2022, the ancient times, we had to do everything ourselves, right? It's kind of terrifying. We had to write all the code by ourselves. There was no AI to really help us, right? There was just Stack Overflow and Google. So, in 2022 or late 2022 and moving on into '23, co-pilots were introduced.

I consider ChatGPT, Bard at the time, and GitHub Copilot as this kind of chat interface, this co-pilot kind of experience where you would give a prompt or an input and you'd get a response. Really good with simple Q&A and autocomplete. Late 2024, this is when we saw the first agents, or what people like to call agents.

A lot of times, people would confuse agents with workflows, but I think that was kind of cleared up more recently in that agents now are able to take this autonomy, they take this independence, they operate iteratively, they adjust their trajectories, and they can perform these larger scope tasks and do things that simple or single response, single shot co-pilots would not be able to do.

So another way to look at this is that co-pilots were a little bit more collaborative, right? We had to interact at every single step of the process while we're working with co-pilots. If we ever were coding with ChatGPT, we'd have to maybe paste in every file, we would have to individually send in each response, and ChatGPT would have to work on things very step-by-step, right?

So we would just send an input or a prompt, maybe add some extra context, and after the LLM ran its inference, we would get the response. Agents then introduced this autonomous nature where we had these models, we also allowed it to retrieve this context. But on top of that, we introduced tool calling and function calling, the ability for agents to perform tasks, perform these functions, and actually be able to execute things that co-pilots or chatbots wouldn't necessarily be able to do.
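The contrast described above can be sketched in a few lines. This is a minimal illustration, not how Cascade is implemented: `stub_model`, `read_file`, `count_lines`, and the tiny in-memory file table are all hypothetical stand-ins. The point is only the shape: a co-pilot is one prompt in and one response out, while an agent loops, calling tools and feeding the results back in until it decides it is done.

```python
from typing import Callable

# Hypothetical tools. A real agent would expose file edits, search, a terminal, etc.
def read_file(path: str) -> str:
    files = {"app.py": "print('hello')"}  # in-memory stand-in for a code base
    return files.get(path, "")

def count_lines(text: str) -> str:
    return str(len(text.splitlines()))

TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": read_file,
    "count_lines": count_lines,
}

def stub_model(history: list[str]) -> tuple[str, str]:
    """Stand-in for an LLM: picks the next action from what has happened so far."""
    if len(history) == 1:
        return ("read_file", "app.py")
    if len(history) == 2:
        return ("count_lines", history[-1])
    return ("final", f"app.py has {history[-1]} line(s)")

def copilot(prompt: str) -> str:
    """Single shot: one prompt in, one response out, no tools."""
    return f"(answer to: {prompt})"

def agent(task: str, max_steps: int = 5) -> str:
    """Iterative loop: ask the model, run the tool it chose, feed the result back."""
    history = [task]
    for _ in range(max_steps):
        action, arg = stub_model(history)
        if action == "final":
            return arg
        history.append(TOOLS[action](arg))
    return "stopped: step budget exhausted"

print(agent("How many lines are in app.py?"))
```

The `max_steps` budget is a design choice in this sketch: an autonomous loop needs some stopping condition other than the model saying it is finished.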

When we launched, what Windsurf realized was that agents really didn't solve the problems that co-pilots couldn't solve, and co-pilots didn't do everything that we would want as developers either. And so we took the best of both worlds, in that we combined these models, retrieving proper context, and figuring out how to call tools properly, in addition to understanding the user, tracking the user's actions, and really understanding their intent as developers, what they're going to do.

So we essentially took the best of both worlds, right? And so AI Flows is what Windsurf and Cascade introduced back in November. And this is where a lot of other companies, a lot of other products that I'm sure you all know today, followed suit in that they realized this simple chat interface, or even having an agent and a chat just maybe didn't make sense.

It only made sense to provide one singular agentic interface that collaborated with you as a developer. And so Windsurf, our editor, it took the best of both worlds, in that it combined this collaborative power of a chatbot interface with some of these autonomous and tool calling capabilities of an agent, and molded them together to work in perfect unison, in perfect sync.

And this created a very seamless, a very unified experience where developers and AI could actually operate as one, could operate as a team, rather than like these guys. So how did Windsurf do this? The first thing we wanted to consider was this concept of flow awareness. What this meant was that we would have really comprehensive reasoning and understanding of the implicit user intent.

This is something that a lot of agents actually don't really consider right now, and even when we're building this on the side, we don't actually think of what is really important to the user. So our main emphasis when we were building Cascade was understanding the user, tracking their actions, their edits, their commands, the terminal commands they've run, anything that's in their clipboard, all the files that they've recently edited.

All these things are tracked by Cascade, so that Cascade develops an understanding of what the user has been doing, and that is all inputted into this agent's trajectory. So that the agent can then outline a much more relevant set of steps that aligns with what the user would be doing, or what they may be doing in the future.
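One way to picture this kind of tracking is a bounded log of recent user events that gets rendered into the prompt. The sketch below is only an illustration under that assumption. `UserEvent`, `ActivityLog`, and `build_prompt` are hypothetical names, and nothing here reflects Cascade's actual data model.

```python
from collections import deque
from dataclasses import dataclass

@dataclass(frozen=True)
class UserEvent:
    kind: str    # e.g. "edit", "terminal", "clipboard"
    detail: str  # what happened

class ActivityLog:
    """Keeps only the most recent events so the context stays small."""

    def __init__(self, max_events: int = 50) -> None:
        self._events: deque = deque(maxlen=max_events)

    def record(self, kind: str, detail: str) -> None:
        self._events.append(UserEvent(kind, detail))

    def as_context(self) -> str:
        return "\n".join(f"[{e.kind}] {e.detail}" for e in self._events)

def build_prompt(log: ActivityLog, request: str) -> str:
    """Prepend recent activity so the model can infer intent, not just read the request."""
    return f"Recent user activity:\n{log.as_context()}\n\nRequest: {request}"

log = ActivityLog(max_events=2)
log.record("edit", "auth.py")
log.record("terminal", "pytest tests/test_auth.py")
log.record("clipboard", "def login(user):")
print(build_prompt(log, "Fix the failing test"))
```

With `max_events=2`, the oldest event (the edit to `auth.py`) falls out of the log, which is the trade-off any bounded history makes between recency and completeness.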

And so when you use Cascade, when you use Windsurf Tab, it almost feels like Windsurf is predicting your next step. It's almost like how an LLM predicts the next token. In this case, Cascade is predicting or inferring what you would do next. When you're using Tab, it feels like Cascade or Windsurf is reading your mind, because so much of this context, the user's actions as well as this understanding of the user built up over time as the user interacts with Cascade, is inputted into the context window.

And this is what makes this agent so powerful is that it understands the user much better than a very general purpose or independent agent. On top of that, we have a state-of-the-art context engine. We have a talk on this tomorrow as well. But we approach context differently than other products in this space.

We don't just use a RAG or embedding-based search approach. We leverage a combination of multiple tools in tandem to really figure out what works best so that we can understand your code base, this explicit context, as strongly and comprehensively as possible. And what this helps us do as an agent is get more accurate results, more relevant suggestions for your code base or for any of these code suggestions.
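The talk does not say how the different retrieval tools are combined, so the following is a generic sketch and not Windsurf's context engine. It uses reciprocal rank fusion, a common way to merge ranked lists from several retrievers. The three example rankings (keyword search, embedding search, recently edited files) and the file names are made up for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists: each item scores 1 / (k + rank) per list it appears in."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical outputs of three different retrievers for the same query.
keyword_hits = ["auth.py", "login.py", "utils.py"]
embedding_hits = ["login.py", "session.py", "auth.py"]
recently_edited = ["login.py", "auth.py"]

print(reciprocal_rank_fusion([keyword_hits, embedding_hits, recently_edited]))
```

A file that ranks well across several signals (here `login.py`) ends up ahead of one that tops only a single list, which is the intuition behind using multiple tools in tandem rather than embeddings alone.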

We have reduced hallucinations. We don't have to do any guesswork. Because we understand the user, because we understand your code base very well, we are able to provide a much more tailored experience, an experience that actually outlines what the user may want. And it's grounded in this centralized source of truth, which is this code base that you're working in.

On top of that, we equip Cascade with all of the best tools for this agent to really perform that multi-step iterative kind of set of tasks where it's not just limited to that single response nature. It can actually call MCP servers. It can leverage workflows, something we recently introduced, where now we are bringing together, bridging that gap between the unpredictability of an agent and more of that deterministic nature of a workflow.

With workflows, we know exactly what's going to happen and when it's going to happen. And so with Windsurf, again, our main priority is to help developers as much as possible. We realize that with agents, it's very unpredictable what's going to happen and what they're going to do.

And so with workflows now, this bridges that gap. We can actually, as users, define a set of steps for the agent to follow as they're operating. On top of that, we allow Cascade to look at rules, right? These are rules that you can generate as users. They could be file-based rules.

They can be rules that you always want the model to look at, or rules that you maybe only sometimes want the model to look at. On top of that, Cascade can generate memories of you as a user, some of your preferences, memories of your code base so that it doesn't have to constantly index, doesn't have to constantly retrieve your code base.
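The two kinds of rules mentioned above, always-on and file-based, can be sketched as a small lookup. This is an assumption-laden illustration rather than Windsurf's rule format: the `Rule` class, the `glob` field, and the example rule texts are hypothetical. Matching uses Python's standard `fnmatch`, whose `*` also matches path separators.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Optional

@dataclass(frozen=True)
class Rule:
    text: str
    glob: Optional[str] = None  # None means the rule always applies

def rules_for(path: str, rules: list[Rule]) -> list[str]:
    """Return the rule texts that apply to the file at `path`."""
    return [r.text for r in rules if r.glob is None or fnmatch(path, r.glob)]

RULES = [
    Rule("Prefer small functions."),               # always-on
    Rule("Use type hints.", glob="*.py"),          # only for Python files
    Rule("Mock network calls.", glob="*_test.py"), # only for test files
]

print(rules_for("src/api/users.py", RULES))
print(rules_for("README.md", RULES))
```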

And on top of that, recently, there's a lot of talk about A2A, agent-to-agent, and multiple agents. Recently we introduced multiple simultaneous Cascades working in tandem. And so what this brings to us is it allows Cascade now to have these multiple different trajectories that understand each other, understand the user, and understand your code base, which allows us to be able to complete tasks faster, more efficiently, and get things done in the way that we want to get them done.

So all these tools, right, there's a lot that I left out here, like the ability to search the web and other things like that. They empower Cascade to be able to really give us this agentic feeling while it's operating, right? And really, not just, again, an independent agent, but one that understands us as users.

But as engineers, how do we make the most? How do we get the most out of these flows? Also known as engineer maxing, how do we really max out what we get out of this experience? The way I like to approach it, I'm sure you all are somewhat familiar with this kind of set of steps.

We like to first explore, discover our code base, scope out our tasks, then plan, then build, code, and then test out our changes or anything that the agent did. We can leverage Cascade at every one of these steps, right? When discovering, we want to leverage Cascade to understand our code base, to scope out our tasks, determine this definition of done, right?

That's what an agent does, right? It determines the definition of done and outlines a set of steps. How can we tell Cascade what that definition of done is and talk through some of these goals? On top of that, if we wanted to give more direct context, we can @-mention different files and directories.

It's also very important to plan with Cascade, right? A lot of people just say, hey, I'm just going to put in this prompt and let Cascade or let this agent just build everything for me. The best way to go about this is to work with the agent so that it understands you.

And so what that means is create a planning file, outline a set of tasks with maybe check boxes to tell Cascade, hey, you need to hit this set of tasks to accomplish this definition of done. Also outline rules to tell Cascade how to behave that align with your preferences.
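A planning file with check boxes is easy to inspect mechanically, which is part of why it works as a shared definition of done. The sketch below assumes a Markdown-style task list (`- [ ]` and `- [x]`). The plan contents, `parse_plan`, `remaining`, and `is_done` are all hypothetical examples, not a format Cascade requires.

```python
import re

# Matches lines such as "- [ ] task" or "* [x] task".
CHECKBOX = re.compile(r"^\s*[-*] \[( |x|X)\] (.+)$")

def parse_plan(text: str) -> list[tuple[bool, str]]:
    """Return (done, task) pairs for every check box line in the plan."""
    tasks = []
    for line in text.splitlines():
        match = CHECKBOX.match(line)
        if match:
            tasks.append((match.group(1) != " ", match.group(2).strip()))
    return tasks

def remaining(text: str) -> list[str]:
    """Tasks that are still unchecked."""
    return [task for done, task in parse_plan(text) if not done]

def is_done(text: str) -> bool:
    """The definition of done is met when the plan has tasks and all are checked."""
    tasks = parse_plan(text)
    return bool(tasks) and all(done for done, _ in tasks)

PLAN = """\
# Definition of done: users can reset their password
- [x] Add reset-token model
- [ ] Add POST /password-reset endpoint
- [ ] Write tests for expired tokens
"""

print(remaining(PLAN), is_done(PLAN))
```

The same check can be rerun at the end of a session to see whether every box was actually ticked before committing.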

And then we go into building. Work with Cascade to actually execute this plan that you've set for yourself. Edit files one by one or multiple files at a time and ensure Cascade is working with this planning document and actually checking all the boxes. And you'll see that Cascade actually asks you questions.

It will check in with you as a user, as a developer, right? And you make sure to answer it very clearly. Tell it, hey, you're doing this incorrectly. Or, hey, let's actually look at this first. Or let's look at this set of steps next, right? Windsurf Tab then provides this more hands-on experience with autocomplete.

And on top of that, you can leverage these simultaneous Cascades and these different MCP servers that allow you to pull additional context and empower Cascade with these additional tools to complete this agentic experience. Next, you want to generate tests with Cascade, right? Generate, run, and fix your tests one by one as it's iterating.

You can leverage workflows here to automate some of these tasks. And at the end, determine if Cascade hit that DoD, that definition of done. If it didn't, make sure Cascade understands what it did wrong, because Cascade is building this learning of you. It's building this embedding representation of you as a developer, and it's learning from that.

So in the future, it knows where to not go wrong and where to improve. And then lastly, you want to commit to Git, and make sure that you're pushing code that is functional. So you and Cascade, right, don't think of yourselves as separate entities. We want to think of you as merged, as a team, right?

As pair programmers, you're working together to accomplish tasks. And so this is how you and Cascade should be. This is how developers and Cascade should be. It shouldn't be separate where you tell Cascade, "Hey, just accomplish this set of tasks. I'll come back in 30 minutes, and we'll see what you did," right?

We're not at that level where LLMs can do that just yet. And so really work with Cascade. Make it your friend. And that is how you'll get the best development experience out of these agents. Thank you. We'll see you next time.

Mastering Engineering Flow with Windsurf - Eashan Sinha, Windsurf
