Claude Code & the evolution of agentic coding

Hello. This is awesome. This is a big crowd. Who here has used Claude Code before? Jesus. Awesome. That's what I would like to see. Cool. So my name is Boris. I'm a member of technical staff at Anthropic and creator of Claude Code. I was struggling with what to talk about for an audience that already knows Claude Code, already knows AI, and all the coding tools, and agentic coding, and stuff like that.

So I'm going to zoom out a little bit, and then we'll zoom back in. So here's my TL;DR. The model is moving really fast. It's on an exponential. It's getting better at coding very, very quickly, as everyone that uses the model knows. And the product is kind of struggling to keep up.

We're trying to figure out what product to build that's good enough for a model like this. And we feel like there's so many more products that could be built for models that are this good at coding. And we're kind of building the bare minimum. And I'll kind of talk about why.

And with Claude Code, we're trying to stay unopinionated about what the product should look like, because we don't know. And so for everyone that didn't raise your hand, I think that's like 10 of you, this is how you get Claude Code. You can head to claude.ai/code to install it. You can run this incantation to install from npm.

As of yesterday, we support Claude Pro plans, so you can try it on that. We support Claude Max, so yeah, just try it out. Tell us what you think.

So programming is changing, and it's changing faster and faster. And if you look at where programming started back in the 1930s, '40s, there were, like, switchboards, and it was this physical thing.

There was no such thing as software. And then sometime in the 1950s, punch cards became a thing. And my grandpa, actually, was one of the first programmers in the Soviet Union. And my mom would tell me stories about when she grew up in the 1970s or whatever: he would bring these big stacks of punch cards home from work, and she would draw all over them with crayons.

And that was growing up for her. And that's what programming was back in the 1950s, '60s, '70s, even. But sometime in the late '50s, we started to see these higher-level languages emerge. So first there was assembly, so programming moved from hardware, to punch cards, which are still physical, to software.

And then the level of abstraction just went up. So we got to COBOL, then we got to typed languages, we got to C++. In the early '90s, there was this explosion of these new language families. There was the Haskell family, and JavaScript and Java, the evolution of the C family, and then Python.

And I think nowadays, if you kind of squint, all the languages sort of look the same. Like, when I write TypeScript, it kind of feels like writing Rust, and that kind of feels like writing Swift, and that kind of feels like writing Go. The abstractions have started to converge a bit.

If we think about the UX of programming languages, this has also evolved. Back in the 1950s, you used something like a typewriter to punch holes in punch cards. And that was programming back in the day. And at some point, text editors appeared. And then Pascal and all these different IDEs appeared that let you interact with your programs and your software in new ways.

And each one kind of brought something. And I feel like programming languages have sort of leveled out, but the model is on an exponential. And the UX of programming is also on an exponential. And I'll talk a little bit more about that. Does anyone know what was the first text editor?

OK. I heard ed from someone. I think he read the screen. Well, before text editors, this is what programming looked like. So this was the IBM 029. This was kind of top of the line. This was like the MacBook of the time for programming punch cards. Everyone had this.

You can still find it in museums somewhere. And yeah, this is ed. This is the first text editor. Ken Thompson at Bell Labs invented this. And it kind of looks familiar. If you open your MacBook, you can actually still type ed. This is still distributed as part of Unix systems.

And this is crazy, because this thing was invented 50 years ago. And this is nuts. Like, there's no cursor. There's no scroll back. There's no fancy commands. There's no type ahead. There's pretty much nothing. This is the simple text editor of the time. And it was built for teletype machines, which were literally physical machines that printed on paper.

That's how your program was printed. And this is the first software manifestation of a UX for programming software. So it was really built for these machines that didn't support scroll back and cursors or anything like that. For all the Vim fans, I'm going to jump ahead of Vim. Vim was a big innovation.

Emacs was a big innovation around the same time. I think in 1980, Smalltalk-80 was a big jump forward. This is one of the first, I think the first, graphical interfaces for programming software. And for anyone that's tried to set up, like, live reload with React or Redux or any of this stuff, this thing had live reload in 1980.

And it worked. And we were still kind of struggling to get that to work with, like, React JS nowadays. This was a big jump forward. And obviously, like, the language, it had object-oriented programming and a bunch of new concepts. But on the UI side, there's a lot of new things, too.

In '91, I think Visual Basic was the first code editor that introduced a graphical paradigm to the mainstream. So before, people were using text-based editors, Vim and things like that were still very popular, despite things like Smalltalk. So this kind of brought it mainstream. This is what I grew up with.

Eclipse brought typeahead to the mainstream. This isn't using AI typeahead. This is not Cursor or Windsurf. This is just using static analysis. So it's indexing your symbols. And then it can rank the symbols and re-rank them. And it knows what symbols to show. I think this was also the first big third-party ecosystem for IDEs.

Copilot was a big jump forward with single-line typeahead and then multi-line typeahead. And I think Devin was probably the first product that introduced this next concept and this next abstraction to the world, which is that to program, you don't have to write code: you can write natural language, and that becomes code.

And this is something people have been trying to figure out for decades. I think Devin is the first product that broke through and took this mainstream. And the UX has evolved quickly. But I think it's about to get even faster.

We talked about UX and we talked about programming languages, and verification is a part of this too.

So verification started with manual debugging and physically inspecting outputs. And now there's a lot of probabilistic verification like fuzzing and vulnerability testing, and like Netflix's chaos testing and things like that.

And so with all this in mind, Claude Code's approach is a little different. It's to start with the terminal and to give you as low-level access to the model as possible in a way that you can still be productive.

So we want the model to be useful for you. We also want to be unopinionated and we want to get out of the way. So we don't give you a bunch of flashy UI. We don't try to put a bunch of scaffolding in the way.

Some of this is that we're a model company at Anthropic and, you know, we make models and we want people to experience those models. But I think another part is we actually just don't know. Like, we don't know what the right UX is. So we're starting simple. And so Claude Code, it's intentionally simple.

It's intentionally general. It shows off the model in the ways that matter to us, which is that it can use all your tools and it can fit into all your workflows. So you can figure out how to use the model in this world where the UX of using code and using models is changing so fast.

And so this is my second point. The model just keeps getting better. And this is the bitter lesson. I have this, like, framed and taped to the side of my wall. Because the more general model always wins. And the model increases in capability exponentially. And there are many corollaries to this.

Everything around the model is also increasing exponentially. And the more general thing even around the model usually wins. So with Claude Code, there's one product. And there's a lot of ways to use it. So there's a terminal product. And, you know, this is the thing everyone knows. So you can install Claude Code and then you just run claude in any terminal.

We're unopinionated, so it works in iTerm2. It works in WSL. It works over SSH and tmux sessions. It works in your VS Code terminal and your Cursor terminal. This works anywhere, in any terminal. When you run Claude Code in the IDE, we do a little bit more. So we kind of take over the IDE a little bit.

And, you know, diffs, instead of being inline in the terminal, they're going to be big and beautiful and show up in the IDE itself. And we also ingest diagnostics. So we kind of try to take advantage of that. And you'll notice this isn't as polished as something like, again, Cursor or Windsurf.

These are awesome products. And I use these every day. This is to let you experience the model in a low-level, raw way. And this is sort of the minimum that we had to do to let you experience that.

We announced a couple weeks ago that you can now use Claude on GitHub.

Can I get a show of hands who's tried this already? So for everyone that hasn't tried this, all you have to do is open up Claude. You run this one slash command, /install-github-app. You pick the repo, and then you can run Claude in any repo. This is running on your compute.

Your data stays on your compute. It does not go to us. So it's kind of a nice experience. And it lets you use your existing stack. You don't have to change stuff around. It takes a few minutes to set up. And again, here we intentionally built something really simple because we don't know what the UX is yet.

And this is the minimal possible thing that helps us learn, but also is useful for engineers to do day-to-day work. Like, I use this every day.

The extreme version of this is our SDK. And this is something that you can use to build on Claude Code. If you don't want to use, like, you know, the terminal app or the IDE integration or GitHub, you can just roll your own integration.

You can build it however you want. People have built all sorts of UIs, all sorts of awesome integrations. And all this is is you run claude -p, and you can use it programmatically. And so something I use it for, for example, is incident triage: I'll take my GCP logs.

I'll pipe them into claude -p because it's, like, a Unix utility. So you can pipe in, you can pipe out. And then I'll, like, jq the result. So it's kind of cool. Like, this is a new way to use models. This is maybe 10% explored. No one has really figured out how to use models as a Unix utility.
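As a rough illustration of the pipe-in, pipe-out pattern described here, the sketch below sends log text to a command's stdin and parses the reply as JSON, which is the step jq would handle in a shell pipeline. Only `claude -p` comes from the talk; the prompt wording, the function names, and the expectation that the reply is a bare JSON object are assumptions made for the example.

```python
import json
import subprocess
from typing import Sequence


def pipe_through(command: Sequence[str], text: str) -> str:
    """Send `text` to the command's stdin and return what it writes to stdout."""
    result = subprocess.run(
        list(command),
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise if the command exits with a non-zero status
    )
    return result.stdout


def summarize_logs(log_text: str, command: Sequence[str]) -> dict:
    """Pipe logs into a model CLI and parse its reply as JSON."""
    return json.loads(pipe_through(command, log_text))


# Assumed invocation: `claude -p` is from the talk; the prompt text and the
# assumption that the model answers with bare JSON are illustrative only.
CLAUDE_COMMAND = [
    "claude",
    "-p",
    "Reply with only a JSON object with keys 'errors' and 'summary' "
    "describing the logs on stdin.",
]
```

Calling `summarize_logs(open("app.log").read(), CLAUDE_COMMAND)` would then return a dictionary, provided the reply really is valid JSON; `app.log` is a hypothetical file name.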

This is another aspect of code as UX that we just don't know yet. And so, again, we just built the simplest possible thing so we can learn and so people can try it out and see what works for you.

Okay, I wanted to give a few tips for how to use Claude Code.

This is a talk about Claude Code, so this is kind of zooming back in. And this is actually true for, I think, a lot of coding agents, but this is kind of custom to the way that I personally use Claude Code. So the simplest way to use this: it seems like most of this room is very familiar with Claude Code and similar coding agents.

But the simplest way to introduce new people that have not used this kind of tool before is to do codebase Q&A. And so at Anthropic, we teach Claude Code to every engineer on day one, and it's shortened onboarding times from, like, two or three weeks to, like, two days maybe.

And also, I don't get bugged about questions anymore. You can just ask Claude, and honestly, like, I'll just ask Claude, too. And then this is something that I do pretty much every Monday. We have a standup every week. I'll just ask Claude what I shipped that week.

It'll look through my git commits, and it'll tell me so I don't have to keep track.

The second thing is teach Claude how to use your tools. And this is something that has not really existed before when you think about the UX of programming. With every IDE, there's sort of like a plug-in ecosystem.

You know, for Emacs, there's this kind of Lispy dialect that you use to make plug-ins. If you use Eclipse or VS Code, you have to make plug-ins. For this new kind of coding tool, it can just use all your tools. So you give it bash tools, you give it MCP tools.

Something I'll often say is: here's the CLI tool, Claude; run --help, take what you learn, and then put it in CLAUDE.md. And now Claude knows how to use the tool. That's all it takes. You don't have to build a bridge. You don't have to build an extension.
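In the talk, Claude performs this step itself when asked. Purely to show the mechanics, here is a minimal sketch that captures a tool's `--help` output and appends it to a memory file; the function names and the section heading format are assumptions, not anything Claude Code prescribes.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def capture_help(tool: Sequence[str]) -> str:
    """Run `<tool> --help` and return whatever it prints."""
    result = subprocess.run(
        [*tool, "--help"], capture_output=True, text=True, check=False
    )
    # Some tools print their help text to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


def append_tool_notes(memory_file: Path, name: str, help_text: str) -> None:
    """Append a section describing the tool to the memory file."""
    # The heading layout below is an arbitrary choice for this sketch.
    section = f"\n## Tool: {name}\n\n{help_text}\n"
    with memory_file.open("a", encoding="utf-8") as handle:
        handle.write(section)
```

For example, `append_tool_notes(Path("CLAUDE.md"), "mytool", capture_help(["mytool"]))` would record the help text of a hypothetical `mytool` CLI.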

There's nothing fancy like that. Of course, if you have groups of tools, or if you have fancier functionality like streaming and things like this, you can just use MCP as well.

Traditional coding tools focused a lot on actually writing the code, and I think the new kinds of coding tools, they do a lot more than that.

And I think this is a lot of where people that are new to these tools struggle to figure out how to use them. So there's a few workflows that I've discovered for using Claude Code most effectively for myself. The first one is have Claude Code explore and make a plan and run it by me before it writes code.

You can also ask it to use thinking. So typically we see extended thinking work really well if Claude already has something in context. So have it use tools, have it pull things into context, and then think. If it's thinking up front, you're probably just kind of wasting tokens. And it's not going to be that useful.

But if there's a lot of context, it does help a bunch.

The second one is TDD. I know, I try to use TDD, and it's pretty hard to use in practice. But I think now with coding tools, it actually works really well. And maybe the reason is it's not me doing it, it's the model doing it.

And so the workflow here is tell Claude to write some tests and kind of describe them. And just make it really clear that the tests aren't going to pass yet. Tell it not to try to run the tests, because otherwise it's going to try to run them. Tell it, like, you know, they're not going to pass.

Write the test first, commit, and then write the code, and then commit. And it's kind of a general case of: if Claude has a target to iterate against, it can do much better. So give it some way to verify the output, like a unit test, an integration test, a way to screenshot in your iOS simulator, a way to screenshot in Puppeteer, just some way to see its output.

We actually did this for robots. Like, we taught Claude how to use a 3D printer, and then it has a little camera to see the output. If it can see the output and you let it iterate, the result will be much better than if it couldn't iterate. The first shot will be all right, but the second or third shot will be pretty good.
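A small sketch of the test-first workflow described above, using a hypothetical `slugify` helper that is not from the talk. The tests are written first and would be committed while still failing; the implementation follows as a separate step and is iterated on until the tests pass.

```python
import re
import unittest


# Step 1: the tests are written (and committed) first. At that point
# `slugify` does not exist yet, so they are expected to fail.
class SlugifyTest(unittest.TestCase):
    def test_lowercases_and_joins_words(self):
        self.assertEqual(slugify("Hello, World!"), "hello-world")

    def test_collapses_repeated_separators(self):
        self.assertEqual(slugify("a  --  b"), "a-b")

    def test_empty_input(self):
        self.assertEqual(slugify(""), "")


# Step 2: the implementation is written afterwards, iterated on until the
# tests above pass, and committed separately.
def slugify(text: str) -> str:
    """Lowercase `text` and join its alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)
```

The tests serve as the target to iterate against: each run gives concrete pass or fail feedback rather than a one-shot guess.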

So give it some kind of target to iterate against.

Today, we launched plan mode in Claude Code, and this is a way to do the first kind of workflow more easily. So anytime, hit Shift-Tab, and Claude will switch to plan mode. So you can ask it to do something, but it won't actually do that yet.

It'll just make a plan, and it'll wait for approval. So restart Claude to get the update, and hit Shift-Tab.

Okay, and then the final tip is give Claude more context. There's a bunch of ways to do this. CLAUDE.md is the easiest way. So take this file called CLAUDE.md, put it in the root of your repo.

You can also put it in subfolders. Those will get pulled in on demand. You can put it in your home folder. This will get pulled in as well. And then you can also use slash commands. So if you put files, like just regular Markdown files, in these special folders, .claude/commands, they'll be available under the slash menu.
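To make the folder layout concrete, here is a minimal sketch that writes a Markdown prompt into `.claude/commands` under a repository root. The folder name is from the talk; the function name, the command name, and the prompt text are hypothetical.

```python
from pathlib import Path


def add_slash_command(repo_root: Path, name: str, prompt: str) -> Path:
    """Write `prompt` to .claude/commands/<name>.md under the repo root."""
    commands_dir = repo_root / ".claude" / "commands"
    commands_dir.mkdir(parents=True, exist_ok=True)
    command_file = commands_dir / f"{name}.md"
    # A command file is just regular Markdown text.
    command_file.write_text(prompt.rstrip() + "\n", encoding="utf-8")
    return command_file
```

For instance, `add_slash_command(Path("."), "triage", "Summarize the failing checks and suggest a fix.")` would create `.claude/commands/triage.md` in the current repository.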

So pretty cool. This is useful for reusable workflows. And then to add stuff to CLAUDE.md, you can always type the pound sign to ask Claude to memorize something, and it'll prompt you for which memory this should be added to. And you can see this is us trying to figure out how to use memory, how to use this new concept that is new to coding models and did not exist in previous IDEs, and how to make the UX of this work.

And you can tell this is still pretty rough. This is our first version, but it's the first version that works. And so we're going to be iterating on this, and we really want to hear feedback about what works about this UX and what doesn't. Thanks.

Thank you, Boris. Unfortunately, we only have one minute left, so someone sent a question on Slack.

The question is, as I delegate more and more to Claude Code, as it runs for 10 minutes and I have 10 of these active, how do I use the tool? You got 50 seconds.

Yeah. It's pretty cool. I think this is something that we actually see in a lot of our power users, that they tend to multi-Claude.

You don't just have a single Claude open, but you have a couple terminal tabs, either with a few checkouts of your codebase, or it's the same codebase but with different worktrees. And you have Claude doing stuff in parallel. This is also a lot easier with GitHub Actions, because you can just spawn a bunch of actions, and get Claude to do a bunch of stuff.
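One way to set up the "same codebase, different worktrees" arrangement is with `git worktree add`. The sketch below builds one such command per task and can run them; the directory naming scheme, the task names, and the function names are assumptions for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def worktree_commands(repo: Path, tasks: List[str]) -> List[List[str]]:
    """Build one `git worktree add` command per task name."""
    commands = []
    for task in tasks:
        # Each worktree lives next to the repo, e.g. ../app-fix-login.
        target = repo.parent / f"{repo.name}-{task}"
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", task, str(target)]
        )
    return commands


def create_worktrees(repo: Path, tasks: List[str]) -> None:
    """Run the commands so each task gets its own branch and directory."""
    for command in worktree_commands(repo, tasks):
        subprocess.run(command, check=True)
```

Each resulting directory can then be opened in its own terminal tab with a separate Claude session working on that branch.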

Typically, we don't need to coordinate between these Claudes, I think, for most use cases. If you do want to coordinate, the best way is just ask them to write to a Markdown file, and that's it.

Awesome. Yeah. Both things work. Thank you so much. And once again, give it up for Boris from Anthropic.

Thank you so much. We'll see you next time.

Claude Code & the evolution of agentic coding — Boris Cherny, Anthropic
