back to indexClaude Code & the evolution of agentic coding — Boris Cherny, Anthropic

00:00:35.180 |
I was struggling with what to talk about for an audience that 00:00:47.020 |
already knows QuadCode, already knows AI, and all the coding 00:00:50.000 |
tools, and agentic coding, and stuff like that. 00:00:56.680 |
So here's my tilde R. The model is moving really fast. 00:01:02.220 |
It's getting better at coding very, very quickly, 00:01:07.520 |
And the product is kind of struggling to keep up. 00:01:09.880 |
We're trying to figure out what product to build that's 00:01:13.600 |
And we feel like there's so many more products that could be 00:01:15.800 |
built for models that are this good at coding. 00:01:22.640 |
And with QuadCode, we're trying to stay unopinionated about what 00:01:26.840 |
the product should look like, because we don't know. 00:01:32.800 |
And so for everyone that didn't raise your hand, I think that's 00:01:35.260 |
like 10 of you, this is how you get QuadCode. 00:01:39.020 |
You can head to the quad.ai/code to install it. 00:01:42.320 |
You can run this incantation to install from NPM. 00:01:45.860 |
As of yesterday, we support QuadPro plans, so you can try it on that. 00:01:50.060 |
We support QuadMax, so yeah, just try it out. 00:02:01.460 |
And if you look at where programming started back in the 1930s, 00:02:06.020 |
'40s, there was like switchboards, and it was this physical thing. 00:02:11.060 |
And then sometime in the 1950s, punch cards became a thing. 00:02:15.300 |
And my grandpa, actually, in the Soviet Union, 00:02:18.420 |
he was one of the first programmers in the Soviet Union. 00:02:21.480 |
And my mom would tell me stories about when she grew up 00:02:23.820 |
in the 1970s or whatever, she would bring these big stacks 00:02:26.720 |
of punch cards home from work, and she would draw all over them 00:02:33.940 |
And that's what programming was back in the 1950s, '60s, '70s, 00:02:43.760 |
So first there was assembly, so programming moves 00:02:50.980 |
And then the level of abstraction just went up. 00:02:53.360 |
So we got to COBOL, then we got to typed languages, 00:03:01.280 |
There was the Haskell family, and JavaScript and Java, 00:03:05.320 |
the evolution of the C family, and then Python. 00:03:12.600 |
Like, when I write TypeScript, it kind of feels 00:03:14.220 |
like writing Rust, and that kind of feels like writing Swift, 00:03:18.740 |
The abstractions have started to converge a bit. 00:03:20.320 |
If we think about the UX of programming languages, 00:03:29.900 |
Back in the 1950s, you used something like a typewriter 00:03:40.980 |
And then Pascal and all these different IDs appeared 00:03:46.320 |
that let you interact with your programs and your software 00:03:51.720 |
And I feel like programming languages have sort of leveled 00:03:56.100 |
And the UX of programming is also on an exponential. 00:04:02.720 |
Does anyone know what was the first text editor? 00:04:14.440 |
Well, before text editors, this is what programming looked like. 00:04:20.040 |
This was like the MacBook of the time for programming punch cards. 00:04:31.960 |
This was Kem Thompson at the labs invented this. 00:04:37.240 |
If you open your MacBook, you can actually still type Ed. 00:04:39.360 |
This is still distributed on Unix as part of Unix systems. 00:04:43.660 |
And this is crazy, because this thing was invented 50 years ago. 00:04:56.280 |
And it was built for teletype machines, which were literally 00:05:03.420 |
And this is the first software manifestation of a UX for programming 00:05:07.400 |
So it was really built for these machines that didn't support scroll back and cursors or anything 00:05:15.620 |
For all the Vim fans, I'm going to jump ahead of Vim. 00:05:18.900 |
Emacs was a big innovation around the same time. 00:05:21.360 |
I think in 1980, Smalltalk 80 was a big jump forward. 00:05:26.040 |
This is one of the first, I think the first, graphical interface for programming software. 00:05:32.000 |
And for anyone that's tried to set up, like, live reload with React or Redux or any of this 00:05:43.540 |
And we were still kind of struggling to get that to work with, like, React JS nowadays. 00:05:50.760 |
And obviously, like, the language, it had object-oriented programming and a bunch of new concepts. 00:05:54.560 |
But on the UI side, there's a lot of new things, too. 00:05:58.480 |
In '91, I think Visual Basic was the first code editor that introduced a graphical paradigm 00:06:05.740 |
So before, people were using text-based editors, Vim and things like that were still very popular, 00:06:15.680 |
Eclipse brought Type Ahead to the mainstream. 00:06:24.040 |
And then it can rank the symbols and re-rank them. 00:06:27.860 |
I think this was also the first big third-party ecosystem for IDs. 00:06:30.720 |
Copilot was a big jump forward with single-line Type Ahead and then multi-line Type Ahead. 00:06:41.160 |
And I think Devon was probably the first IDE that introduced this next concept and this next abstraction 00:06:47.960 |
to the world, which is to program you don't have to write code, you can write natural language, and that becomes code. 00:06:54.760 |
And this is something people have been trying to figure out for decades. 00:06:57.260 |
I think Devon is the first product that broke through and took this mainstream. 00:07:12.000 |
We talked about UX and we talked about programming languages and verification is a part of this too. 00:07:18.000 |
So verification has started with manual debugging and physically inspecting outputs. 00:07:22.000 |
And now there's a lot of probabilistic verification like fuzzing and vulnerability testing. 00:07:28.800 |
And like Netflix's chaos testing and things like that. 00:07:34.800 |
And so with all this in mind, Cloud Code's approach is a little different. 00:07:38.800 |
It's to start with the terminal and to give you as low-level access to the model as possible in a way that you can still be productive. 00:07:48.800 |
We also want to get -- we want to be unopinionated and we want to get out of the way. 00:07:54.600 |
We don't try to put a bunch of scaffolding in the way. 00:07:57.600 |
Some of this is we're a model company at Anthropic and, you know, we make models and we want people to experience those models. 00:08:03.600 |
But I think another part is we actually just don't know. 00:08:09.600 |
And so Cloud Code, it's intentionally simple. 00:08:14.600 |
It shows off the model in the ways that matter to us, which is they can use all your tools and they can fit into all your workflows. 00:08:21.400 |
So you can figure out how to use the model in this world where the UX of using code and using models is changing so fast. 00:08:37.200 |
I have it -- I have this, like, framed and taped to the side of my wall. 00:08:45.200 |
And the model increases in capability exponentially. 00:08:50.200 |
Everything around the model is also increasing exponentially. 00:08:53.200 |
And the more general thing even around the model usually wins. 00:09:05.000 |
And, you know, this is the thing everyone knows. 00:09:07.000 |
So you can install Cloud Code and then you just run Cloud in any terminal. 00:09:18.800 |
It works in your VS Code terminal and your cursor terminal. 00:09:23.800 |
When you run Cloud Code in the IDE, we do a little bit more. 00:09:30.800 |
So we kind of take over the IDE a little bit. 00:09:33.800 |
And, you know, diffs, instead of being in line in the terminal, 00:09:36.800 |
they're going to be big and beautiful and show up in the IDE itself. 00:09:46.400 |
And you'll notice this isn't as polished as something like, again, 00:09:54.600 |
This is to let you experience the model in a low-level, raw way. 00:09:58.600 |
And this is sort of the minimal that we had to do to let you experience them. 00:10:01.600 |
We announced a couple weeks ago that you can now use Cloud on GitHub. 00:10:09.600 |
Can I get a show of hands who's tried this already? 00:10:12.400 |
So for everyone that hasn't tried this, all you have to do is open up Cloud. 00:10:18.200 |
You run this one slash command, install GitHub app. 00:10:20.200 |
You pick the repo, and then you can run Cloud in any repo. 00:10:37.800 |
And again, here we intentionally built something really simple because we don't know what the 00:10:42.800 |
And this is the minimal possible thing that helps us learn, but also is useful for engineers 00:10:56.600 |
And this is something that you can use to build on Cloud code without -- if you don't want 00:11:02.600 |
to use, like, you know, the terminal app or the IDE integration or GitHub, you can just 00:11:08.600 |
People have built all sorts of UIs, all sorts of awesome integrations. 00:11:12.400 |
And all this is is you run Cloud-P, and you can use it programmatically. 00:11:18.500 |
And so, like, something I use it for, for example, is, for instance, in Triage, I'll take my GitHub 00:11:25.800 |
I'll pipe it into Cloud-P because it's, like, it's a Unix utility. 00:11:37.500 |
No one has really figured out how to use models as a Unix utility. 00:11:41.900 |
This is another aspect of code as UX that we just don't know yet. 00:11:47.000 |
And so, again, we just built the simplest possible thing so we can learn and so people can try 00:11:55.500 |
Okay, I wanted to give a few tips for how to use quad code. 00:11:58.200 |
This is a talk about quad code, so this is kind of zooming back in. 00:12:03.100 |
And this is actually true for, I think, a lot of coding agents, but this is kind of 00:12:06.400 |
custom to the way that I personally use quad code. 00:12:08.800 |
So the simplest way to use this -- it seems like most of this room is very familiar with 00:12:15.900 |
But the simplest way to introduce new people that have not used this kind of tool before is 00:12:21.800 |
And so at Anthropic, we teach quad code to every engineer on day one, and it's shortened onboarding 00:12:27.900 |
times from, like, two or three weeks to, like, two days maybe. 00:12:29.900 |
And also, I don't get bugged about questions anymore. 00:12:33.300 |
You can just ask quad, and honestly, like, I'll just ask quad, too. 00:12:37.600 |
And then this is something that I do pretty much every day on Monday. 00:12:43.300 |
I'll just ask quad what did I ship that week. 00:12:44.300 |
It'll look through my git commits, and it'll tell me so I don't have to keep track. 00:12:51.700 |
The second thing is teach quad how to use your tools. 00:12:53.700 |
And this is something that has not really existed before when you think about the UX of programming. 00:12:58.700 |
With every IDE, there's sort of like a plug-in ecosystem. 00:13:01.700 |
You know, for Emacs, there's this kind of Lispy dialect that you use to make plug-ins. 00:13:06.500 |
If you use Eclipse or VS Code, you have to make plug-ins. 00:13:08.800 |
For this new kind of coding tool, it can just use all your tools. 00:13:11.600 |
So you give it bash tools, you give it MCP tools. 00:13:14.500 |
Something I'll often say is here's the CLI tool, Claude, run dash dash help, take what you learn, 00:13:30.700 |
Of course, if you have groups of tools, or if you have fancier functionality like streaming 00:13:34.000 |
and things like this, you can just use MCP as well. 00:13:40.300 |
Traditional coding tools focused a lot on actually writing the code, and I think the new kinds 00:13:46.200 |
of coding tools, they do a lot more than that. 00:13:48.200 |
And I think this is a lot of where people that are new to these tools struggle to figure out 00:13:53.500 |
So there's a few workflows that I've discovered for using Cloud Code most effectively for myself. 00:13:57.800 |
The first one is have Claude code explore and make a plan and run it by me before it writes code. 00:14:07.400 |
So typically we see extended thinking work really well if Claude already has something in context. 00:14:12.400 |
So have it use tools, have it pull things into context, and then think. 00:14:15.900 |
If it's thinking up front, you're probably just kind of wasting tokens. 00:14:21.000 |
But if there's a lot of context, it does help a bunch. 00:14:27.000 |
I know I try to use TDD, it's like, it's pretty hard to use in practice. 00:14:32.000 |
But I think now with coding tools, it actually works really well. 00:14:35.000 |
And maybe the reason is it's not me doing it, it's the model doing it. 00:14:38.000 |
And so the workflow here is tell Claude to write some tests and kind of describe it. 00:14:43.600 |
And just make it really clear, like the tests aren't going to pass yet. 00:14:45.600 |
Don't try to run the test because it's going to try to run the test. 00:14:47.600 |
Tell it, like, you know, it's not going to pass. 00:14:49.600 |
Write the test first, commit, and then write the code, and then commit. 00:14:52.600 |
And it's kind of a general case of if Claude has a target to iterate against, it can do much better. 00:14:58.600 |
So if there's some way to verify the output, like a unit test, integration test, a way to screenshot in your iOS simulator, 00:15:05.200 |
a way to screenshot in Puppeteer, just some way to see its output. 00:15:11.200 |
Like, we taught Claude how to use a 3D printer, and then it has a little camera to see the output. 00:15:15.200 |
If it can see the output and you let it iterate, the result will be much better than if it couldn't iterate. 00:15:20.200 |
The first shot will be all right, but the second or third shot will be pretty good. 00:15:24.200 |
So give it some kind of target to iterate against. 00:15:27.200 |
Today, we launched plan mode in Claude code, and this is a way to do the first kind of workflow more easily. 00:15:38.800 |
So anytime, hit Shift-Tab, and Claude will switch to plan mode. 00:15:43.800 |
So you can ask it to do something, but it won't actually do that yet. 00:15:46.200 |
It'll just make a plan, and it'll wait for approval. 00:15:48.800 |
So restart Claude to get the update, run Shift-Tab. 00:15:51.800 |
Okay, and then the final tip is give Claude more context. 00:16:00.400 |
So take this file called Claude-MD, put it in the root of your repo. 00:16:13.200 |
So if you put files, like just regular markdown files in these special folders, 00:16:17.200 |
.claude/commands, it'll be available under the Flash Menu. 00:16:25.600 |
And then to add stuff to Claude-MD, you can always type the pound sign to ask Claude to memorize 00:16:33.800 |
something, and it'll prompt you which memory this should be added to. 00:16:36.800 |
And you can see this is us trying to figure out how to use memory, how to use this new concept 00:16:41.200 |
that is new to coding models, did not exist in previous IDEs, how to make the UX of this 00:16:49.600 |
This is our first version, but it's the first version that works. 00:16:52.000 |
And so we're going to be iterating on this, and we really want to hear feedback about what 00:17:09.500 |
Unfortunately, we only have one minute left, so someone sent a question on Slack. 00:17:14.700 |
The question is, as I delegate more and more to Claude code, as it runs for 10 minutes and 00:17:19.900 |
I have 10 of these active, how do I use the tool? 00:17:28.400 |
I think this is something that we actually see in a lot of our power users, that they tend 00:17:32.400 |
You don't just have a single Claude open, but you have a couple terminal tabs, either with 00:17:36.400 |
a few checkouts of Claude or of your code base, or it's the same code base but with different 00:17:44.400 |
This is also a lot easier with GitHub Actions, because you can just spawn a bunch of actions, 00:17:49.900 |
Typically, we don't need to coordinate between these quads, I think, for most use cases. 00:17:53.900 |
If you do want to coordinate, the best way is just ask them to write to a Markdown file, 00:18:02.900 |
And once again, give it up for Boris from Anthropic.