back to index

Claude Code & the evolution of agentic coding — Boris Cherny, Anthropic


Whisper Transcript | Transcript Only Page

00:00:01.200 | Hello.
00:00:17.700 | This is awesome.
00:00:21.640 | This is a big crowd.
00:00:22.880 | Who here has used quad code before?
00:00:26.120 | Jesus.
00:00:27.160 | Awesome.
00:00:28.160 | That's what I would like to see.
00:00:30.400 | Cool.
00:00:30.900 | So my name is Boris.
00:00:31.740 | I'm a member of technical staff at Anthropic
00:00:33.980 | and creator of QuadCode.
00:00:35.180 | I was struggling with what to talk about for an audience that
00:00:47.020 | already knows QuadCode, already knows AI, and all the coding
00:00:50.000 | tools, and agentic coding, and stuff like that.
00:00:51.720 | So I'm going to zoom out a little bit,
00:00:54.240 | and then we'll zoom back in.
00:00:56.680 | So here's my tilde R. The model is moving really fast.
00:01:00.940 | It's on exponential.
00:01:02.220 | It's getting better at coding very, very quickly,
00:01:04.620 | as everyone that uses the model knows.
00:01:07.520 | And the product is kind of struggling to keep up.
00:01:09.880 | We're trying to figure out what product to build that's
00:01:11.860 | good enough for a model like this.
00:01:13.600 | And we feel like there's so many more products that could be
00:01:15.800 | built for models that are this good at coding.
00:01:18.440 | And we're kind of building the bare minimum.
00:01:20.260 | And I'll kind of talk about why.
00:01:22.640 | And with QuadCode, we're trying to stay unopinionated about what
00:01:26.840 | the product should look like, because we don't know.
00:01:32.800 | And so for everyone that didn't raise your hand, I think that's
00:01:35.260 | like 10 of you, this is how you get QuadCode.
00:01:39.020 | You can head to the quad.ai/code to install it.
00:01:42.320 | You can run this incantation to install from NPM.
00:01:45.860 | As of yesterday, we support QuadPro plans, so you can try it on that.
00:01:50.060 | We support QuadMax, so yeah, just try it out.
00:01:53.020 | Tell us what you think.
00:01:56.640 | So programming is changing, and it's
00:01:59.080 | changing faster and faster.
00:02:01.460 | And if you look at where programming started back in the 1930s,
00:02:06.020 | '40s, there was like switchboards, and it was this physical thing.
00:02:08.860 | There was no such thing as software.
00:02:11.060 | And then sometime in the 1950s, punch cards became a thing.
00:02:15.300 | And my grandpa, actually, in the Soviet Union,
00:02:18.420 | he was one of the first programmers in the Soviet Union.
00:02:21.480 | And my mom would tell me stories about when she grew up
00:02:23.820 | in the 1970s or whatever, she would bring these big stacks
00:02:26.720 | of punch cards home from work, and she would draw all over them
00:02:31.000 | with crayons.
00:02:32.860 | And that was growing up for her.
00:02:33.940 | And that's what programming was back in the 1950s, '60s, '70s,
00:02:37.980 | even.
00:02:39.160 | But sometime in the late '50s, we started
00:02:41.200 | to see these higher level languages emerge.
00:02:43.760 | So first there was assembly, so programming moves
00:02:46.180 | from hardware to punch cards, which is still
00:02:48.780 | physical, to software.
00:02:50.980 | And then the level of abstraction just went up.
00:02:53.360 | So we got to COBOL, then we got to typed languages,
00:02:56.300 | we got to C++.
00:02:57.940 | In the early '90s, there was this explosion
00:02:59.740 | of these new language families.
00:03:01.280 | There was the Haskell family, and JavaScript and Java,
00:03:05.320 | the evolution of the C family, and then Python.
00:03:09.020 | And I think nowadays, if you kind of squint,
00:03:11.100 | all the languages sort of look the same.
00:03:12.600 | Like, when I write TypeScript, it kind of feels
00:03:14.220 | like writing Rust, and that kind of feels like writing Swift,
00:03:16.600 | and that kind of feels like writing Go.
00:03:18.740 | The abstractions have started to converge a bit.
00:03:20.320 | If we think about the UX of programming languages,
00:03:27.660 | this has also evolved.
00:03:29.900 | Back in the 1950s, you used something like a typewriter
00:03:33.200 | to punch holes in punch cards.
00:03:34.820 | And that was programming back in the day.
00:03:37.740 | And at some point, text editors appeared.
00:03:40.980 | And then Pascal and all these different IDs appeared
00:03:46.320 | that let you interact with your programs and your software
00:03:49.080 | in new ways.
00:03:49.680 | And each one kind of brought something.
00:03:51.720 | And I feel like programming languages have sort of leveled
00:03:54.400 | out, but the model is on an exponential.
00:03:56.100 | And the UX of programming is also on an exponential.
00:03:59.680 | And I'll talk a little bit more about that.
00:04:02.720 | Does anyone know what was the first text editor?
00:04:10.440 | I heard Ed from someone.
00:04:11.440 | I think he read the screen.
00:04:14.440 | Well, before text editors, this is what programming looked like.
00:04:16.680 | So this was the IBM 029.
00:04:18.140 | This was kind of a top of the line.
00:04:20.040 | This was like the MacBook of the time for programming punch cards.
00:04:22.800 | Everyone have this.
00:04:25.640 | You can still find it in museums somewhere.
00:04:28.900 | And yeah, this is Ed.
00:04:29.720 | This is the first text editor.
00:04:31.960 | This was Kem Thompson at the labs invented this.
00:04:35.440 | And it kind of looks familiar.
00:04:37.240 | If you open your MacBook, you can actually still type Ed.
00:04:39.360 | This is still distributed on Unix as part of Unix systems.
00:04:43.660 | And this is crazy, because this thing was invented 50 years ago.
00:04:47.120 | And this is nuts.
00:04:47.980 | Like, there's no cursor.
00:04:49.280 | There's no scroll back.
00:04:51.440 | There's no fancy commands.
00:04:52.540 | There's no type ahead.
00:04:53.240 | There's pretty much nothing.
00:04:54.500 | This is the simple text editor of the time.
00:04:56.280 | And it was built for teletype machines, which were literally
00:04:59.040 | physical machines that printed on paper.
00:05:01.700 | That's how your program was printed.
00:05:03.420 | And this is the first software manifestation of a UX for programming
00:05:06.260 | software.
00:05:07.400 | So it was really built for these machines that didn't support scroll back and cursors or anything
00:05:11.940 | like that.
00:05:15.620 | For all the Vim fans, I'm going to jump ahead of Vim.
00:05:17.840 | Vim was a big innovation.
00:05:18.900 | Emacs was a big innovation around the same time.
00:05:21.360 | I think in 1980, Smalltalk 80 was a big jump forward.
00:05:26.040 | This is one of the first, I think the first, graphical interface for programming software.
00:05:32.000 | And for anyone that's tried to set up, like, live reload with React or Redux or any of this
00:05:37.880 | stuff, this thing had live reload in 1980.
00:05:41.200 | And it worked.
00:05:43.540 | And we were still kind of struggling to get that to work with, like, React JS nowadays.
00:05:49.760 | This was a big jump forward.
00:05:50.760 | And obviously, like, the language, it had object-oriented programming and a bunch of new concepts.
00:05:54.560 | But on the UI side, there's a lot of new things, too.
00:05:58.480 | In '91, I think Visual Basic was the first code editor that introduced a graphical paradigm
00:06:04.600 | to the mainstream.
00:06:05.740 | So before, people were using text-based editors, Vim and things like that were still very popular,
00:06:09.300 | despite things like Smalltalk.
00:06:11.180 | So this kind of brought it mainstream.
00:06:12.420 | This is what I grew up with.
00:06:15.680 | Eclipse brought Type Ahead to the mainstream.
00:06:17.980 | This isn't using AI Type Ahead.
00:06:19.320 | This is not Cursor or Windsurf.
00:06:20.580 | This is just using static analysis.
00:06:22.540 | So it's indexing your symbols.
00:06:24.040 | And then it can rank the symbols and re-rank them.
00:06:25.860 | And it knows what symbols to show.
00:06:27.860 | I think this was also the first big third-party ecosystem for IDs.
00:06:30.720 | Copilot was a big jump forward with single-line Type Ahead and then multi-line Type Ahead.
00:06:41.160 | And I think Devon was probably the first IDE that introduced this next concept and this next abstraction
00:06:47.960 | to the world, which is to program you don't have to write code, you can write natural language, and that becomes code.
00:06:54.760 | And this is something people have been trying to figure out for decades.
00:06:57.260 | I think Devon is the first product that broke through and took this mainstream.
00:07:01.200 | And the UX has evolved quickly.
00:07:08.000 | But I think it's about to get even faster.
00:07:12.000 | We talked about UX and we talked about programming languages and verification is a part of this too.
00:07:18.000 | So verification has started with manual debugging and physically inspecting outputs.
00:07:22.000 | And now there's a lot of probabilistic verification like fuzzing and vulnerability testing.
00:07:28.800 | And like Netflix's chaos testing and things like that.
00:07:34.800 | And so with all this in mind, Cloud Code's approach is a little different.
00:07:38.800 | It's to start with the terminal and to give you as low-level access to the model as possible in a way that you can still be productive.
00:07:46.800 | So we want the model to be useful for you.
00:07:48.800 | We also want to get -- we want to be unopinionated and we want to get out of the way.
00:07:52.600 | So we don't give you a bunch of flashy UI.
00:07:54.600 | We don't try to put a bunch of scaffolding in the way.
00:07:57.600 | Some of this is we're a model company at Anthropic and, you know, we make models and we want people to experience those models.
00:08:03.600 | But I think another part is we actually just don't know.
00:08:06.600 | Like we don't know what the right UX is.
00:08:08.600 | So we're starting simple.
00:08:09.600 | And so Cloud Code, it's intentionally simple.
00:08:13.600 | It's intentionally general.
00:08:14.600 | It shows off the model in the ways that matter to us, which is they can use all your tools and they can fit into all your workflows.
00:08:21.400 | So you can figure out how to use the model in this world where the UX of using code and using models is changing so fast.
00:08:28.400 | And so this is my second point.
00:08:33.400 | The model just keeps getting better.
00:08:35.400 | And this is the better lesson.
00:08:37.200 | I have it -- I have this, like, framed and taped to the side of my wall.
00:08:40.200 | Because the more general model always wins.
00:08:45.200 | And the model increases in capability exponentially.
00:08:48.200 | And there are many corollaries to this.
00:08:50.200 | Everything around the model is also increasing exponentially.
00:08:53.200 | And the more general thing even around the model usually wins.
00:09:00.000 | So with Cloud Code, there's one product.
00:09:02.000 | And there's a lot of ways to use it.
00:09:04.000 | So there's a terminal product.
00:09:05.000 | And, you know, this is the thing everyone knows.
00:09:07.000 | So you can install Cloud Code and then you just run Cloud in any terminal.
00:09:11.000 | We're unopinionated, so it works in iterm2.
00:09:13.800 | It works in WSL.
00:09:15.800 | It works over SSH and TMUX sessions.
00:09:18.800 | It works in your VS Code terminal and your cursor terminal.
00:09:21.800 | This works anywhere, in any terminal.
00:09:23.800 | When you run Cloud Code in the IDE, we do a little bit more.
00:09:30.800 | So we kind of take over the IDE a little bit.
00:09:33.800 | And, you know, diffs, instead of being in line in the terminal,
00:09:36.800 | they're going to be big and beautiful and show up in the IDE itself.
00:09:39.800 | And we also ingest diagnostics.
00:09:43.600 | So we kind of try to take advantage of that.
00:09:46.400 | And you'll notice this isn't as polished as something like, again,
00:09:49.200 | like Hercer, Windsurf.
00:09:50.200 | These are awesome products.
00:09:51.400 | And I use these every day.
00:09:54.600 | This is to let you experience the model in a low-level, raw way.
00:09:58.600 | And this is sort of the minimal that we had to do to let you experience them.
00:10:01.600 | We announced a couple weeks ago that you can now use Cloud on GitHub.
00:10:09.600 | Can I get a show of hands who's tried this already?
00:10:12.400 | So for everyone that hasn't tried this, all you have to do is open up Cloud.
00:10:18.200 | You run this one slash command, install GitHub app.
00:10:20.200 | You pick the repo, and then you can run Cloud in any repo.
00:10:24.200 | This is running on your compute.
00:10:26.200 | Your data stays on your compute.
00:10:28.200 | It does not go to us.
00:10:29.200 | So it's kind of a nice experience.
00:10:32.800 | And it lets you use your existing stack.
00:10:34.400 | You don't have to change stuff around.
00:10:35.800 | It takes a few minutes to set up.
00:10:37.800 | And again, here we intentionally built something really simple because we don't know what the
00:10:42.200 | UX is yet.
00:10:42.800 | And this is the minimal possible thing that helps us learn, but also is useful for engineers
00:10:47.800 | to do day-to-day work.
00:10:48.800 | Like, I use this every day.
00:10:54.600 | The extreme version of this is our SDK.
00:10:56.600 | And this is something that you can use to build on Cloud code without -- if you don't want
00:11:02.600 | to use, like, you know, the terminal app or the IDE integration or GitHub, you can just
00:11:06.200 | roll your own integration.
00:11:07.100 | You can build it however you want.
00:11:08.600 | People have built all sorts of UIs, all sorts of awesome integrations.
00:11:12.400 | And all this is is you run Cloud-P, and you can use it programmatically.
00:11:18.500 | And so, like, something I use it for, for example, is, for instance, in Triage, I'll take my GitHub
00:11:22.500 | logs -- sorry, my GCP logs.
00:11:25.800 | I'll pipe it into Cloud-P because it's, like, it's a Unix utility.
00:11:28.500 | So you can pipe in, you can pipe out.
00:11:30.900 | And then I'll, like, JQ the result.
00:11:33.000 | So it's kind of cool.
00:11:34.000 | Like, this is a new way to use models.
00:11:35.700 | This is maybe 10% export.
00:11:37.500 | No one has really figured out how to use models as a Unix utility.
00:11:41.900 | This is another aspect of code as UX that we just don't know yet.
00:11:47.000 | And so, again, we just built the simplest possible thing so we can learn and so people can try
00:11:50.200 | it out and see what works for you.
00:11:55.500 | Okay, I wanted to give a few tips for how to use quad code.
00:11:58.200 | This is a talk about quad code, so this is kind of zooming back in.
00:12:03.100 | And this is actually true for, I think, a lot of coding agents, but this is kind of
00:12:06.400 | custom to the way that I personally use quad code.
00:12:08.800 | So the simplest way to use this -- it seems like most of this room is very familiar with
00:12:14.100 | quad code and similar coding agents.
00:12:15.900 | But the simplest way to introduce new people that have not used this kind of tool before is
00:12:20.900 | do code-based Q&A.
00:12:21.800 | And so at Anthropic, we teach quad code to every engineer on day one, and it's shortened onboarding
00:12:27.900 | times from, like, two or three weeks to, like, two days maybe.
00:12:29.900 | And also, I don't get bugged about questions anymore.
00:12:33.300 | You can just ask quad, and honestly, like, I'll just ask quad, too.
00:12:37.600 | And then this is something that I do pretty much every day on Monday.
00:12:42.300 | We have a standup every week.
00:12:43.300 | I'll just ask quad what did I ship that week.
00:12:44.300 | It'll look through my git commits, and it'll tell me so I don't have to keep track.
00:12:51.700 | The second thing is teach quad how to use your tools.
00:12:53.700 | And this is something that has not really existed before when you think about the UX of programming.
00:12:58.700 | With every IDE, there's sort of like a plug-in ecosystem.
00:13:01.700 | You know, for Emacs, there's this kind of Lispy dialect that you use to make plug-ins.
00:13:06.500 | If you use Eclipse or VS Code, you have to make plug-ins.
00:13:08.800 | For this new kind of coding tool, it can just use all your tools.
00:13:11.600 | So you give it bash tools, you give it MCP tools.
00:13:14.500 | Something I'll often say is here's the CLI tool, Claude, run dash dash help, take what you learn,
00:13:20.400 | and then put it in the Claude MD.
00:13:22.300 | And now Claude knows how to use the tool.
00:13:24.000 | That's all it takes.
00:13:26.200 | You don't have to build a bridge.
00:13:27.200 | You don't have to build an extension.
00:13:28.800 | There's nothing fancy like that.
00:13:30.700 | Of course, if you have groups of tools, or if you have fancier functionality like streaming
00:13:34.000 | and things like this, you can just use MCP as well.
00:13:40.300 | Traditional coding tools focused a lot on actually writing the code, and I think the new kinds
00:13:46.200 | of coding tools, they do a lot more than that.
00:13:48.200 | And I think this is a lot of where people that are new to these tools struggle to figure out
00:13:51.800 | how to use them.
00:13:53.500 | So there's a few workflows that I've discovered for using Cloud Code most effectively for myself.
00:13:57.800 | The first one is have Claude code explore and make a plan and run it by me before it writes code.
00:14:06.400 | You can also ask it to use thinking.
00:14:07.400 | So typically we see extended thinking work really well if Claude already has something in context.
00:14:12.400 | So have it use tools, have it pull things into context, and then think.
00:14:15.900 | If it's thinking up front, you're probably just kind of wasting tokens.
00:14:19.400 | And it's not going to be that useful.
00:14:21.000 | But if there's a lot of context, it does help a bunch.
00:14:25.000 | The second one is TDD.
00:14:27.000 | I know I try to use TDD, it's like, it's pretty hard to use in practice.
00:14:32.000 | But I think now with coding tools, it actually works really well.
00:14:35.000 | And maybe the reason is it's not me doing it, it's the model doing it.
00:14:38.000 | And so the workflow here is tell Claude to write some tests and kind of describe it.
00:14:43.600 | And just make it really clear, like the tests aren't going to pass yet.
00:14:45.600 | Don't try to run the test because it's going to try to run the test.
00:14:47.600 | Tell it, like, you know, it's not going to pass.
00:14:49.600 | Write the test first, commit, and then write the code, and then commit.
00:14:52.600 | And it's kind of a general case of if Claude has a target to iterate against, it can do much better.
00:14:58.600 | So if there's some way to verify the output, like a unit test, integration test, a way to screenshot in your iOS simulator,
00:15:05.200 | a way to screenshot in Puppeteer, just some way to see its output.
00:15:09.200 | We actually did this for robots.
00:15:11.200 | Like, we taught Claude how to use a 3D printer, and then it has a little camera to see the output.
00:15:15.200 | If it can see the output and you let it iterate, the result will be much better than if it couldn't iterate.
00:15:20.200 | The first shot will be all right, but the second or third shot will be pretty good.
00:15:24.200 | So give it some kind of target to iterate against.
00:15:27.200 | Today, we launched plan mode in Claude code, and this is a way to do the first kind of workflow more easily.
00:15:38.800 | So anytime, hit Shift-Tab, and Claude will switch to plan mode.
00:15:43.800 | So you can ask it to do something, but it won't actually do that yet.
00:15:46.200 | It'll just make a plan, and it'll wait for approval.
00:15:48.800 | So restart Claude to get the update, run Shift-Tab.
00:15:51.800 | Okay, and then the final tip is give Claude more context.
00:15:58.000 | There's a bunch of ways to do this.
00:15:59.000 | Claude-MD is the easiest way.
00:16:00.400 | So take this file called Claude-MD, put it in the root of your repo.
00:16:03.400 | You can also put in subfolders.
00:16:05.200 | Those will get pulled in on-demand.
00:16:06.400 | You can put in your home folder.
00:16:07.400 | This will get pulled in as well.
00:16:09.400 | And then you can also use Flash Commands.
00:16:13.200 | So if you put files, like just regular markdown files in these special folders,
00:16:17.200 | .claude/commands, it'll be available under the Flash Menu.
00:16:22.600 | So pretty cool.
00:16:23.600 | This is useful for reusable workflows.
00:16:25.600 | And then to add stuff to Claude-MD, you can always type the pound sign to ask Claude to memorize
00:16:33.800 | something, and it'll prompt you which memory this should be added to.
00:16:36.800 | And you can see this is us trying to figure out how to use memory, how to use this new concept
00:16:41.200 | that is new to coding models, did not exist in previous IDEs, how to make the UX of this
00:16:46.000 | work.
00:16:47.000 | And you can tell this is still pretty rough.
00:16:49.600 | This is our first version, but it's the first version that works.
00:16:52.000 | And so we're going to be iterating on this, and we really want to hear feedback about what
00:16:55.500 | works about this UX and what doesn't.
00:16:59.500 | Thanks.
00:17:00.500 | Thank you, Boris.
00:17:09.500 | Unfortunately, we only have one minute left, so someone sent a question on Slack.
00:17:14.700 | The question is, as I delegate more and more to Claude code, as it runs for 10 minutes and
00:17:19.900 | I have 10 of these active, how do I use the tool?
00:17:22.400 | You got 50 seconds.
00:17:23.400 | Yeah.
00:17:27.400 | It's pretty cool.
00:17:28.400 | I think this is something that we actually see in a lot of our power users, that they tend
00:17:31.400 | to multi-Claude.
00:17:32.400 | You don't just have a single Claude open, but you have a couple terminal tabs, either with
00:17:36.400 | a few checkouts of Claude or of your code base, or it's the same code base but with different
00:17:41.400 | work trees.
00:17:42.400 | And you have Claude doing stuff in parallel.
00:17:44.400 | This is also a lot easier with GitHub Actions, because you can just spawn a bunch of actions,
00:17:47.900 | and get Claude to do a bunch of stuff.
00:17:49.900 | Typically, we don't need to coordinate between these quads, I think, for most use cases.
00:17:53.900 | If you do want to coordinate, the best way is just ask them to write to a Markdown file,
00:17:57.900 | and that's it.
00:17:58.900 | Awesome.
00:17:59.900 | Yeah.
00:18:00.900 | Both things work.
00:18:01.900 | Thank you so much.
00:18:02.900 | And once again, give it up for Boris from Anthropic.
00:18:04.900 | Thank you so much.
00:18:05.400 | Thank you.
00:18:05.400 | Thank you.
00:18:05.400 | Thank you.
00:18:05.400 | Thank you.
00:18:05.400 | Thank you.
00:18:05.900 | Thank you.
00:18:06.400 | Thank you.
00:18:07.400 | Thank you.
00:18:07.400 | We'll see you next time.