Building headless automation with Claude Code

SIRBIT ASARIA: Good afternoon, everybody. My name is SIRBIT ASARIA. I am an engineer on the Claude Code team. And today, we're going to be talking a little bit about the Claude Code SDK and the Claude GitHub Action that was just announced today. Cool. So a little bit about the agenda.

We'll do a quick start for the SDK, just to give some examples of how to get started and how to use the SDK. We will then dive into a live demo of the GitHub Action, which should be fun. The GitHub Action was built on top of the SDK, so it's meant to be a source of inspiration for the kind of things that you can do using the Claude Code SDK.

We'll then dive into some more advanced features of the SDK. And then we'll end with having all of you set up the Claude GitHub Action on your repos, so you guys can start using it and build on top of it. Cool. Actually, before we get started, can I get a show of hands here?

How many people have used Claude Code? OK. That's a lot of people. And of the people who have used Claude Code, how many have used claude -p or know what that is? OK. Far fewer people. That's good to know. If you guys don't have Claude Code installed on your laptop, that's how you get it.

I'd encourage you to install it on your laptops and follow along. There will be parts of this talk where it will be beneficial to follow along. And if you don't want to, you don't have to. It's all good. Cool. So what is the Claude Code SDK? It is a way to programmatically access the power of the Claude Code agent in headless mode.

This is powerful because it's a new kind of primitive and a new kind of building block that allows you to build applications that just weren't possible before. You can start with super simple things. For example, you can use it like a Unix tool.

The Unix-ish tool philosophy is what really makes Claude Code powerful, because you can plug it in anywhere you can run bash or a terminal. So you can use it in your Unix pipelines. You can pipe stuff into it, pipe stuff out of it, make complex chains out of it and stuff like that.

You can then build CI automation on it. So you can have Claude review your code. Some people actually get Claude to write new linters for them too. So Claude can lint your code: if there are specific things that you can't define programmatically, you can get Claude Code to check them.

And then we get into fancier applications as well. So if you want to build your own chatbot that's powered by Claude Code, that's certainly possible. If you want Claude Code to write you code in a new environment or a separate remote environment, you can build those kinds of applications as well.

And finally, these are a few features. We'll talk more about the features in the coming slides. And we have Python and TypeScript bindings for the Claude Code SDK coming up soon. So that should make it much easier for you guys to consume it and build on top of it.

So let's jump into some basic examples. Calling the Claude Code SDK is as simple as running claude -p and following it up with the string that you want to ask Claude. So in this example, I'm telling Claude to write me a Fibonacci sequence generator. And if you notice, I also give it --allowedTools Write, which is a way for me to proactively give it access to the Write tool so it can write files to my file system.
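The same call can be assembled from a script. This is a minimal sketch in Python, assuming the claude CLI is on the PATH and that --allowedTools accepts a comma-separated list of tool names. The helper names build_command and run_headless are made up for illustration and are not part of any SDK.

```python
import shlex
import subprocess


def build_command(prompt, allowed_tools=()):
    """Return the argv list for a one-shot headless call."""
    cmd = ["claude", "-p", prompt]
    if allowed_tools:
        # Assumption: tool names can be passed as one comma-separated value.
        cmd += ["--allowedTools", ",".join(allowed_tools)]
    return cmd


def run_headless(prompt, allowed_tools=()):
    """Run the call and return the reply text (needs the claude CLI installed)."""
    done = subprocess.run(
        build_command(prompt, allowed_tools),
        capture_output=True, text=True, check=True,
    )
    return done.stdout


# Print the equivalent shell command without running it.
print(shlex.join(build_command("Write me a Fibonacci sequence generator", ["Write"])))
```

Keeping the command builder separate from the function that runs it makes the permissions being granted easy to inspect before anything executes.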

And then this is something I like doing too, piping logs to Claude. So you can do cat app.log and then pipe that into claude -p. I don't like looking at logs manually. So this is something I do quite often. And as you can see, it does a pretty decent job of summarizing what the log failures were.
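The pipe can be reproduced from Python by sending the text on standard input. This is a rough sketch, assuming claude -p reads piped input the way the shell example does. The function name pipe_to_claude and its runner parameter are hypothetical; runner exists only so a stub can stand in for the real CLI.

```python
import subprocess


def pipe_to_claude(text, question, runner=subprocess.run):
    """Send `text` on stdin, like `cat app.log | claude -p "<question>"`."""
    done = runner(
        ["claude", "-p", question],
        input=text, capture_output=True, text=True, check=True,
    )
    return done.stdout.strip()
```

With the CLI installed, a call such as pipe_to_claude(open("app.log").read(), "Summarize the failures in this log") would behave like the shell pipeline.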

Similarly, if you're anything like me, I just can't get myself to understand the output of ifconfig. I still don't know what it means, but Claude does. And Claude reads it for me over here. And finally, this is kind of what makes the SDK tick. We have an output format.

If you do --output-format json, Claude Code will actually output its response in JSON as opposed to plain text. And you can parse this JSON and build on top of it. So we'll talk more about the details of what else you can do with this JSON, but I wanted to throw that example out there.
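Parsing that JSON might look like the sketch below. The field names result, is_error, and session_id are assumptions about the reply shape, and SAMPLE_REPLY is an invented payload rather than captured output, so check a real reply before relying on them.

```python
import json

# Hypothetical sample of what a JSON-mode reply might look like.
SAMPLE_REPLY = json.dumps({
    "type": "result",
    "is_error": False,
    "result": "The log shows three failed database connections.",
    "session_id": "example-session-id",
})


def parse_reply(raw):
    """Return (answer_text, session_id) from a JSON-mode reply."""
    payload = json.loads(raw)
    if payload.get("is_error"):
        raise RuntimeError(payload.get("result", "Claude Code reported an error"))
    return payload.get("result", ""), payload.get("session_id")


answer, session_id = parse_reply(SAMPLE_REPLY)
print(answer)
```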

Let's get into a significantly more complex example now, which is the Claude GitHub Action. So the Claude GitHub Action was built on top of the SDK. And it can be used to review code. It can be used to create new features. It can be used to triage bugs and so on.

And this is also open source. So I'll include a link at the very end of the talk, so you guys can go have a look at the source for inspiration for how to use it. But for now, let's jump into a live demo on my laptop. So I have cloned a popular small open source quiz app for the purposes of this demo.

And we are going to fire it up just to see how that works. And then we will tell Claude to build something on top of it for us. So I just did an npm start, which opened up my shiny new quiz app. It's actually pretty nifty. It allows you to choose a bunch of categories, how many questions you want, difficulty, definitely easy for me.

I suck at trivia type of questions. And then there's a countdown timer. So we're not going to actually answer these, unless someone feels very strongly. Please shout out the answer. But I'm just going to just fly through these just to show you guys how this little quiz app works.

There we go. Not surprising. I got a grade F, but that's OK. So this was the little demo quiz app that's open sourced. And if you look at the issues for this repo, we see a couple of very interesting ones. There's one issue that says, we should add power-ups for 50/50 elimination of options and skip questions for free.

Because I suck at trivia, I really like that feature, and I want to build it. And before this presentation, I already installed the Claude GitHub Action in my repo. So it's already available. But we'll go over how to set that up later, too. OK, so here's the issue. It has pretty sparse details on how to implement this.

It's just literally a wish list, really, like a wish feature. It's saying, add a power-up option in the config, 50/50 elimination. For the skip question, it should award the user points, even though the question was skipped. And users should be able to configure this from the config page. So there's a lot of creative room for Claude to do whatever it wants to do in this case.

And I'm excited to see what it actually ends up building. So what I'm going to do is say, @claude, please implement this feature, and comment on it. So it usually does take four or five seconds for it to respond. And while it's doing that, for good measure, we'll just also take this other GitHub issue.

This is talking about a per-question timer. So we saw there was a global timer on the quiz app, but there was no per-question timer. So that's what this one's talking about. So let's go and say, Claude, please build this. And now we have two things building. Cool. So now when I get back to this tab, you see that Claude responded with a comment on this GitHub issue saying that it's working.

It also has a link to the job run, which is the GitHub action run. If I click into it, and if I actually click on the logs, I'll see that it's doing a bunch of stuff. You can see all this JSON being output. This is from the SDK. So we won't look at the JSON too much, because it's not much fun to parse it manually.

But over here, we can see that it also created a to-do list for us. So Claude is now going to actually go through this to-do list and try to implement the power-up feature. And for the question timer, it's going to do something similar. One more thing that we should do here: there are already a couple of pull requests that have been opened for this repo.

And let's get Claude to review or change some of these pull requests just for fun. There's this one, which is change background color to blue. All right, I actually think I like green better. So I'm just going to be like, all right, Claude. Please change this to green.

This one is fairly easy. And I'm pretty sure Claude's going to do this. But I just wanted to show you guys that it can also add commits to a pull request that's already open. OK, so this is going to take a few minutes to run. And while this runs, let's go back to the presentation.

And then we'll check up on how this is doing towards the end. OK, cool. So let's do a little bit of a deep dive on the features of the SDK. When you call claude -p, by default, it has no edit or destructive permissions, which is great for safety. But it's not great for actually getting things done, which is why there is an --allowedTools option, which allows you to pre-configure Claude with any permissions that you think it might need in the future for your given task.

So in this case, in the first example, you see that I've given it Bash permissions for npm run build and npm test, plus the Write tool, which is a good set of permissions, because this allows Claude to self-verify what it's writing: it can build your project, test it, and then continue writing.
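That permission set can be built as a single allowlist value. This is a small sketch, assuming the CLI's Bash(<command>) rule syntax for scoping the Bash tool to one command and a comma-separated value for --allowedTools. The helpers bash_rule and allowlist are illustrative names only.

```python
def bash_rule(command):
    """Scope the Bash tool to one specific command."""
    return f"Bash({command})"


def allowlist(bash_commands=(), tools=()):
    """Build the value passed to --allowedTools."""
    rules = [bash_rule(c) for c in bash_commands] + list(tools)
    return ",".join(rules)


value = allowlist(["npm run build", "npm test"], ["Write"])
print(value)  # Bash(npm run build),Bash(npm test),Write
```

Listing exact commands rather than granting Bash wholesale keeps the safe default in place for everything the task does not need.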

Similarly, for MCP, if you have MCP servers configured, you can allow list those MCP tools as well. So it's a very similar process. Then structured output-- we already saw an example of structured output, both from the GitHub Actions logs and also the little screenshot I showed you earlier. But there's two modes here.

There's stream-json and json. They do exactly what they sound like. If you select stream-json, it'll actually stream messages to you as and when they're available, whereas json will just give you one giant blob of JSON at the end. And parsing this JSON and building on top of it is really how you can make use of the Claude Code SDK and create features for your users.
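Handling the streaming mode usually means reading one JSON object per line. The sketch below assumes that framing and assumes each message carries a type field. SAMPLE_STREAM is invented for illustration, so the exact message shapes should be checked against real output.

```python
import json
from collections import Counter

# Hypothetical stream: one JSON object per line, in arrival order.
SAMPLE_STREAM = [
    '{"type": "system", "subtype": "init", "session_id": "example-session-id"}',
    '{"type": "assistant", "message": {"content": [{"type": "text", "text": "Working on it."}]}}',
    "",
    '{"type": "result", "result": "Done.", "session_id": "example-session-id"}',
]


def iter_messages(lines):
    """Yield one parsed message per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def final_result(lines):
    """Consume the stream; return (final result text, message counts by type)."""
    counts = Counter()
    result = None
    for message in iter_messages(lines):
        counts[message.get("type", "unknown")] += 1
        if message.get("type") == "result":
            result = message.get("result")
    return result, dict(counts)


print(final_result(SAMPLE_STREAM))
```

In a real run, lines could be the standard output of a subprocess.Popen started with --output-format stream-json, so each message is handled as it arrives instead of after the process exits.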

And then you can also configure the system prompt. So you can do --system-prompt "talk like a pirate", and you can get Claude Code to talk like a pirate for the rest of your day, which is actually quite fun. If you haven't done it, I'd encourage you to try it out.

So we also have a few user interaction features built into the SDK. The first one is resuming session state. So when you call claude -p in structured output or JSON mode, it's going to return a session ID. And this session ID is useful because you can then reference the session ID to go back to the same context state that Claude had when it finished that process.

So by preserving these session IDs and keeping track of them, you can build interactive features, where the user says something, you pass that on to Claude, Claude returns a response, and now you want the user to give feedback on that response. And that's how this enables you to build those types of interactions in your apps.
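A back-and-forth like that can be sketched by storing the session ID between turns. The talk does not name the flag for resuming; the sketch assumes the CLI's --resume option and a session_id field in the JSON reply. The Conversation class is a hypothetical helper, not an SDK type.

```python
import json


class Conversation:
    """Track one session ID so follow-up turns share context."""

    def __init__(self):
        self.session_id = None

    def next_command(self, user_text):
        """Build the argv for the next turn, resuming when a session exists."""
        cmd = ["claude", "-p", user_text, "--output-format", "json"]
        if self.session_id:
            # Assumption: --resume takes the session ID from an earlier reply.
            cmd += ["--resume", self.session_id]
        return cmd

    def record_reply(self, raw_json):
        """Store the session ID from a reply and return the answer text."""
        payload = json.loads(raw_json)
        self.session_id = payload.get("session_id", self.session_id)
        return payload.get("result", "")
```

Storing whatever ID the latest reply carries means the helper keeps working whether a resumed call reports the same ID or a new one.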

And then the last one, and this one's actually pretty interesting, and it's fairly recent, too. It's --permission-prompt-tool. We talked a little bit about how to give Claude permissions using the --allowedTools flag. And that requires you to configure them in advance. But what if you didn't want to do that because you don't know what tools Claude would want to use in the future?

In that case, you can use --permission-prompt-tool and offload the permission management to an MCP server. So you can ask users in real time whether they want to accept a tool or reject a tool. And you can have an MCP server kind of handle that for you, as opposed to trying to predict which tools are OK and which tools are not.
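The decision logic inside such a permission tool could look roughly like this. The reply shape used here, a JSON string with behavior set to allow or deny plus updatedInput or message, is an assumption about what the CLI expects back and should be verified against the docs. The MCP server wiring is left out entirely, and both function names are hypothetical.

```python
import json


def permission_decision(tool_name, tool_input, approved, reason="Denied by user"):
    """Return the JSON string a permission tool would hand back."""
    if approved:
        # Assumed shape: allow the call and pass the input through unchanged.
        decision = {"behavior": "allow", "updatedInput": tool_input}
    else:
        decision = {"behavior": "deny", "message": f"{reason}: {tool_name}"}
    return json.dumps(decision)


def ask_in_terminal(tool_name, tool_input):
    """Prompt a person in real time; a real app would use its own UI."""
    answer = input(f"Allow {tool_name} with {tool_input}? [y/N] ")
    return permission_decision(tool_name, tool_input, answer.strip().lower() == "y")
```

Keeping the decision separate from the prompt means the same function can sit behind a terminal prompt, a chat button, or a policy check.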

So this is fairly recent, and we'd love to get feedback on this if you guys end up trying it out. OK, let's go back to our demo and see what Claude's done. All right. So this is the power-up issue. We can see that Claude has actually gone through its to-do list.

OK, I'm going to open a -- there's a link over here to create a PR. And I'm going to click that and see what that gives us. I'll actually create the pull request, too, so it's easier for us to review. I don't really know how this code base works, but we'll still eyeball it just to see if, you know, it's doing the right thing.

So you see some set power-up stuff. Seems all right. OK, there's, like, some configuration in the main component. All right. I think what we should do and what will make this fun is that we should just get this branch locally and see what Claude did. Because there's no way that we can actually figure out what it did in the short amount of time that we have.

So I'm going to go back to my terminal, do a git fetch, check out the branch that Claude just created, and restart our process. OK. Awesome. It looks like we have a power-up section now at the bottom of our config page. And it's a little checkbox. I like that touch.

We'll keep both of them on. And let's select general knowledge and start playing this game. Let's see what it did. Oh, sweet. So you see it has, like, this little 50/50 button on the bottom left and a skip questions button on the right. I'm just going with 50/50 because I have no idea what the answer to this is.

Does anybody know what that is? Cadbury D. D? OK, there we go. That makes sense. Cadbury, yeah. I'm going to skip this one. And then let's just breeze through the other ones for the sake of time. Cadbury D. All right. I still got an F, but we got one correct answer, which is better than zero correct answers.

And, yeah, I guess, yeah, it tricked us. That was a good one. But, yeah, I mean, it seems like it worked. I think there's definitely more we could do here. We could, like, show which questions we used the power-up on over here. And there's, like, definitely more we can do.

But at the most basic level, I think Claude was able to do the task that we assigned it to do, which is exciting. Like, this is kind of the power of the GitHub Action because you didn't really have to run this on your own infra. You can just literally comment on a thread saying, please build this for me.

It uses your GitHub Actions runners and just, like, does the thing. Let's also look at the PR where we told it to change from blue to green. It's all hex codes, so let's just see what it did in the commits. So we see there's two commits, and Claude has added this last one to switch it from blue to green.

And it did it for all three of the places where we -- where the color was defined, which is awesome. Okay, I'm not going to go over the last one, the question timer, because we might run out of time. But this hopefully gives you insight into what the Claude GitHub Action can do for you.

Let's go back to the presentation now. Okay, so just as a recap, the Claude GitHub Action, as it's implemented today, is able to read your code. It's able to create PRs for you from GitHub issues, like we just saw. It's able to create commits for you. So if you already have a PR and you commit or you comment on it, it can add a commit to an existing branch or an existing PR.

It can answer questions. It doesn't have to do something. It can just literally answer questions for you. If you don't understand something, you can be like, "Hey, Claude, how does this work?" And you can get it to answer questions. And it can, of course, review your code. The best part of all of this is that you don't have to take care of the infra.

It runs on existing GitHub runners, which almost everyone has configured if you're using GitHub Actions. So that's kind of the really nice thing about this is you don't have to worry about any of the infra. Okay, so how were the actions built, right? I think I may have mentioned that these actions were built on top of the SDK.

So the SDK does form the foundation of how these actions were built. And then we have two other actions on top. We have the Claude Code base action. This is a thin layer that just implements the piece which talks to Claude Code and returns the response from Claude Code.

And then we have another action on top of this, which is called the PR action. And this action is responsible for all the fancy things that you saw on the PR. So it's responsible for making comments, for the to-do list, for rendering it the right way, for adding the PR links, and things like that.

So it's kind of built in three layers. Both the base action and the PR action are open sourced. So I would encourage you guys to go have a look, take inspiration from how that works, and maybe that inspires more ideas. And then, finally, you guys can install the Claude GitHub Action today.

The easiest way to do this is to open up Claude Code in a terminal in the repo that you want to install it in. And once you open up Claude Code, just run /install-github-app. And that is going to present you with a nice flow which guides you through configuring your GitHub Action as well as merging it.

So the end result of this would be a PR, which adds a YAML file for your GitHub Action. And once you merge that in and you configure your API keys and things like that, you're off to the races. And you can go ahead and start tagging Claude and using Claude like we just did right now.

So, small caveat, if you're a Bedrock or a Vertex user, the instructions are a little bit different and a tiny bit more manual. So please have a look at the docs. The docs are pretty comprehensive in helping you set up the GitHub action for both Bedrock and Vertex. Cool.

Finally, resources. These are resources for things that we've talked about today. If you want to snap a picture, go ahead. The open source repos for both the base action and the Claude Code action are here. And we absolutely love your feedback as well. So if you guys have any feedback on the SDK, on the GitHub Action, or on Claude Code, please go to our public Claude Code GitHub repo and file an issue there.

And someone will have a look and get back to you. Cool. That's all I have for today. Thanks for joining me. And I hope you guys have a good rest of the day. Thank you.


Building headless automation with Claude Code | Code w/ Claude

Transcript