Ship Agents that Ship: A Hands-On Workshop - Kyle Penfound, Jeremy Adams, Dagger

00:00:00.000 |
Okay, we're going to kick this off. We're trying to sort out some internet options here, 00:00:19.640 |
but in the meantime, I'll give us our little intro. So first of all, I'm Kyle, this is Jeremy, 00:00:26.840 |
we're from Dagger, and you'll see more about what Dagger is through this workshop where 00:00:31.780 |
we're going to build a cool, sweet agent, and we're actually going to deploy it to GitHub, 00:00:38.060 |
and so even in, like, the worst case scenario, if we can't get things running locally, when we actually 00:00:42.580 |
push things to GitHub and see agents running in GitHub, then that's going to be out of our 00:00:47.380 |
internet's hands, and it's all going to be really cool. So first of all, on the left side here, 00:00:53.680 |
we have kind of where we're getting started from. So this is the documentation site where 00:01:01.700 |
we have install instructions, so I'll walk through those real quick. And then our quick 00:01:08.020 |
starts where we're actually going to walk through these as like the content of this workshop. 00:01:13.460 |
And then also a shout out to tomorrow night, we have a hack night at the Cloudflare office. It's on the external events list for this conference as well. But here's a QR code for it. 00:01:31.400 |
So real quick. So there's a question about whether there's a Slack, I think? Slack for the workshop? Yes, absolutely. So if you go to the Slack, there's a channel, it says dagger-workshop... ship agents that ship. Got it? Okay. 00:01:50.420 |
Let me pull that up as well. So if there's questions, put them in there or raise your hand and Jeremy will get to you. Climb over people. 00:01:58.440 |
I will. I will. Yeah. Make it happen. Yes. I'll do my best. Awesome. So yeah, if you're following along, awesome. If you can't, like, because you don't have desk room or can't get the internet or whatever, I'm going to walk through it live because I already have everything on my machine. 00:02:17.440 |
And then you can always, you know, check back with this later on once you have a solid connection. So if you're not able to get out your computer or follow along, just watch me and I'll go through it and it's gonna be really neat. 00:02:30.440 |
But if you are following along, here's the installation page on docs.dagger.io. So you can install from the Homebrew tap or straight from our install scripts or with winget. 00:02:46.440 |
You can install the Dagger CLI. The only other dependency is that you need a container runtime such as Docker or Podman or nerdctl. So, like, anything that can run containers, because Dagger itself runs its engine as a container. 00:03:02.440 |
And I'll explain what that means in a second. But if you're following along, get started on this while I talk through a bunch of stuff about what we're actually doing and what all these technologies are trying to accomplish. 00:03:13.440 |
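For reference, the install options mentioned above look roughly like this. These are the commands from the docs at the time of writing, so double-check docs.dagger.io for the current package names:

```shell
# macOS/Linux via the Homebrew tap
brew install dagger/tap/dagger

# Or the install script (inspect it first if you prefer)
curl -fsSL https://dl.dagger.io/dagger/install.sh | sh

# Windows
winget install Dagger.Cli

# Verify the CLI; Dagger also needs a container runtime
# (Docker, Podman, nerdctl, ...) to start its engine
dagger version
```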
So I think I'll pause you really quick. Yeah, just for the folks on the tech team. And so for some of you in the room, we're finding the Wi-Fi may or may not work for you. 00:03:24.440 |
Use a hotspot if you've got one; if that works down in the basement, then you're amazing. Also, I'm trying to do a little something through the wired connection here. 00:03:32.440 |
But for the tech team, it's requiring a password for me to use this service. So anyway, if you have that, slip me a note at some point. 00:03:42.440 |
But otherwise, yeah, we're working on getting more connectivity as we speak. 00:03:48.440 |
Awesome. Yep, there we go. So the QR is actually for the hack night tomorrow night; the docs and what we're going through are at docs.dagger.io. 00:04:00.440 |
So that's like the main content for what we're going to walk through. 00:04:04.440 |
And I guess real quick, we can intro as well if you want to intro yourself first, Jeremy. 00:04:10.440 |
Yeah, sure. I'm Jeremy Adams. I look after the ecosystem. I'm part of the ecosystem team that Kyle and I are both on and been at Dagger for a few years. 00:04:23.440 |
And so I've got to see already a progression of folks using us for all sorts of things and most recently a lot around AI agent kind of workflows. 00:04:35.440 |
But I love in this workshop, we're going to blend together some of the classic use cases we've seen with Dagger around CI and dev workflows as well as, you know, giving those to agents. 00:04:47.440 |
Awesome. Yeah, I'm Kyle. I'm on the same team. Yeah. 00:04:51.440 |
And I have a background in like DevOps and platform engineering, so much more on the kind of cloud infra side of things versus that building side. 00:05:02.440 |
So it's cool to come at this from that perspective of, you know, trying to deploy agents somewhere and make things work. 00:05:09.440 |
And that's why in this workshop, we're going to deploy things to GitHub because that's eventually what you're going to want to do when you build an agent. 00:05:17.440 |
And you have to put it somewhere to run it. It can't just live on your machine all the time, I guess, depending on what the agent is. 00:05:23.440 |
Anyway, so if you made it this far, you've made it to the docs, then we're going to talk a bit about what Dagger is and why we're building agents with it. 00:05:33.440 |
And so basically Dagger is, like I said, it's a container runtime. 00:05:39.440 |
It's a workflow engine. And so people have historically done things like build their CI/CD with Dagger because you're building these pipelines that orchestrate containers, run all these tasks, and it runs the same on your machine as it runs in any cloud, like in, you know, in your Kubernetes, in GitHub, wherever your CI might run. 00:06:00.440 |
But it runs the same everywhere. So you're making these workflows. And the cool thing is, that's also what agents are, right, is that they're just these processes where we have a bunch of tools we want to give to an agent. 00:06:16.440 |
Anyway, okay, we're gonna see it in action. And so, let's see, we have components, right? So Dagger itself is made up of core components like containers, like I said, also repos, directories, files, and now LLMs are also a component that you have to work with, within this kind of toolbox of Dagger and how we're building things. And so it's just another building block, right? It's not a special framework, like, just built for 00:06:44.440 |
making an agent that you then have living next to your software. It's another component within your toolbox. And so you're bringing LLMs into these existing workflows. And that's why it's a little bit different. 00:06:56.440 |
Yeah, you could think another way of thinking about Dagger in a nutshell is Dagger is for software engineering workflows and environments. So you're going to see us building some environments, essentially some containerized environments with some functions, and all these things can become tools that, you know, human software engineers, AI agents, use for both development side, as well as app delivery side of things. 00:07:14.440 |
So you'll see, of course, these areas are all blending and kind of squishing together right now. We're seeing all this stuff happen in real time. So Dagger is kind of going to be one tool that you could use for that whole range. 00:07:36.440 |
Yeah, so the question was, if Dagger is a tool to build containerized environments, what's the distinction between Dagger and Docker? So yeah, Docker has been around for a long time. And in fact, the founders of Docker are the founders of Dagger. And so we can think about it: the original scope of Docker was really about containerizing an application. 00:07:54.440 |
And making that thing portable. So it can run on my laptop or Kyle's or up in Kubernetes or anywhere. So now what we're doing is we're taking a whole workflow and making that a portable thing. 00:08:15.440 |
So yeah, there's definitely multiple containers and other types of objects, but everything's sandboxed by default. And we'll see as we get into it. Great question. 00:08:27.440 |
Yeah, so we're writing code that is the workflow itself. So that code can be Go, Python, TypeScript, Java, PHP; we have all these different languages you can write with. And the cool thing is that you're not kind of choosing your language for Dagger and then that's the world you live in. 00:08:44.440 |
Dagger has this cross-language interop. So if I write a cool Dagger module with... oh, it's gonna want to load. That's why I have it. Oh, yeah, amazing. Yeah. Error occurred. Wow. Okay. 00:08:59.440 |
I don't know. Okay. So I broke it. If I write a cool module with Dagger that, say, does like a TypeScript build or something, right? I can share this on the Daggerverse. 00:09:13.440 |
And maybe I wrote that module in TypeScript and you're writing your modules in Python. You can just install my module and you have these native bindings in your language to work with Dagger modules cross-language. So anyway, that's my point: when you pick a language, you still get to benefit from the whole Dagger 00:09:33.440 |
ecosystem. And we don't have images on these. 00:09:38.900 |
There's some sweet animations there of code happening. 00:09:42.720 |
Yeah, like the coolest animation you could think of. 00:09:47.160 |
So I think we could probably skip forward here. 00:09:49.820 |
And so hopefully we've installed or we're downloading-- 00:09:56.800 |
So I'll run through real quick the basics of Dagger. 00:10:10.880 |
We've got these things like container runtime somewhere. 00:10:15.520 |
And so the first thing we can do is create containers. 00:10:17.580 |
So if I'm in Dagger shell, which I think I am over here-- 00:10:29.260 |
So I'm in Dagger shell and I can say container, I think. 00:10:50.880 |
So there's a few different ways of using Dagger on the command 00:10:55.720 |
line, including kind of a non-interactive mode: just fire off a 00:11:00.100 |
Dagger command to run one of these workflows, a function that's 00:11:04.280 |
defined. Or you can use it in this kind of interactive shell mode. 00:11:08.500 |
Yeah, and it's all about building building blocks, right? 00:11:10.440 |
So with the basics of Dagger, you have, like I mentioned earlier, 00:11:18.160 |
to be building larger blocks out of those blocks 00:11:21.160 |
to assemble like an actual part of a workflow. 00:11:24.100 |
And then I'll take those blocks, build bigger workflows out of those. 00:11:29.200 |
going to be interacting with some level of a workflow here. 00:11:32.160 |
But like with container, I can say from Alpine, 00:11:38.380 |
And I can get-- we can do things with that, like anything 00:11:44.080 |
you might want to do with a container, right? 00:11:46.280 |
So I could literally say, give me a terminal. 00:11:51.880 |
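In Dagger Shell, the chain just described looks something like this. It's typed at the `dagger` prompt, and `terminal` drops you into an interactive shell inside the container (syntax per the Dagger Shell docs; this is a sketch, not the exact demo):

```shell
# Start the interactive shell
dagger

# Then, at the Dagger Shell prompt, chain operations with pipes:
container | from alpine | terminal

# Or run a command in the container and print its output:
container | from alpine | with-exec cat /etc/os-release | stdout
```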
And this is the exact kind of tools that we're actually giving 00:11:54.180 |
to our agent as we're building these pipelines, right? 00:11:59.520 |
but you can build a specialized workspace for your agent 00:12:07.020 |
got all these primitives that we can give to agents 00:12:11.620 |
to build some really effective software engineering agents 00:12:14.760 |
by giving them the exact tools they need to complete the job. 00:12:23.760 |
can create these workflows for running your tests 00:12:27.900 |
And the cool thing is that if you've done that, 00:12:29.920 |
you can take that same code that you wrote for running your tests 00:12:33.640 |
So now your agent isn't just guessing at some code; it can 00:12:37.560 |
run your actual tests the same way that your developers 00:12:40.120 |
and your CI do, to make sure that the code that it's generating 00:12:43.480 |
is valid code and that it can iterate on these things. 00:12:48.100 |
So this is all we're going to build right now. 00:13:01.920 |
Because of people like a product manager who's discovered 00:13:05.420 |
vibe coding or a team that's using AI-powered IDEs or whatever, 00:13:12.940 |
that people are cranking out these massive PRs for him to review, 00:13:22.760 |
So he was like, oh, I just got this PR and I have to review it. 00:13:26.360 |
And then I come back and now there's five more commits on it. 00:13:33.440 |
And so part of the reason why CI and AI bringing that together 00:13:48.140 |
But how do we make sure this is actually code that we can test 00:13:56.340 |
that there's software delivery workflows that are there 00:14:08.240 |
So yeah, let's actually get into writing something. 00:14:10.400 |
So I zoomed up a bit so you can see where I landed it 00:14:22.540 |
to make an agent inside of that that can build new features 00:14:28.260 |
where we kind of set up this example project with functions 00:14:37.560 |
And we've already talked through installing Dagger. 00:14:43.380 |
So now we have this example application called Hello Dagger 00:14:47.060 |
And if you go to GitHub and say, use this template, 00:14:50.760 |
you can name it whatever you want, like Hello Dagger Workshop 00:14:55.860 |
You can create a repo in your GitHub from this template. 00:14:59.940 |
And the important reason for that versus cloning it 00:15:02.740 |
is that that's going to make it way easier when we actually push 00:15:05.020 |
things to GitHub in a little bit to make it easier for you 00:15:08.520 |
to run the GitHub actions that actually run the agent. 00:15:14.680 |
And I've done that over here in this repo where 00:15:25.380 |
you can use whatever language you want to use. 00:15:28.100 |
I'll be walking through Python today because I think that's 00:15:35.580 |
But if you're not, I can switch between languages. 00:15:45.760 |
So I've got this application in my GitHub now. 00:15:53.960 |
And it's like this Vue app that has a bunch of things in it. 00:16:01.460 |
able to make an agent that develops it, right? 00:16:04.900 |
Optionally, you can configure Dagger Cloud, which 00:16:06.760 |
is-- let me just start loading that web page now. 00:16:14.360 |
So you can really easily see what your agent's doing, right? 00:16:17.000 |
Because that's the hardest part of building agents a lot of the time, 00:16:29.020 |
So with this visualization, you're able to really easily see everything 00:16:35.480 |
And that's helped me a lot, like, develop my prompts. 00:16:37.340 |
Like, if I see-- the prompts and environments, right? 00:16:40.280 |
If I see a lot of times that, OK, the agent fails because it tries 00:16:43.980 |
to call this tool incorrectly, I can improve, like, 00:16:48.280 |
Or maybe I need to change how the tool works completely. 00:16:51.520 |
And so being able to see how the agent's behaving is a huge part of that. 00:16:55.620 |
Whether you're using cloud or any other thing to visualize your agents, 00:16:59.480 |
that's, like, the most important part of making it reliable. 00:17:08.340 |
So if you've installed Dagger, you'll run this command, dagger init, 00:17:17.480 |
And then the name of the module is going to be HelloDagger. 00:17:20.400 |
And that's important because that is basically the name of our object 00:17:26.360 |
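If you're following along in Python, the init step is something like the following. The `--name` becomes the module's object name (HelloDagger); the `--source` flag putting generated code under `.dagger` is an assumption based on the layout shown later:

```shell
# Generate a new Dagger module with the Python SDK
dagger init --sdk=python --name=hello-dagger --source=.dagger

# List the functions the generated module exposes
dagger functions
```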
So if I open this up, and I've run dagger init, 00:17:33.760 |
I don't remember how to make that bigger in Zed, but we can-- 00:17:55.760 |
Well, I guess you don't have yours set, but it's a font size. 00:18:10.140 |
So now hopefully we can see the sidebar a little bit better. 00:18:18.600 |
It's just-- oh, because that's the wrong project. 00:18:23.960 |
Let me open that up and close all these things. 00:18:31.380 |
And in my .dagger, I've got this source: hello_dagger, main.py. 00:18:35.560 |
And so this would have been generated when we said dagger init. 00:18:39.440 |
It'll have basically these files, but some different content. 00:18:43.340 |
So it's going to have the basic generated things to get you started building modules. 00:18:49.280 |
But we're going to say, let's see, dagger functions. 00:18:54.580 |
It'll show us what's available in this Dagger module that just got created. 00:18:57.660 |
And so this is basically how you interact with Dagger. 00:19:01.000 |
And you have this code that is just functions for how to interact with your application. 00:19:11.600 |
If we go down to this function, and you see we're just building building blocks. 00:19:15.460 |
We have a function that gives us a Dagger container that is from this base. 00:19:25.120 |
And so in that container, when we want to do a build of our app, 00:19:28.720 |
we can call that other function to get that container with our code in it, 00:19:34.780 |
run another command, and then get a directory from that. 00:19:37.960 |
And so this is like really basic Dagger stuff of how you create your dev tools using Dagger. 00:19:45.580 |
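As a sketch of what those two functions look like in the Python SDK: this mirrors the quickstart's shape, but the base image, commands, and output directory are illustrative details from the docs, and running it requires a Dagger engine:

```python
import dagger
from dagger import dag, function, object_type


@object_type
class HelloDagger:
    @function
    def build_env(self, source: dagger.Directory) -> dagger.Container:
        """A container with the app source and its dependencies installed."""
        return (
            dag.container()
            .from_("node:21-slim")           # base image for the Vue app
            .with_directory("/src", source)  # mount our source code
            .with_workdir("/src")
            .with_exec(["npm", "install"])
        )

    @function
    def build(self, source: dagger.Directory) -> dagger.Directory:
        """Build the app and return just the built artifacts."""
        return (
            self.build_env(source)            # reuse the smaller building block
            .with_exec(["npm", "run", "build"])
            .directory("./dist")              # a Directory is another core component
        )
```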
So originally we had this example from Kyle where he showed us running like a container. 00:19:51.520 |
And then we said, give me a scratch container. 00:19:53.800 |
Oh wait, give me an image from Alpine or from Node or from whatever. 00:19:58.720 |
And then you can layer on more things, like add a directory, your source code, to that; run/exec a test 00:20:04.780 |
command, whatever, right, chaining these things together. 00:20:07.340 |
So you notice I'm using this builder pattern here in code instead of in like a CLI. 00:20:15.620 |
In this case, he's using the Python SDK into that same API. 00:20:21.180 |
But the same things are happening either way: the same one unified cache where 00:20:27.240 |
all those cached operations live, and one API. 00:20:30.240 |
So that's why it becomes really easy to use different languages, different language SDKs, 00:20:35.740 |
because it's ultimately all one API under the hood. 00:20:39.240 |
And so we've got this code in the next step of this, where it says construct a pipeline. 00:20:45.800 |
And that has all of those functions like publish, build, tests, and that build_env one we looked at, 00:20:52.800 |
and build_env as in like your build environment. 00:20:56.080 |
And so when we run dagger functions, we'll have those shown up here with their descriptions and everything from the code. 00:21:02.180 |
So at this point, we've got the project that we want to build the agent in. 00:21:07.140 |
We've got some dagger functions that let us build and test the project. 00:21:15.240 |
And so now I'll zoom out again so you can see because I jumped to the next page here, 00:21:19.620 |
which is add an AI agent to an existing project. 00:21:23.420 |
And so we're starting from exactly where we just left off there with that previous guide where we pasted in that code. 00:21:30.180 |
We have our build, build_env, publish, test, and our dagger functions. 00:21:36.240 |
Lots of useful functions, but the expectation was the human was probably running those, right? 00:21:45.480 |
You'd kind of set that up, but nothing really agentic yet. 00:21:48.820 |
Right. So we have, you know, we're just running our unit tests or our build and creating a production container. 00:21:54.180 |
And this is what you as a developer or your CI environment are running these functions. 00:22:00.240 |
But now we want to create an agent for developers to interact with or, you know, to run anywhere. 00:22:06.300 |
But also our agents should be able to use these functions as well. 00:22:15.360 |
Because I mentioned, like, our agents want these refined environments where we give them access to exactly what tools they need to complete their tasks. 00:22:25.420 |
No, no, wait, I thought you were going to give agents, like, every possible tool. 00:22:28.420 |
You want to let them have, like, a thousand functions that do very powerful things and just let them run crazy. 00:22:37.420 |
Well, yeah, maybe not based on the smiles across the room. 00:22:43.420 |
What happens if the tool needed changes at runtime? 00:22:48.480 |
So the question is, what if the tool needed changes at runtime? 00:22:52.480 |
So in a lot of cases, we're working with MCPs. 00:22:54.480 |
We might have a lot of static tool kind of experience. 00:23:00.480 |
Well, so the main thing is, like, you want the right amount of tools for that agent to solve its task, whatever that task is. 00:23:07.420 |
Like, it needs the flexibility to be able to solve complex problems. 00:23:13.420 |
So it's not just going straight down a workflow and saying, okay, I do this, and I do this, and I do this, because you don't really need an AI to do that. 00:23:20.420 |
It needs the amount of tools to select, to choose its own path, to solve whatever task you throw at it. 00:23:26.420 |
But you don't want so many tools that now this is a generalized agent that does anything, right? 00:23:31.420 |
It needs to have some amount of focus so that it can solve a specific set of problems really well. 00:23:37.420 |
But we will see, like, in the agent loop that's going to happen, we will see the ability for the LLM to see this, like, menu of tools it has, and for it to select the right tool at the right time, given the context. 00:23:50.420 |
Yeah, but yeah, definitely, like, a big part of iterating and building these agents is determining, like, the scope of the tools. 00:23:58.420 |
So, like, the kind of the balance between flexibility and reliability, where you want it to be able to solve a breadth of problems. 00:24:07.420 |
So it needs a variety of tools that it might need. You don't know exactly what it's going to need ahead of time. 00:24:12.420 |
But you don't want to give it so many that now it's getting lost and confused and fails half the time, right? 00:24:17.420 |
And so that's what we're going to focus on here with this. 00:24:20.420 |
We're going to create a sub-module, basically, that is kind of its playground. 00:24:25.420 |
It's a specific set of tools that lets it edit our source code. 00:24:29.420 |
And so if you've worked with maybe agent frameworks in the past that have, like, file system tools, we're actually going to build that in our own code right now. 00:24:39.420 |
And it's just a few lines of code, so don't let me scare you with that. 00:24:42.420 |
But that's the idea, is, like, we're creating these building blocks, and as you scale this up, you can consume these that other people have written. 00:24:50.420 |
You don't have to write it all from scratch. But for the practice of building this as a workshop, we're going to write it all. 00:24:57.420 |
All right. What are we going to give the-- what are we putting in this workspace? 00:25:01.420 |
Yeah. So we do another dagger init here, and we say .dagger/workspace. 00:25:06.420 |
So we've created in our file system another subdirectory here, workspace under dot dagger. 00:25:15.420 |
And this one's just going to have-- just the functions that we want the agent to have access to. 00:25:19.420 |
So you can imagine it wants to read the files in your source tree. 00:25:23.420 |
So we have a function, and again, a file is one of those core components of dagger. 00:25:28.420 |
And so we just-- our workspace has a dagger directory, which is our source code. 00:25:33.420 |
And so we give it a function to read a file from that. 00:25:36.420 |
So it just gets the .contents() of that file. 00:25:40.420 |
And that's just the Dagger API to say, this is a path to a file. 00:25:48.420 |
Another function it needs is to be able to write files to the workspace, obviously. 00:25:52.420 |
And so it's a very similar API here, where we say, okay, give me the path and also the contents. 00:26:00.420 |
And then it needs to be able to know what files are in the workspace. 00:26:05.420 |
And it's just going to literally do a tree in that workspace. 00:26:08.420 |
So it can quickly see the file structure of your code. 00:26:12.420 |
And so now basically with those three, we have another one that we're going to look at in a second. 00:26:16.420 |
But with those three, now it can do all the code editing you might ask it to do within your file system. 00:26:23.420 |
And with more complex projects, you might need more advanced capabilities of these. 00:26:29.420 |
Like you might need to be able to read specific lines from a file or scan files or insert lines into files. 00:26:36.420 |
But with our kind of demo agent that we're building right now, it's like just the most basic 00:26:41.420 |
where we can just read and write files and list the files. 00:26:44.420 |
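Put together, the workspace module is only a few functions. This is a Python sketch matching what was just described; the names follow the agent quickstart, but details like pinning `alpine:3` and installing `tree` are illustrative assumptions, and it needs a running Dagger engine:

```python
import dagger
from dagger import dag, function, object_type


@object_type
class Workspace:
    """A sandboxed set of file tools the agent is allowed to use."""

    source: dagger.Directory  # the source code the agent works on

    @function
    async def read_file(self, path: str) -> str:
        """Return the contents of a file in the workspace."""
        return await self.source.file(path).contents()

    @function
    def write_file(self, path: str, contents: str) -> "Workspace":
        """Write a file and return the updated Workspace."""
        self.source = self.source.with_new_file(path, contents)
        return self

    @function
    async def list_files(self) -> str:
        """List the file tree so the agent can see the project structure."""
        return await (
            dag.container()
            .from_("alpine:3")
            .with_exec(["apk", "add", "tree"])
            .with_directory("/src", self.source)
            .with_workdir("/src")
            .with_exec(["tree", "."])
            .stdout()
        )
```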
So if the agent had access to this workspace object, it would see those functions as tools. 00:26:58.420 |
We haven't plugged the brain into the robot body yet. 00:27:01.420 |
So right now, if you think about the agent as a robot body with a brain plugged into it, we're building the body. 00:27:06.420 |
And the brain is going to come in just a second here. 00:27:09.420 |
Which could be any LLM, kind of a brain in a jar, right? 00:27:13.420 |
So our last one, finally, that I mentioned earlier, is test. 00:27:17.420 |
So when it generates this code in its workspace, it needs to be able to test to make sure that the code it generated is correct. 00:27:25.420 |
And if it didn't, it'll get the test failures and iterate until it's producing good code, right? 00:27:31.420 |
And so this is kind of the most important part of building this good agent is some sort of validation tool, 00:27:36.420 |
whether that's like a test or a lint or just something to check that what it's generated is correct. 00:27:45.420 |
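The validation tool is just one more function on that same workspace object. A sketch, assuming the hello-dagger app's npm unit-test script; your project's test command will differ:

```python
import dagger
from dagger import dag, function, object_type


@object_type
class Workspace:
    source: dagger.Directory

    # ...the file tools (read_file / write_file / list_files) live alongside this...

    @function
    async def test(self) -> str:
        """Run the unit tests; the agent reads any failures and iterates."""
        return await (
            dag.container()
            .from_("node:21-slim")
            .with_directory("/src", self.source)
            .with_workdir("/src")
            .with_exec(["npm", "install"])
            .with_exec(["npm", "run", "test:unit", "run"])
            .stdout()
        )
```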
There could be different levels of complexity. 00:27:47.420 |
But anyway, here, now we've got this workspace. 00:27:50.420 |
So if I go in my workspace, I have this exact code over here. 00:27:56.420 |
And if I run, I think I have the function down here. 00:28:00.420 |
If I say dagger -m, so now -m points to a specific Dagger module. 00:28:06.420 |
And I say functions; remember before, we ran those dagger functions. 00:28:09.420 |
If I run dagger -m .dagger/workspace functions, I'll see exactly those functions that we just created. 00:28:20.420 |
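Concretely, those commands are:

```shell
# Inspect the functions exposed by the workspace sub-module
dagger -m .dagger/workspace functions

# You can also call one directly, e.g. list the files
# (argument names are illustrative; `dagger functions` shows the real ones)
dagger -m .dagger/workspace call --source=. list-files
```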
So the next step is we want our main dagger module to have that as a set of tools I can use. 00:28:26.420 |
And so we're going to say dagger install that workspace module. 00:28:31.420 |
So now it's installed as a dependency of my main module. 00:28:34.420 |
So it has this object available and we'll see why that's really cool in a second. 00:28:40.420 |
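The install step is a one-liner from the repo root (path per the layout above):

```shell
# Add the workspace module as a dependency of the main module
dagger install ./.dagger/workspace

# dagger.json now lists it under "dependencies", and the main module's
# generated bindings gain a workspace constructor (e.g. dag.workspace(...))
```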
But basically all your dependencies of dagger, like I mentioned, like we have, you know, I can look at this real quick. 00:28:47.420 |
We have a big community of people building things with dagger. 00:28:52.420 |
And with that we have the Daggerverse, which is this massive index of, like, thousands of Dagger modules that do different specialized things. 00:28:59.420 |
But whenever you install one of these into a dagger module, it creates, if you look at my dagger JSON in this project, we have this list of dependencies. 00:29:11.420 |
And so your Dagger module has basically its own Dagger client that is the core Dagger API, in addition to all of your dependencies. 00:29:21.420 |
And so when you're writing code, like I mentioned earlier, natively in this language, you'll see all of these things available on the main Dagger client. 00:29:33.420 |
So basically we've built two modules already. 00:29:36.420 |
We've built this workspace module and the main module where we're doing our tests and builds. 00:29:42.420 |
And so we want to create the agent now that can take that workspace and our tests and we can actually ask for new features or modifications or whatever. 00:29:52.420 |
So that's the next step in this guide we're looking at where we want to create an agentic function. 00:29:58.420 |
Could we have mixed and matched, like, could we have written that workspace in TypeScript or in Go and still installed it into our Python module? 00:30:10.420 |
So like the other modules, any individual module can be written in any language and you can mix and match however you want. 00:30:19.420 |
But yeah, we see people do this a lot where they have different teams, like, you know, maybe there's a front end platform team and then a back end platform team. 00:30:28.420 |
And maybe these folks are TypeScript, these folks are Go, but they can interop and use their stuff. 00:30:34.420 |
So like everything, every task or workflow or whatever that you do with dagger is a function in your code. 00:30:43.420 |
And we're going to call this one develop because we're going to ask it, we're going to give it an assignment to complete our project. 00:30:56.420 |
And so this is going to give us the code to copy and I'll just open it in the editor so that it looks a bit nicer. 00:31:17.420 |
So we have a new function called develop and it takes an assignment. 00:31:22.420 |
And this annotated thing is just a Python way of getting us these doc strings for the parameters. 00:31:30.420 |
But in different languages, like we can see back here, if we go to Go, your arguments just look like this. 00:31:36.420 |
Where this little comment is basically the help string when you're using the Dagger CLI and say Dagger functions. 00:31:42.420 |
It'll say the assignment parameter is assignment to complete, which is really cool. 00:31:48.420 |
And we see our source here, which is like our project source. 00:31:51.420 |
But of course, we don't want to have to pass that as a parameter when we're calling our agent. 00:31:55.420 |
So there's this cool thing with Dagger where you say default path is slash and that's going to be the root of our Git repo. 00:32:01.420 |
So if we don't pass in explicitly a source parameter, it's just going to pass in our Git repo as that parameter. 00:32:08.420 |
And so now we just have to say develop, build me a cool new feature and it's going to kick off our agent. 00:32:14.420 |
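The function signature being described, sketched in Python. `Doc` and `DefaultPath` are the Python SDK's annotation helpers; the body is elided here because the agent wiring comes next:

```python
from typing import Annotated

import dagger
from dagger import DefaultPath, Doc, function, object_type


@object_type
class HelloDagger:
    @function
    async def develop(
        self,
        assignment: Annotated[str, Doc("Assignment to complete")],
        # DefaultPath("/") means: if no --source is passed on the CLI,
        # the root of the git repo is passed in as this Directory argument
        source: Annotated[dagger.Directory, DefaultPath("/")],
    ) -> dagger.Directory:
        """Ask the agent to complete the assignment against the source tree."""
        ...
```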
So let's look at the components of the agents real quick. 00:32:17.420 |
So the environment is like the main thing, right? 00:32:22.420 |
And hopefully a lot of people are using the same word in the same way. 00:32:26.420 |
But you have your robot body and the brain, like Jeremy said, where your environment is basically not just the tools that it's using to complete the task, but also your inputs and outputs for the agent. 00:32:42.420 |
Any objects or state that it's working with, all of this is the environment. 00:32:47.420 |
And so we want to construct this environment and then plug in the LLM, which is our brain and say, here's your environment, here's your task slash prompt and complete the task. 00:32:59.420 |
And so this is the environment we put together where the assignment is a string input. 00:33:07.420 |
So we have this cool kind of way of declaratively building your prompt, right, where our assignment is the assignment to complete, but this workspace input is a workspace with tools to edit and test code. 00:33:19.420 |
So now our agent, when we connect these things, will see this as the description of this thing it can use. So we're building up this prompt by annotating our code, basically. 00:33:32.420 |
And so with this workspace input thing, that's referring to the sub module we just created. 00:33:41.420 |
So if we called that something else like Foo workspace, and we installed that, this would be with Foo workspace input, right? 00:33:47.420 |
We're dynamically generating all of these functions for the environment type to say, any objects of my dependencies I can have as an input or an output of my environment. 00:34:01.420 |
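In code, the environment construction being described is roughly the following. The API names (`dag.env()`, `with_string_input`, the dynamically generated `with_workspace_input`/`with_workspace_output`, `dag.llm()`) follow the Dagger agent quickstart; the prompt file name and output extraction are assumptions:

```python
# inside the develop() function of the main module (sketch)
environment = (
    dag.env()
    # typed input: the assignment string, with a description the LLM sees
    .with_string_input("assignment", assignment, "the assignment to complete")
    # typed input: a Workspace object, with tools to edit and test code
    .with_workspace_input(
        "workspace",
        dag.workspace(source=source),
        "the workspace with tools to edit and test code",
    )
    # typed output: we expect a Workspace back when the task is done
    .with_workspace_output("completed", "the workspace with the completed assignment")
)

# plug the brain into the body: give the LLM the environment and the prompt
work = dag.llm().with_env(environment).with_prompt_file(
    dag.current_module().source().file("develop_prompt.md")
)

# pull the typed output back out of the agent's environment
completed = work.env().output("completed").as_workspace()
```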
And so we noticed that we also have a workspace output, which is the completed task because all objects in Dagger are immutable. 00:34:11.420 |
And so I give it an object, it's going to do a bunch of things and give me back a different object that's completed task. 00:34:19.420 |
And maybe that's, like, a boring detail, but the main thing is the thing I passed in is still going to be the same, but there's going to be a new version that it gives me back, called completed. 00:34:30.420 |
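That immutability point is the same pattern you'd get from a frozen value type in plain Python: operations hand back a new object and leave the input untouched. A small stdlib-only illustration of the idea (not Dagger code):

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Snapshot:
    """A tiny immutable 'workspace': a tuple of (path, contents) pairs."""
    files: tuple = ()

    def with_file(self, path: str, contents: str) -> "Snapshot":
        # returns a NEW Snapshot; self is unchanged
        return replace(self, files=self.files + ((path, contents),))


original = Snapshot()
completed = original.with_file("feature.py", "print('new feature')")

print(len(original.files))   # 0: the object we passed in is unchanged
print(len(completed.files))  # 1: the 'completed' version has the new state
```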
I mean, I think a lot of people are dealing with this kind of stuff now, right? 00:34:33.420 |
With the different APIs and like doing a bunch of JSON parsing and validation, right? 00:34:38.420 |
And trying to-- you know, there's different frameworks doing it in different ways. 00:34:41.420 |
But you could just think of it as this is our way of saying like, here are the typed inputs. 00:34:47.420 |
We're expecting a typed output back in the end. 00:34:51.420 |
And this gives us a way to ensure that we're getting what we actually asked for. 00:35:00.420 |
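As a rough sketch, the environment declaration described here looks something like this in the Python SDK (it needs a Dagger engine and the workspace module installed to run; the with_workspace_* methods are generated from the installed module's name, so treat the exact names as illustrative):

```python
# Sketch only: assumes the workspace module from the quickstart is installed.
env = (
    dag.env()
    # Typed string input; the description doubles as prompt documentation.
    .with_string_input("assignment", assignment, "the assignment to complete")
    # Generated from the installed workspace module; a module named
    # foo-workspace would give you with_foo_workspace_input instead.
    .with_workspace_input(
        "workspace",
        dag.workspace(source=source),
        "the workspace with tools to edit and test code",
    )
    # Declared typed output: the immutable workspace we expect back.
    .with_workspace_output("completed", "the workspace with the completed assignment")
)
```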
So we have the environment and the prompt and we give both of those the agent basically. 00:35:05.420 |
So the prompt, I believe, is just a bit lower here if you're following along here. 00:35:11.420 |
So it wants you to create a dot dagger slash develop prompts dot markdown. 00:35:17.420 |
So I'll just open it again over here on my editor. 00:35:23.420 |
And so we're saying you're a developer on this project. 00:35:26.420 |
You're going to get an assignment and the tools to complete it. 00:35:32.420 |
And so this is basically-- it's going to be templated in by the assignment in our environment. 00:35:40.420 |
So it's going to drop that right in that prompt so that the agent itself doesn't have to go read this other variable in its environment and knows, OK, my assignment is make this cool new feature. 00:35:51.420 |
And then we have a bit of prompt structure here, right, where if you've built a lot of these agents, you've probably kind of refined how you build your prompts and what those structures look like. 00:36:01.420 |
This is a really simple agent, so it doesn't have a ton of structure. 00:36:04.420 |
But we do say before you write code, make sure you analyze the workspace to understand the project structure. 00:36:11.420 |
So it's not just going to create some garbage and be like, cool, I made this new file, but I didn't look at the project first. 00:36:17.420 |
Don't make unnecessary changes because sometimes you'll see, especially certain models without the right constraints, will go make the change you asked for and then change for other things and be like, cool, it looks good, ship it. 00:36:32.420 |
So we do have to ask it to run the test once it's made those changes. 00:36:37.420 |
So it's not just going to see the test function and be like, oh, I should probably call that. 00:36:41.420 |
We want to make sure to tell the LLM, like, OK, you have a tool that can validate the code you're writing. 00:36:48.420 |
And then don't stop until you've completed the assignment and the tests pass. 00:36:52.420 |
So this is telling it, you know, keep working until you've satisfied what I asked it to do and the tests pass. 00:37:09.420 |
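The prompt file itself is plain markdown; paraphrasing the quickstart's version (Dagger substitutes $assignment from the environment's string input, per its prompt-variable support):

```markdown
You are a developer on this project. You will be given an assignment
and the tools to complete it.

Your assignment is: $assignment

- Before writing code, analyze the workspace to understand the project structure.
- Do not make unnecessary changes.
- Run the tests after making your changes.
- Do not stop until the assignment is complete and the tests pass.
```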
Because it's like, what we find, we end up running evals on these things, right? 00:37:13.420 |
Where we'll try different LLMs plugged in and then we'll iterate on the prompts until we're getting the results, the consistency we want across the different ones. 00:37:24.420 |
And, yeah, it comes from experience of knowing how they veer off track and et cetera, how we're writing these. 00:37:31.420 |
Yeah, and that's like what I mentioned earlier, like using something like Dagger Cloud to be able to see the visualization of all the work the agent's doing. 00:37:40.420 |
If I'm frequently seeing like, okay, the agent's just calling write file and then returning, I know that, okay, I have to tell it to look at the code, I have to tell it to test the code. 00:37:49.420 |
And that's going to be different for every model. 00:37:51.420 |
And especially like the prompt structure is different for different models. 00:37:54.420 |
Is it possible to implement things like reflection agents as well to police each other? 00:38:01.420 |
So the question is like, can you implement like reflection agents to police each other? 00:38:04.420 |
And that's something, I probably have an example of that I can show at the end if we have time. 00:38:08.420 |
But yeah, like, remember, with this, each agent is just a Dagger function. 00:38:14.420 |
And so you can create all these agents layered on other agents. 00:38:18.420 |
And even in your environment, you can actually put an agent in the environment and say, hey, you have this agent at your disposal if you need it to do something, right? 00:38:30.420 |
But it's like similar to the concept of like Google's A2A where you say, if you're not familiar with that, it's basically this structure where you tell an agent, listen, you can do these things. 00:38:43.420 |
But you also can talk to these other agents, and that's what each of these other agents do. 00:38:47.420 |
And so if you need to, you can reach out to them and say, hey, other agent, I need you to tell me how to write TypeScript. 00:38:57.420 |
It's all just piecing functions together, right? 00:39:00.420 |
It's just the same code we've always been writing, but now there's an LLM component. 00:39:05.420 |
So now this line right here, line 94, most important line of the workshop because this is the agent where we've actually taken our Dagger client and .llm. 00:39:14.420 |
So this is another type within the Dagger client. 00:39:17.420 |
You should make it bigger just for a second, you know? 00:39:20.420 |
So it's off the screen, since it's so important. 00:39:22.420 |
I feel like it's not even getting that much bigger. 00:39:29.420 |
So like, we've said, all right, from the Dagger client, we need this LLM type. 00:39:38.420 |
So now we've got this thing work that is a Dagger LLM with these things. 00:39:48.420 |
If you need your pictures, you can get one with Kyle and commemorative. 00:40:00.420 |
Like, that's literally, because we've asked it, like, we've said in this prompt. 00:40:12.420 |
And so now this work variable in our code is the completed work. 00:40:18.420 |
And so from that work, we can look back at the environment in that and say, 00:40:24.420 |
Do you remember in our environment, we defined a workspace output called completed. 00:40:39.420 |
And so from that workspace, we want to grab the completed directory, 00:40:43.420 |
So if you remember in our workspace object here, it has an attribute called source, which 00:40:51.420 |
And so this is all like a few layers of complexity. 00:40:55.420 |
But we've said in that workspace, we have a source thing. 00:40:58.420 |
And ignore the node modules folder because maybe that's going to break on my machine. 00:41:03.420 |
And then now that we've got that, just to make triple sure. 00:41:07.420 |
Because remember, I mean, we did tell it three times to run tests. 00:41:12.420 |
And in our code, we're saying, all right, now run the test. 00:41:15.420 |
Because this is all the same code that we're using throughout our project to run tests. 00:41:22.420 |
And if that fails, you could maybe kick it back to the LLM and say, hey, this failed. 00:41:29.420 |
So that's like trying to put the agents on rails or give them guard rails, whichever metaphor 00:41:35.420 |
But it's like, you know, that's pretty key because we're trying to like let them do the 00:41:40.420 |
creative stuff they do, the generative stuff they do, like write some code for us. 00:41:43.420 |
But we need to enforce certain standards, right? 00:41:49.420 |
So we don't just dump that garbage back to your machine. 00:41:54.420 |
And remember, all these changes that it was making as it's iterating on these things, that 00:41:59.420 |
It's not just changing your file system as it's doing its work. 00:42:02.420 |
And that's a key thing, too, because now maybe you have 10 of these agents running. 00:42:07.420 |
They all have their own sandboxed workspace where they're editing these files. 00:42:13.420 |
And before we do mess up our local state, we triple check that the tests pass. 00:42:18.420 |
And then we say, okay, return that completed directory. 00:42:20.420 |
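Tying those last few steps together, the whole develop function is roughly this (a sketch against the Dagger Python SDK; the generated binding names and the prompt filename are illustrative, not the quickstart's exact code):

```python
import dagger
from dagger import dag, function, object_type


@object_type
class MyModule:
    @function
    async def develop(
        self, assignment: str, source: dagger.Directory
    ) -> dagger.Directory:
        # The environment: typed inputs plus the declared "completed" output.
        environment = (
            dag.env()
            .with_string_input("assignment", assignment, "the assignment to complete")
            .with_workspace_input(
                "workspace",
                dag.workspace(source=source),
                "the workspace with tools to edit and test code",
            )
            .with_workspace_output(
                "completed", "the workspace with the completed assignment"
            )
        )
        # The important line: plug the environment and prompt into an LLM.
        work = (
            dag.llm()
            .with_env(environment)
            .with_prompt_file(dag.current_module().source().file("develop_prompt.md"))
        )
        # Pull the declared output back out of the completed environment.
        completed = work.env().output("completed").as_workspace()
        completed_directory = completed.source().without_directory("node_modules")
        # Triple-check: run the tests ourselves before returning anything.
        await dag.workspace(source=completed_directory).test()
        return completed_directory
```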
And so now this function, and we'll just triple check here on the guide side. 00:42:27.420 |
We say Dagger functions, and we have this develop one that shows here. 00:42:33.420 |
So now if I go into Dagger shell, which is hopefully what it asks us to do, it is. 00:42:38.420 |
I wrote this, so we're just checking myself here. 00:42:45.420 |
Now before I do that, one thing I don't think I called out at the very start here was that we had to configure an LLM provider. 00:43:00.420 |
Ollama, Docker Model Runner, like literally any. 00:43:06.420 |
So you do have to configure some environment variables to be able to, for Dagger, to make API calls to that. 00:43:23.420 |
Configuration slash LLM shows all the different options on how to configure things. 00:43:30.420 |
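For example, configuring Anthropic comes down to a couple of environment variables (names per the Dagger LLM configuration docs; the values here are placeholders):

```
# Dagger reads these from the host environment
export ANTHROPIC_API_KEY=<your-api-key>
# Optionally pin a model instead of the provider default
export ANTHROPIC_MODEL=claude-3-5-sonnet-latest
```

With Dagger's secret provider integrations, the value can also be a secret reference (for example a 1Password op:// reference) instead of a literal key.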
I'm just going to type something really scary. 00:43:38.420 |
So Dagger also has cool secrets provider integrations. 00:43:42.420 |
So I don't have my actual API key echoed there. 00:44:04.420 |
So now when I say Dagger, it's going to take a second to spin up. 00:44:10.420 |
And this is the part where if you're struggling a bit with Wi-Fi, this might be a bit tough. 00:44:16.420 |
Because if you are following along, we're going to push this to GitHub in a second. 00:44:27.420 |
So now if I say LLM pipe model, for example, where you see my little one password prompt. 00:44:36.420 |
It's going to take a second to think about it. 00:44:38.420 |
And so with each model provider, we have a default model. 00:44:45.420 |
But right now, by default, it's going to use Claude 3.5. 00:44:48.420 |
So maybe we're not going to get the best results. 00:45:13.420 |
We have an optional argument source, which, again, it's just going to be my repo. 00:45:17.420 |
And this is going to give us back a directory. 00:45:21.420 |
I just say develop and then do the assignment. 00:45:25.420 |
And then we didn't actually look at the project we're dagorizing yet. 00:45:32.420 |
So let's ask it to, I think, in here we say the example thing is to make the main page blue. 00:45:41.420 |
And I'll say make the main page say hello workshop people. 00:45:59.420 |
And the little robot head of the model, which is Claude 3.5, Sonnet, saying, cool, let me do these things. 00:46:06.420 |
And we can actually see it calling tools, right? 00:46:16.420 |
And so it figured out, okay, I can look at my files. 00:46:19.420 |
Now here's this specific file I might need to edit. 00:46:27.420 |
And while this is running, let me just open up cloud. 00:46:32.420 |
So we can actually see the cloud visualization of this. 00:46:45.420 |
I think my Wi-Fi is failing me on this auth flow. 00:46:49.420 |
But while it's running, we'll just watch this. 00:46:54.420 |
So that you're getting streaming to your terminal UI and the web UI. 00:46:58.420 |
So we see it call write file with some new file contents. 00:47:01.420 |
And now it says, now that we've made the change, let's run the tests. 00:47:05.420 |
And this is the part that really might fail on this Wi-Fi. 00:47:07.420 |
Because it's doing an npm install and downloading a bunch of npm modules or node modules. 00:47:15.420 |
We'll just let it go and we'll talk through it. 00:47:17.420 |
But we can see that our agent is actually, it wrote the files. 00:47:21.420 |
And then it's writing, it's running the tests. 00:47:34.420 |
So we see it's saying like with exec npm install, with exec npm run test:unit. 00:47:40.420 |
If we go back to our workspace in our test function, that was part of it. 00:47:48.420 |
So this is like the agent just had to call test. 00:47:51.420 |
And we've defined what happens when you call test. 00:47:57.420 |
Sometimes you're like, you know, make sure to test it. 00:47:59.420 |
And it's like, I'm going to try a pie test with these crazy options. 00:48:02.420 |
And you're like, why did you think that was going to work? 00:48:04.420 |
So instead you just give it exactly what it should be. 00:48:06.420 |
We could give it more flexibility in how it runs things. 00:48:09.420 |
But in this case, we already know this is how you run tests in the project. 00:48:16.420 |
That's probably the biggest thing in creating reliable agents with Dagger is giving flexibility 00:48:23.420 |
where it's important for completing tasks and removing it where you know exactly how things are meant to happen. 00:48:30.420 |
So it doesn't need the freedom to just run any command in a container. 00:48:34.420 |
We know, okay, all you need to do is modify files and run this test function. 00:48:39.420 |
And for more complex agents, maybe there's some other functions there too. 00:48:44.420 |
But for this one, like, this is the amount of freedom we've given it. 00:48:59.420 |
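The workspace's test tool, for reference, is just ordinary Dagger container code along these lines (a sketch; the base image and script names are assumptions, and the quickstart's actual module may differ in details):

```python
import dagger
from dagger import dag, function, object_type


@object_type
class Workspace:
    source: dagger.Directory

    @function
    def write_file(self, path: str, contents: str) -> "Workspace":
        """Write a file into the workspace (returns a new immutable workspace)."""
        self.source = self.source.with_new_file(path, contents)
        return self

    @function
    async def test(self) -> str:
        """Run the project's unit tests the one way we know is correct."""
        return await (
            dag.container()
            .from_("node:22")  # assumption: a Node base image fits this project
            .with_directory("/app", self.source)
            .with_workdir("/app")
            # The agent never chooses the command; it just calls test.
            .with_exec(["npm", "install"])
            .with_exec(["npm", "run", "test:unit"])
            .stdout()
        )
```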
And so this is like the visibility that we want to see when we're running these agents. 00:49:04.420 |
And we saw the assignment is to make the main page say, "Hello, workshop people." 00:49:11.420 |
Now, Claude 3.5 is looking at this and saying, first, let's look at what objects we have and 00:49:16.420 |
check out the workspace, make the changes, and then run the tests. 00:49:20.420 |
It runs list objects, which lets it see what it has in its environment, which is, like, 00:49:30.420 |
And then it's going to say list method, so it's going to see what it can do with the workspace. 00:49:40.420 |
And so this is, like, this kind of visibility into the agent's environment where we say, 00:49:44.420 |
"Oh, there's this workspace write file function that gives it back a workspace type." 00:49:50.420 |
Oh, you mean, so we didn't have to write any of the JSON kind of, you know, description 00:49:57.420 |
So we just gave it that Dagger module, and then it all got wired up into the agent's 00:50:11.420 |
And remember, the way that it does that in our workspace code was it creates, like, an Alpine 00:50:17.420 |
And so we can see the tracing of that, too, which is, like, the underlying actions of the 00:50:24.420 |
But we also see the return of that, which is what the agent sees. 00:50:34.420 |
To make it say that, we should probably modify this one or this one. 00:50:43.420 |
It's going to see this whole HelloWorld.vue file. 00:50:49.420 |
And it says, "Okay, I don't think that was it. 00:50:54.420 |
And then eventually it says, "I see that App.vue uses the hello world component." 00:51:04.420 |
It's going to change App.vue to pass a different message to it. 00:51:18.420 |
So we're going to push it to GitHub in a second. 00:51:24.420 |
So this is the part that it's currently at in my shell. 00:51:26.420 |
Where it's been running for like five minutes. 00:51:29.420 |
So yeah, that's the visibility part I'm talking about. 00:51:31.420 |
Where we can see exactly what the agent sees and what's happening under the hood. 00:51:41.420 |
And yet, it's all inside that Dagger engine in containers. 00:51:49.420 |
So Dagger Cloud is just showing me the visualization. 00:51:53.420 |
This is on my machine, which is why it's still running. 00:51:57.420 |
And this is like because of the connection we have. 00:51:59.420 |
And because of, you know, whatever the load we're putting on it. 00:52:02.420 |
But the other thing to think about is it could be like we're using Python here. 00:52:12.420 |
But the workflows that Kyle's writing are in Python. 00:52:17.420 |
You could have a laptop, say, or any server that just has Dagger and a connection to the internet. 00:52:27.420 |
So that's why the environments, environments is not just for the agent and developer. 00:52:33.420 |
So you could have a brand new laptop with just Dagger. 00:52:36.420 |
And it would work, because it's using a Python runtime container for the workflow he wrote in Python. 00:52:46.420 |
You don't need to struggle with VMs or any other versions or whatever. 00:52:51.420 |
And then inside of that somewhere, there's Node container that happened, right, in order to create this environment. 00:52:59.420 |
And that, again, it's all just nested inside of there and cached and everything else automatically. 00:53:06.420 |
So you can kind of just do this with a very bare bones machine set up and everything will just work. 00:53:13.420 |
So what we can see that we probably won't get to run this part locally just because I don't, we'll come back to it if it finishes. 00:53:20.420 |
But anyway, I'll just describe this flow here where we say, okay, we're in shell. 00:53:25.420 |
So that happens, like we ran that develop thing and it gave us back something. 00:53:30.420 |
But now in Dagger, like I keep saying, we're in shell Dagger. 00:53:34.420 |
When you type Dagger and get into that, this view, it is a shell just like bash, right? 00:53:41.420 |
Where we can actually do things like create variables and chain things together. 00:53:46.420 |
And so what we could do if this finished is say, okay, let's actually save that, the output of this thing. 00:53:57.420 |
And then we could pass that to our other functions. 00:53:59.420 |
Because remember, they, they default to using our get source from our machine. 00:54:03.420 |
But we could, we could pass in that optional directory to all of our functions to say, use this directory instead. 00:54:09.420 |
So now I could actually run the whole thing as like a local, like I could see the results of this before even saving it to my machine. 00:54:19.420 |
I don't know why I'm in this folder, but we'll go to the correct directory. 00:54:33.420 |
because what, what I can do is I can run the output from the agent as like, I can run the whole site. 00:54:42.420 |
And I can see what it's built before even save it back to my disk to say, yes, this is a good solution. 00:54:48.420 |
So once we get this connection here, just waiting on pipes to connect to each other, and we'll, we'll let that run for a second. 00:55:01.420 |
But the main thing is we can pass this around. 00:55:04.420 |
We can run all of our functions with that completed directory. 00:55:07.420 |
And then finally say, all right, we say export. 00:55:10.420 |
That saves it back to your disk and we're done. 00:55:13.420 |
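In Dagger shell that chaining looks roughly like this (assuming, as above, that the module's functions take an optional source directory; exact flag names are illustrative):

```
# capture the agent's output in a variable instead of exporting it right away
completed=$(develop "make the main page blue")
# feed it back into other functions, e.g. re-run the tests against it
test --source $completed
# once you're satisfied, write it back to the host
$completed | export ./
```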
So the next step is, all right, we're good with that. 00:55:15.420 |
We know how to use this agent locally to ask it to make cool tasks. 00:55:20.420 |
But my, my people requesting features on my site, they don't have this installed. 00:55:26.420 |
They don't have Docker and Dagger installed on the machine. 00:55:31.420 |
They just want to go to GitHub and say, make this new feature. 00:55:35.420 |
And it sounds ambitious, but it's really quick. 00:55:38.420 |
So we've got plenty of time to look at the solution here. 00:55:44.420 |
And so the first thing we're going to do is actually install another dependency from the Daggerverse. 00:55:55.420 |
But we saw it installed earlier when you showed us that Dagger JSON. 00:56:03.420 |
So if we search for that, and we have this module called GitHub issue, it's got a bunch of functions that let us do things with GitHub issues, like we can list GitHub issues in a repo, we can list the comments on a particular issue, we can write comments, we can create pull request comments, all kinds of things with GitHub issues and GitHub pull requests. 00:56:29.420 |
So with this module where I've just basically used the GitHub Go SDK in this Go module to connect my Dagger functions to the API calls, I can install this in my Python project. 00:56:43.420 |
And now I can have the ability to work with GitHub issues. 00:56:49.420 |
And so we create, we add another function to our code called develop issue. 00:56:55.420 |
So remember, we created develop, now it's develop issue. 00:56:59.420 |
And all this is going to do is say we have a GitHub issue out there with our feature requests. 00:57:03.420 |
We want to read that GitHub issue, give it to our agent, the agent's going to do all its things, then give us back a directory. 00:57:09.420 |
We're going to take that directory and make a pull request. 00:57:12.420 |
Oh, so like really similar to like the assignment that we gave it, instead, it's going to be reading the GitHub issue. 00:57:18.420 |
And instead of just getting the directory back ourselves, we put the directory into a PR. 00:57:24.420 |
And so this is the entire thing here where we're not writing a new agent to do this. 00:57:29.420 |
We're just, we're wrapping it with some other pieces to say, go here to get the assignment. 00:57:34.420 |
Once it's done, put that completed work over here, where the "go here to get the assignment" part means reading a GitHub issue. 00:57:42.420 |
And then we get that assignment and I can open the editor. 00:57:48.420 |
Okay, so we get that GitHub issue from that issue. 00:57:56.420 |
We pass that to our develop function because this is our agent and say, here's your assignment. 00:58:02.420 |
Here's the source, which came from that same defaulted input argument. 00:58:08.420 |
And then we get the issue title and URL, which is going to be really cool because then we actually, in GitHub, automatically have the new pull request linked to the GitHub issue. 00:58:22.420 |
Just by having this, the body say closes this issue. 00:58:28.420 |
And so this whole thing, like you can run this part locally too. 00:58:33.420 |
But it just takes the GitHub token and an issue and the repo name so it knows where to put the PR. 00:58:53.420 |
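A sketch of that wrapper, with the github-issue module's function names assumed for illustration (the real module's bindings may differ):

```python
@function
async def develop_issue(
    self,
    github_token: dagger.Secret,
    issue_id: int,
    repository: str,
    source: dagger.Directory,
) -> str:
    # Read the assignment from the GitHub issue (function names assumed).
    issue = dag.github_issue(token=github_token).read(repository, issue_id)
    assignment = await issue.body()
    # Run the same develop agent we already built.
    feature = await self.develop(assignment, source)
    # Open a PR; a body of "Closes <url>" links it back to the issue.
    title = await issue.title()
    url = await issue.url()
    return await dag.github_issue(token=github_token).create_pull_request(
        repository, title, f"Closes {url}", feature
    )
```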
The first two things we need to do is in the repo, we need to create two repo secrets. 00:59:03.420 |
But if you want to see all those things happen in Dagger Cloud, you just put that token in the environment. 00:59:07.420 |
And then whatever LLM key you're using, so the same one I used locally, is going to be in that repo secret. 00:59:13.420 |
So if I go over here in my repo and I say, and I zoom out a bit so I get all the buttons, I say settings. 00:59:23.420 |
And then down here under Secrets and Variables, Actions, I have two repo secrets here that we just saw from that screenshot. 00:59:39.420 |
And then there's one more thing, which is, let's see, that's how we get our Dagger Cloud token to paste in there. 00:59:48.420 |
There's a little checkbox we have to press over here to let GitHub Actions create PRs, because that's disabled by default. 00:59:57.420 |
So if I go under, okay, under Actions, General, and then at the very bottom, there's this checkbox, allow GitHub Actions to create and approve pull requests. 01:00:18.420 |
This is a thing you can copy paste, and I'll open it up over here. 01:00:29.420 |
If you ever haven't used GitHub Actions, I'll explain this real quick. 01:00:32.420 |
But it's basically a CI platform, and we have, with this configuration, we tell it when events happen. 01:00:43.420 |
So in this case, we say, when a GitHub issue is labeled, and the label is called develop, then run this command. 01:00:52.420 |
And this command is the Dagger call develop issue with those arguments, like GitHub token, the issue ID, and the repo. 01:00:59.420 |
And these things are all coming from GitHub Actions automatically. 01:01:02.420 |
So, like, the environment's GitHub token is created here, where we say, this command needs a GitHub token with permissions to write contents. 01:01:23.420 |
We've given it the API key for our LLM and the cloud token. 01:01:27.420 |
And so now, just by running this Dagger call, that connects the dots where GitHub Actions, whenever we create that label, is going to run that Dagger function. 01:01:36.420 |
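Roughly what that workflow file looks like (the action version, its inputs, and the env:// secret syntax may differ slightly between Dagger releases):

```yaml
name: develop
on:
  issues:
    types: [labeled]
jobs:
  develop:
    if: github.event.label.name == 'develop'
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - uses: dagger/dagger-for-github@v6
        with:
          verb: call
          args: >
            develop-issue
            --github-token=env://GITHUB_TOKEN
            --issue-id=${{ github.event.issue.number }}
            --repository=${{ github.repository }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          DAGGER_CLOUD_TOKEN: ${{ secrets.DAGGER_CLOUD_TOKEN }}
```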
And that Dagger function has all the capabilities to run the agents and open a PR. 01:01:40.420 |
So that's, like, us in the Dagger shell when we call, when we are running, like, the develop function or some other build function or whatever. 01:01:48.420 |
This is just having GitHub Actions run the develop issue function for us. 01:01:57.420 |
So we're having GitHub Actions do it because we want this flow to be automated inside GitHub. 01:02:08.420 |
So this just happens to be GitHub Actions because we're already in a GitHub repo. 01:02:13.420 |
It's free because this is, like, we're not using any crazy compute to run this thing. 01:02:19.420 |
And most of the hard stuff is happening on your LLM that you're paying for somewhere else. 01:02:22.420 |
And they have better internet connection at GitHub than we did today. 01:02:32.420 |
And we want to -- what did we ask for before? 01:02:35.420 |
We asked for, like, make the main page say -- 01:02:48.420 |
And remember, this whole thing kicks off when I add the label develop. 01:02:52.420 |
And so I've already run this on this repo and obviously made a typo as well at one point. 01:02:57.420 |
But if you don't have it there, you can just say foo. 01:03:00.420 |
And you'll have a button to say create a new label develop. 01:03:12.420 |
And so now that kicks off GitHub Actions to call my dagger thing. 01:03:27.420 |
Because remember, I put that cloud token in there. 01:03:29.420 |
Because this stuff is all too hard to see flying by my screen in real time. 01:03:37.420 |
But it could be any kind of, you know, orchestration. 01:03:49.420 |
How much, if any, like prompt modification for you guys? 01:03:52.420 |
Is it literally just what's in that one markdown file? 01:03:55.420 |
Or do you add, like is it aware that it's in Dagger? 01:03:59.420 |
So we have, the question is like how much prompt modification does the agent have? 01:04:04.420 |
Dagger has its own system prompt that it adds. 01:04:06.420 |
That kind of guides it towards like how you use tools within Dagger. 01:04:10.420 |
So it knows like call the select methods and list functions and those things we saw it doing. 01:04:18.420 |
You can get rid of that system prompt if you want to. 01:04:23.420 |
But then you have to make further prompt edits, because the agent won't know the right way to call things. 01:04:31.420 |
How do we correct after the developer before the rest of the stuff starts? 01:04:36.420 |
So if the agent does something, if it calls develop and it runs and it produces something that we say, 01:04:41.420 |
How do we go back and say, make these changes? 01:04:51.420 |
So you can edit the completed source if you want. 01:04:54.420 |
If you see the source and say, oh, it needs one more change. 01:04:57.420 |
Or I can show you another function where we say, we have an ability to give it more feedback 01:05:05.420 |
Here's some more changes to make because you didn't get it quite right. 01:05:29.420 |
So the question was giving the agent access to the test directory. 01:05:36.420 |
And I think in our workspace, we just give it the full source. 01:05:42.420 |
So it could get down in there if it wanted to. 01:05:46.420 |
I think it's kind of a funny thing, like making sure the tests pass, because sometimes the agent broke the tests, 01:05:56.420 |
and sometimes we actually changed the behavior and the tests need to be updated. 01:06:01.420 |
So you might want to maybe have that as part of your prompting or part of your validation. 01:06:05.420 |
Say, make sure the agent didn't change the test. 01:06:08.420 |
Or it's kind of tough to decide, like, whether that's correct or not. 01:06:14.420 |
I noticed that there wasn't a Dagger install step, because it's hidden behind this little action. 01:06:27.420 |
But it's really just -- there's a Dagger for GitHub action. 01:06:36.420 |
But this installs Dagger in your GitHub actions runtime, basically. 01:06:46.420 |
The dependencies, like when you install the dependencies, 01:06:57.420 |
So this is -- in our Dagger JSON, we have all of our dependencies listed. 01:07:04.420 |
And so you don't have to say, like, Dagger install or anything. 01:07:07.420 |
When we say Dagger install, it adds it to this. 01:07:12.420 |
We don't have to do anything like NPM install like that. 01:07:15.420 |
It just -- it knows to make sure your client's generated. 01:07:19.420 |
But that is the nice thing about having those dependencies, you know, 01:07:24.420 |
in the -- in a file saved in Git, you know, alongside the project. 01:07:29.420 |
So because, like, what we've done essentially, like, 01:07:32.420 |
when we first got this project, this Vue app project, 01:07:38.420 |
It was just, like, an app that you could run. 01:07:40.420 |
And then we said, oh, well, let's Dagger init in this thing. 01:07:45.420 |
where we started developing our build and test functions, right? 01:07:48.420 |
Kind of like our tools for development or for CI, just alongside. 01:07:52.420 |
And then in there is where we've been installing more modules, 01:07:56.420 |
like the workspace, the GitHub issues module, like anything else you would need. 01:08:04.420 |
So the thing's now like this fully loaded, like, Daggerized project. 01:08:07.420 |
So it's kind of carrying around its own tools on its back for working, you know, 01:08:13.420 |
just for a developer to use, or platform engineer to use, or for an agent to use. 01:08:19.420 |
Yeah, we're just, like, waiting for things to load here. 01:08:23.420 |
Have you gotten anything, like, Dagger and Dagger, 01:08:27.420 |
where you have it spinning up, like, agent fleets, maybe different roles? 01:08:34.420 |
So that's -- I mean, myself as someone that builds a lot of Dagger code, 01:08:37.420 |
I have agents that need to write Dagger code. 01:08:40.420 |
And to reliably validate those things, they need basically Dagger inside of Dagger. 01:08:46.420 |
So that's exactly, like, a thing that you can do, and I can even pull up if we go to -- 01:08:52.420 |
we're a bit short on time, but we're basically done with that guide. 01:08:58.420 |
But we have an examples thing here on the docs. 01:09:05.420 |
But one of the really cool ones that I like the most because -- 01:09:11.420 |
I thought you were going to show mine, but that's fine. 01:09:25.420 |
There's this repo under my GitHub, kpenfound/daggerprogrammer. 01:09:28.420 |
And this thing is something I use to -- like, in the docs, we saw those tabs of all the different 01:09:35.420 |
And so every -- whenever I write a new guide, I have to have it in five languages. 01:09:38.420 |
And so this agent can take it in one language and produce all the languages. 01:09:43.420 |
And that's just an agent that knows how to write Dagger. 01:09:46.420 |
And so to do that, it has a lot of cool things in addition to be able to, like, run Dagger 01:09:53.420 |
So if we look at the code for that, it's just like -- 01:09:59.420 |
And when it runs tests, it runs the Dagger thing. 01:10:04.420 |
And there's this flag, privileged nesting, so that the inner container can talk to the engine. 01:10:18.420 |
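In the Python SDK that flag is the experimental_privileged_nesting argument to with_exec; a sketch (assuming a base image that ships the Dagger CLI, and a module in /src with a test function):

```python
@function
async def validate(self, source: dagger.Directory) -> str:
    return await (
        dag.container()
        .from_("some-image-with-dagger-cli")  # assumption: image includes the Dagger CLI
        .with_directory("/src", source)
        .with_workdir("/src")
        # experimental_privileged_nesting lets this exec talk to the outer
        # Dagger engine, so generated Dagger code can be run with `dagger call`.
        .with_exec(
            ["dagger", "call", "test"],
            experimental_privileged_nesting=True,
        )
        .stdout()
    )
```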
And will you use Dagger to implement MCP servers? 01:10:22.420 |
Because you have all these modules, which maybe you could imagine having multiple MCPs as a 01:10:29.420 |
So one way to think about it is we were kind of doing this thing with Dagger modules before MCP came 01:10:37.420 |
And then obviously we're like, oh, this is super aligned with the way we think things should 01:10:44.420 |
And so you can today even take a Dagger module and you can say Dagger dash M, the name of 01:10:53.420 |
And so you can expose a Dagger module as an MCP server, for example. 01:10:58.420 |
And yeah, and we've got some more things that we'll be probably sharing soon about that kind 01:11:04.420 |
But yes, we think the vision is compatible in that way. 01:11:07.420 |
And yeah, you can use the MCP ecosystem as well. 01:11:12.420 |
Yeah, so there's a few different layers here, right? 01:11:14.420 |
There's within our agent that we just built, we installed modules and that uses basically 01:11:20.420 |
our internal implementation of MCP to talk between modules within Dagger. 01:11:24.420 |
But you can also take a Dagger module, expose it as an MCP server, and then in, I don't know, 01:11:31.420 |
the near future, next week or something, you can connect to external MCP servers to bring 01:11:38.420 |
I mean, I wanted to be clear, like the internal, the internal implementation, it's before MCP. 01:11:44.420 |
So it's not MCP per se, but it's very much logically, you can think of it in a similar way. 01:11:51.420 |
And because you can expose everything as MCP servers, it ends up being practically, you know, 01:12:03.420 |
So we got our PR open, says, make the main page say that, closes that issue we created. 01:12:08.420 |
We have that commit pushed up and we see the user is this GitHub Actions bot. 01:12:12.420 |
And we have on the welcome.view, it changed from documentation to, so maybe that's right. 01:12:20.420 |
Oh, it deleted this other thing too, because it decided that's not needed. 01:12:28.420 |
It's just vibing back there, you know, the agent's just like, yeah. 01:12:31.420 |
But yeah, the main thing is we were able to get it to run in GitHub. 01:12:34.420 |
So I was able to request that feature, and it ran hands-free. 01:12:41.420 |
So right now, we only built in the one thing where it says we create an issue that's a feature 01:12:47.420 |
But if we look at, I think, on this examples list, we have this one, this Greetings API, 01:12:59.420 |
There's, like, five different agents in here. 01:13:01.420 |
And one of them is I want to give feedback on a PR. 01:13:16.420 |
So this one, the original one is, like, make a new endpoint for my API. 01:13:25.420 |
And then it picks up again and pushes some new changes. 01:13:28.420 |
And then I have another agent where I say slash review. 01:13:30.420 |
And that will create a review for my PR with any other changes that I need. 01:13:34.420 |
And then I can say, okay, make those changes. 01:13:36.420 |
And then also, please don't delete all the tests. 01:13:40.420 |
And that could be, like, you don't have to be inserting yourself at every one of those points, 01:13:47.420 |
Because it's great for when we're showing people... 01:13:49.420 |
If you want an example of how you can take what that workshop just built to the next level, 01:13:53.420 |
where you have all this feedback and more advanced things, this is a great repo to look at, this 01:13:58.420 |
Because it has all of these different agents doing tons of things. 01:14:04.420 |
If I, as a human, push up a broken thing, because we still have humans developing stuff sometimes, 01:14:11.420 |
So I pushed a broken thing and the test failed, which is super annoying because I, you know, 01:14:17.420 |
I skipped running tests because I didn't have a good prompt that told me to run tests three times. 01:14:21.420 |
This agent can actually look at the test failure automatically and propose a fix for that, 01:14:27.420 |
that I can just click on it and fix that test change, right? 01:14:31.420 |
So this is all stuff in this demo repo where you can see, like, how to build all these things yourself. 01:14:41.420 |
I just had a question almost getting at the motivation for some of this stuff. 01:14:45.420 |
It feels like there's a world where Dagger could have really prioritized just, like, the containers, 01:14:48.420 |
the workflows and let you just bring your own AI agent. 01:14:51.420 |
Like, what's the motivation behind making it its own primitive and going down that path? 01:14:55.420 |
I think there's a lot of levels to it, right? 01:14:57.420 |
Like, if you're already really baked into, like, Pydantic or OpenAI Agents SDK, 01:15:02.420 |
you can still use those container workflows in that. 01:15:09.420 |
If you've done the OpenAI Agents quick start, if it loads here... 01:15:15.420 |
Or, sorry, this is the agent quick start we have, but with the agent SDK, where I've used 01:15:22.420 |
the OpenAI Agent SDK that says, like, here's my completions model. 01:15:31.420 |
But in that SDK, I'm actually still using Dagger. 01:15:34.420 |
So I actually recreated that same workspace where you have read file, write file, and build. 01:15:40.420 |
But I've created that with Dagger inside of the OpenAI Agents SDK. 01:15:44.420 |
So I'm using their agents, but using Dagger code for the containers. 01:15:49.420 |
The main thing is, like, here I had to write all of these tools and how to use them. 01:15:59.420 |
If it's all within Dagger, you get that cool thing where we have that whole Daggerverse. 01:16:03.420 |
I can just plug one in, and that's just given to the agent, right? 01:16:06.420 |
Yeah, your whole method signature is instantly translated into the right form with tools, right? 01:16:12.420 |
You get tools for free, as well as functions. 01:16:15.420 |
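That auto-translation is easy to illustrate outside of Dagger. Here's a generic sketch (not Dagger's actual code, and `write_file` is a hypothetical workspace method like the one built in the workshop) of how a typed function signature can be turned into the JSON-schema-style tool description an LLM expects, which is the kind of thing Dagger does for module functions automatically:

```python
import inspect
import typing

def tool_schema(fn):
    """Derive a minimal JSON-schema-style tool description from a typed
    Python function -- a sketch of the idea, not Dagger's implementation."""
    hints = typing.get_type_hints(fn)
    py_to_json = {str: "string", int: "integer", bool: "boolean", float: "number"}
    props, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        # Map each parameter's type hint to a JSON schema type
        props[name] = {"type": py_to_json.get(hints.get(name, str), "string")}
        # Parameters without defaults are required
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {"type": "object", "properties": props, "required": required},
    }

# Hypothetical workspace method, like the workshop's read/write/build tools:
def write_file(path: str, contents: str) -> str:
    """Write contents to a file in the workspace."""
    return path

schema = tool_schema(write_file)
print(schema["name"])                    # write_file
print(schema["parameters"]["required"])  # ['path', 'contents']
```

With a framework like Dagger, this schema generation happens for every module function, so the agent sees each one as a ready-made tool without any hand-written glue.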
And, yeah, and we do have some people in our community that are using Dagger. 01:16:19.420 |
They're, like, with Pydantic and other things where they're just, like, they want this sandbox 01:16:23.420 |
capability because they're, like, oh, I don't want to, you know, they don't want to use another 01:16:29.420 |
I want to have it locally, but I don't want it on my computer in my file system either. 01:16:37.420 |
But I think, yeah, the sweet spot is kind of doing it all because it just harmonizes really well. 01:16:46.420 |
Let's say if I want to build an agent for programming HTML games. 01:16:52.420 |
So for that game building agent, I would need the testing, so the running and testing 01:16:59.420 |
So does Dagger have those sort of constructs, like, let's say if I want to spin up a browser 01:17:04.420 |
environment and then do some kind of automation in that for testing that game which the LLM might 01:17:12.420 |
I mean, I've done, I've done some headless browser stuff. 01:17:13.420 |
I've also done some browser stuff and then connect over VNC or different. 01:17:19.420 |
You can do a lot of, you know, a lot of stuff with Linux containers. 01:17:33.420 |
And thanks for compressing a lot of information. 01:17:37.420 |
So, is my understanding correct that you build CI/CD infra and all these things once and then let Dagger 01:17:44.420 |
do the asynchronous job with guardrails and, you know, all the things in place? 01:17:49.420 |
Like, is my understanding right that Dagger is sort of this asynchronous AI agent that does things 01:17:54.420 |
on its own but with guardrails, not just leaving a Claude Code or something in a trust-all mode 01:18:04.420 |
I think, yeah, so the question is, like, yeah, what is Dagger in a certain sense, too? 01:18:09.420 |
But Dagger gives you this platform to create these software engineering workflows that can 01:18:14.420 |
be used for shipping software, that can be used for developing software, you know, and the environments 01:18:19.420 |
And then you can use them as a platform engineer or as a developer, but then you can also hand 01:18:26.420 |
And so we think that's really powerful, the fact that you can use that same platform to do 01:18:30.420 |
all those things and to create those guardrails, like you say. 01:18:35.420 |
The one thing I want you to show, and you've got one minute, can you just show your terminal 01:18:38.420 |
and just let's get vibey for just one second. 01:18:41.420 |
So you're connected to an LLM right now, right? 01:18:44.420 |
So go ahead, just like, let's talk to this LLM. 01:18:47.420 |
So it turns out that we've been using the shell mode, which lets you, you know, kind of like 01:18:53.420 |
very declaratively say, like, I want container from Alpine with this file and give me a terminal 01:19:00.420 |
Now, what we've done is, we're chatting now directly with the 01:19:06.420 |
And this LLM can see all the Dagger objects you have. 01:19:10.420 |
So another way you can use Dagger is you can just say like, all right, I'm just going to 01:19:15.420 |
And I'm going to say, hey, LLM, you see that container? 01:19:20.420 |
So you can get that kind of, that kind of workflow going too. 01:19:25.420 |
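Roughly, the two styles of interactive session look like this (a sketch of Dagger Shell syntax, where function names follow the Dagger API in kebab-case; the file contents and prompt text are illustrative, and exact spellings may vary by version):

```shell
# Declarative shell mode: build an object up with pipelines
container | from alpine | with-new-file /tmp/hello.txt "hi" | terminal

# Prompt mode: chat with an LLM that can see your Dagger objects
llm | with-prompt "Give me a Python 3.11 container and run the tests in it"
```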
So he's actually saying like, hey, give me a Python container. 01:19:28.420 |
And so it's going to actually look and see what methods exist in the Dagger API. 01:19:34.420 |
And it's, oh, there's this container method in the API, which we were using earlier. 01:19:38.420 |
And then it's going to like, you know, decide, oh, I'm going to use container, maybe from container 01:19:45.420 |
So these are just, it's exploring the Dagger API right now. 01:19:49.420 |
And now it's going to like, it's actually pulling a Python 3.11 container, then it can do things 01:19:56.420 |
So, you know, it's actually using containers, like kind of like computer use or something 01:20:01.420 |
But so yeah, you can go further; we didn't even show that side of it. 01:20:06.420 |
Because, you know, we're trying to show the guardrails. 01:20:09.420 |
But you can also use it in this kind of a style too. 01:20:15.420 |
Typically, LLMs are good at small to medium tasks. 01:20:18.420 |
And that's what we have seen, like a small to medium task here. 01:20:21.420 |
How good is Dagger at orchestrating like a large task, which needs design or some user 01:20:28.420 |
input or, you know, multi turn prompt, like, you know, not a small medium task, but a large 01:20:35.420 |
Yeah, the question is like, size of tasks that Dagger is good for. 01:20:38.420 |
I think if you make it, if you decompose things down and you can architect things right, it can 01:20:46.420 |
So we're going to like, we're going to end here. 01:20:48.420 |
We'll take some more questions outside the room. 01:20:50.420 |
Thank you so much for everybody that attended.