
Ship Agents that Ship: A Hands-On Workshop - Kyle Penfound, Jeremy Adams, Dagger



00:00:00.000 | Okay, we're going to kick this off. We're trying to sort out some internet options here,
00:00:19.640 | but in the meantime, I'll give us our little intro. So first of all, I'm Kyle, this is Jeremy,
00:00:26.840 | we're from Dagger, and you'll see more about what Dagger is through this workshop where
00:00:31.780 | we're going to build a cool SWE agent, and we're actually going to deploy it to GitHub,
00:00:38.060 | and so even like worst case scenario, if we can't get things running locally when we actually
00:00:42.580 | push things to GitHub and see agents running in GitHub, then that's going to be out of our
00:00:47.380 | internet hands, and it's all going to be really cool. So first of all, on the left side here,
00:00:53.680 | we have kind of where we're getting started from. So this is the documentation site where
00:01:01.700 | we have install instructions, so I'll walk through those real quick. And then our quick
00:01:08.020 | starts where we're actually going to walk through these as like the content of this workshop.
00:01:13.460 | And then also a shout out to tomorrow night, we have a hack night at the Cloudflare office. It's on the external events list for this conference as well. But here's a QR code for it.
00:01:31.400 | So real quick. So there's a question about whether there's a Slack, I think? Slack for the workshop? Yes, absolutely. So if you go to the Slack, it says dagger dash workshop, ship agents that ship. Got it? Okay.
00:01:50.420 | Let me pull that up as well. So if there's questions, put them in there or raise your hand and Jeremy will get to you. Climb over people.
00:01:58.440 | I will. I will. I will. Yeah. Make it happen. Yes. I'll do my best. Awesome. So yeah, if everyone, if you're following along, awesome. If you can't like because you don't have desk room or can't get the internet or whatever, I'm going to walk through it live because I already have everything on my machine.
00:02:17.440 | And then you can always, you know, check back with this later on once you have a solid connection. So if you're not able to get out your computer or follow along, just watch me and I'll go through it and it's gonna be really neat.
00:02:30.440 | But if you are following along, here's the installation page on docs.dagger.io. So you can install from the homebrew tab or straight from our install scripts or with Winget.
00:02:46.440 | You can install the Dagger CLI. The only other dependency is that you need a container runtime such as Docker or Podman or nerdctl. So like anything that can run containers because Dagger itself runs its engine as a container.
00:03:02.440 | And I'll explain what that means in a second. But if you're following along, get started on this while I talk through a bunch of stuff about what we're actually doing and what all these technologies are trying to accomplish.
00:03:13.440 | So I think I'll pause you really quick. Yeah, just for the folks on the tech team. And so for some of you in the room, we're finding the Wi-Fi may or may not work for you.
00:03:24.440 | Use a hotspot if you've got one. If that works down in the basement, then you're amazing. Also, I'm trying to do a little something through the wired connection here.
00:03:32.440 | But for the tech team, it's requiring a password for me to use this service. So anyway, if you have that, slip me a note at some point.
00:03:42.440 | But otherwise, yeah, we're working on getting more connectivity as we speak.
00:03:48.440 | Awesome. Yep, there we go. So the QR is actually for the hack night tomorrow night. The docs and what we're going through are docs.dagger.io.
00:04:00.440 | So that's like the main content for what we're going to walk through.
00:04:04.440 | And I guess real quick, we can intro as well if you want to intro yourself first, Jeremy.
00:04:10.440 | Yeah, sure. I'm Jeremy Adams. I look after the ecosystem. I'm part of the ecosystem team that Kyle and I are both on and been at Dagger for a few years.
00:04:23.440 | And so I've got to see already a progression of folks using us for all sorts of things and most recently a lot around AI agent kind of workflows.
00:04:35.440 | But I love in this workshop, we're going to blend together some of the classic use cases we've seen with Dagger around CI and dev workflows as well as, you know, giving those to agents.
00:04:47.440 | Awesome. Yeah, I'm Kyle. I'm on the same team. Yeah.
00:04:51.440 | And I have a background in like DevOps and platform engineering, so much more on the kind of cloud infra side of things versus that building side.
00:05:02.440 | So it's cool to come at this from that perspective of, you know, trying to deploy agents somewhere and make things work.
00:05:09.440 | And that's why in this workshop, we're going to deploy things to GitHub because that's eventually what you're going to want to do when you build an agent.
00:05:17.440 | And you have to put it somewhere to run it. It can't just live on your machine all the time, I guess, depending on what the agent is.
00:05:23.440 | Anyway, so if you made it this far, you've made it to the docs, then we're going to talk a bit about what Dagger is and why we're building agents with it.
00:05:33.440 | And so basically Dagger is, like I said, it's a container runtime.
00:05:39.440 | It's a workflow engine. And so people have historically done things like build their CI/CD with Dagger because you're building these pipelines that orchestrate containers, run all these tasks, and it runs the same on your machine as it runs in any cloud, like in, you know, in your Kubernetes, in GitHub, wherever your CI might run.
00:06:00.440 | But it runs the same everywhere. So you're making these workflows. And the cool thing is, that's also what agents are, right, is that they're just these processes where we have a bunch of tools we want to give to an agent.
00:06:16.440 | Anyway, okay, we're gonna see it in action. And so, let's see, we have components, right? So Dagger itself is made up of core components like containers, like I said, also repos, directories, files, and now LLMs are also a component that you have to work with, within this kind of toolbox of Dagger and how we're building things. And so it's just another building block, right? It's not a special framework, like, just built for
00:06:44.440 | making an agent, where you then have it living next to your software. It's another component within your toolbox. And so you're bringing LLMs into these existing workflows. And that's why it's a little bit different.
00:06:56.440 | Yeah, another way of thinking about Dagger in a nutshell is that Dagger is for software engineering workflows and environments. So you're going to see us building some environments, essentially some containerized environments with some functions, and all these things can become tools that, you know, human software engineers and AI agents use for both the development side and the app delivery side of things.
00:07:14.440 | So you'll see that, of course, these areas are all blending and kind of squishing together right now. We're seeing all this stuff happen in real time. So Dagger is kind of going to be one tool that you could use for that whole range.
00:07:36.440 | Yeah, so the question was, if Dagger is a tool to build containerized environments, what's the distinction between Dagger and Docker? So yeah, Docker has been around for a long time. And in fact, the founders of Docker are the founders of Dagger. And so we can think about it this way: the original scope of Docker was really about containerizing an application.
00:07:54.440 | And making that thing portable. So I can run on my laptop or Kyle's or up in Kubernetes or anywhere. So now what we're doing is we're taking a whole workflow and making that a portable thing.
00:08:15.440 | So yeah, there's definitely multiple containers and other types of objects, but everything's sandboxed by default. So yeah, we'll see as we get into it. Great question.
00:08:27.440 | Yeah, so yeah, we're writing code that is workflows itself. So that code can be Go, Python, TypeScript, Java, PHP; we have all these different languages you can write with. And the cool thing is that you're not kind of choosing your language for Dagger and then that's the world you live in.
00:08:44.440 | Dagger has this cross language interop. So if I write a cool Dagger module with oh, it's it's gonna want to load. That's why I have it. Oh, yeah, amazing. Yeah. Yeah. Error occurred. Wow. Okay.
00:08:59.440 | I don't know. Okay. So I broke it. If I write a cool module with Dagger that, say, does like a TypeScript build or something, right? I can share this on the Daggerverse.
00:09:13.440 | And maybe I wrote that module in TypeScript and you're writing your modules in Python. You can just install my module and you have these native bindings in your language to work with Dagger modules cross language. So anyway, that's my point: when you pick a language, you still get to benefit from the whole Dagger ecosystem.
00:09:33.440 | And we don't have images on these.
00:09:38.300 | That's OK.
00:09:38.900 | There's some sweet animations there of code happening.
00:09:41.900 | It's so cool.
00:09:42.720 | Yeah, like the coolest animation you could think of.
00:09:46.660 | Awesome.
00:09:47.160 | So I think we could probably skip forward here.
00:09:49.820 | And so hopefully we've installed or we're downloading--
00:09:54.080 | maybe we're still downloading Dagger.
00:09:56.800 | So I'll run through real quick the basics of Dagger.
00:10:00.680 | So hopefully that's big enough.
00:10:01.880 | Maybe I'll make it a bit bigger.
00:10:03.500 | Yeah, it's good.
00:10:06.440 | There we go.
00:10:06.940 | Yeah.
00:10:07.880 | Cool.
00:10:08.420 | So we've installed Dagger.
00:10:10.880 | We've got these things like container runtime somewhere.
00:10:15.520 | And so the first thing we can do is create containers.
00:10:17.580 | So if I'm in Dagger shell, which I think I am over here--
00:10:22.880 | That's definitely going to get bigger.
00:10:25.300 | Yeah.
00:10:29.260 | So I'm in Dagger shell and I can say container, I think.
00:10:33.820 | I don't know.
00:10:34.600 | Let's go over here.
00:10:39.880 | Fighting the internet.
00:10:50.880 | So there's a few different ways of using Dagger on the command
00:10:55.720 | line and including kind of a non-interactive, just fire off a
00:11:00.100 | Dagger command to run one of these workflows, a function that's
00:11:03.120 | one of these workflows.
00:11:04.280 | Or you can use it in this kind of interactive shell mode
00:11:07.020 | that he's showing here.
00:11:08.500 | Yeah, and it's all about building building blocks, right?
00:11:10.440 | So with the basics of Dagger, you have, like I mentioned earlier,
00:11:13.760 | things like containers, directories, LLMs.
00:11:16.600 | But with our code, we're actually going
00:11:18.160 | to be building larger blocks out of those blocks
00:11:21.160 | to assemble like an actual part of a workflow.
00:11:24.100 | And then I'll take those blocks, build bigger workflows out of those.
00:11:27.080 | So as we're using shell, we're always
00:11:29.200 | going to be interacting with some level of a workflow here.
00:11:32.160 | But like with container, I can say from Alpine,
00:11:36.380 | and now I've got an Alpine container.
00:11:38.380 | And I can get-- we can do things with that, like anything
00:11:44.080 | you might want to do with a container, right?
00:11:46.280 | So I could literally say, give me a terminal.
00:11:49.680 | And now I've got a terminal in a container.
00:11:51.880 | And this is the exact kind of tools that we're actually giving
00:11:54.180 | to our agent as we're building these pipelines, right?
00:11:56.560 | So it can-- you can give it a container,
00:11:59.520 | but you can build a specialized workspace for your agent
00:12:02.700 | to do things like write the code.
00:12:05.260 | So there's a lot of setup to say that we've
00:12:07.020 | got all these primitives that we can give to agents
00:12:11.620 | to build some really effective software engineering agents
00:12:14.760 | by giving them the exact tools they need to complete the job.
00:12:18.200 | But also, like we mentioned earlier,
00:12:21.640 | people use Dagger for CI/CD because you
00:12:23.760 | can create these workflows for running your tests
00:12:26.640 | for your application or whatever.
00:12:27.900 | And the cool thing is that if you've done that,
00:12:29.920 | you can take that same code that you wrote for running your tests
00:12:32.500 | and give that to your agent.
00:12:33.640 | So now your agent isn't just guessing at some code
00:12:36.140 | that it's generating, but it can actually
00:12:37.560 | run your actual tests the same way that your developers
00:12:40.120 | and your CI do to make sure that the code that it's generating
00:12:43.480 | is valid code and that it can iterate on these things
00:12:46.600 | within the agent.
00:12:48.100 | So this is all we're going to build right now.
00:12:50.120 | And we were just talking to somebody outside
00:12:52.000 | and we're going to build right now.
00:13:01.920 | Because of people like a product manager who's discovered
00:13:05.420 | vibe coding or a team that's using AI-powered IDEs or whatever,
00:13:12.940 | that people are cranking out these massive PRs for him to review,
00:13:17.780 | like 25,000 line PRs.
00:13:20.300 | And the PRs don't even stay static.
00:13:22.760 | So he was like, oh, I just got this PR and I have to review it.
00:13:26.360 | And then I come back and now there's five more commits on it.
00:13:30.500 | And so you've got this thrash happening.
00:13:33.440 | And so part of the reason why CI and AI bringing that together
00:13:38.800 | makes so much sense is we actually have
00:13:41.140 | to bring some balance back.
00:13:43.620 | We've got this fire hose.
00:13:44.900 | We can all now just create so much code.
00:13:48.140 | But how do we make sure this is actually code that we can test
00:13:51.480 | and that we can deploy with some kind
00:13:52.980 | of confidence at some point?
00:13:54.540 | So we need to balance out and make sure
00:13:56.340 | that there's software delivery workflows that are there
00:14:00.640 | to test and build and validate things
00:14:03.900 | before we put them out in production.
00:14:05.480 | So some of what we'll get into today.
00:14:07.740 | Awesome.
00:14:08.240 | So yeah, let's actually get into writing something.
00:14:10.400 | So I zoomed up a bit so you can see where I landed it
00:14:12.840 | in the docs here.
00:14:13.920 | On the left side, we have Quick Start.
00:14:15.840 | And I clicked on Build the CI Pipeline.
00:14:18.080 | And that's basically to get us to a point
00:14:20.880 | where we have a project that we're going
00:14:22.540 | to make an agent inside of that that can build new features
00:14:25.760 | for that project.
00:14:26.600 | So it's just going to be a real quick thing
00:14:28.260 | where we kind of set up this example project with functions
00:14:33.200 | that know how to build and test the project.
00:14:35.800 | So I'm on this page.
00:14:37.560 | And we've already talked through installing Dagger.
00:14:41.020 | And we talked a bit through the basics.
00:14:43.380 | So now we have this example application called Hello Dagger
00:14:46.220 | Template.
00:14:47.060 | And if you go to GitHub and say, use this template,
00:14:50.760 | you can name it whatever you want, like Hello Dagger Workshop
00:14:53.600 | or just Hello Dagger.
00:14:54.520 | It doesn't matter.
00:14:55.860 | You can create a repo in your GitHub from this template.
00:14:59.940 | And the important reason for that versus cloning it
00:15:02.740 | is that that's going to make it way easier when we actually push
00:15:05.020 | things to GitHub in a little bit to make it easier for you
00:15:08.520 | to run the GitHub actions that actually run the agent.
00:15:13.020 | So we're going to use that template.
00:15:14.680 | And I've done that over here in this repo where
00:15:19.620 | I have my Hello Dagger Py.
00:15:23.080 | Because I've done this in every language,
00:15:25.380 | you can use whatever language you want to use.
00:15:28.100 | I'll be walking through Python today because I think that's
00:15:30.760 | probably what a lot of people here today
00:15:33.100 | are most comfortable with.
00:15:35.580 | But if you're not, I can switch between languages.
00:15:37.700 | Just raise your hand and say, show me Go.
00:15:40.940 | And that's OK.
00:15:42.060 | So I've got--
00:15:44.640 | let's see.
00:15:45.760 | So I've got this application in my GitHub now.
00:15:48.320 | I have cloned it to my machine.
00:15:50.320 | So now I can look at the code.
00:15:53.960 | And it's like this Vue app that has a bunch of things in it.
00:15:59.320 | But the main thing is we want to be
00:16:01.460 | able to make an agent that develops it, right?
00:16:04.900 | Optionally, you can configure Dagger Cloud, which
00:16:06.760 | is-- let me just start loading that web page now.
00:16:10.000 | It's basically a visualization.
00:16:14.360 | So you can really easily see what your agent's doing, right?
00:16:17.000 | Because that's the hardest part of building agents a lot of the time,
00:16:20.000 | is understanding what are they tripping on?
00:16:22.740 | What's actually going on inside the agent?
00:16:24.660 | How is it interacting with its tools?
00:16:26.240 | What tools is it even seeing?
00:16:29.020 | So with this visualization, you're able to really easily see everything
00:16:34.160 | that your agent's doing.
00:16:35.480 | And that's helped me a lot, like, develop my prompts.
00:16:37.340 | Like, if I see-- the prompts and environments, right?
00:16:40.280 | If I see a lot of times that, OK, the agent fails because it tries
00:16:43.980 | to call this tool incorrectly, I can improve, like,
00:16:46.780 | the description of the tool.
00:16:48.280 | Or maybe I need to change how the tool works completely.
00:16:51.520 | And so being able to see how the agent's behaving is a huge part of that.
00:16:55.620 | Whether you're using cloud or any other thing to visualize your agents,
00:16:59.480 | that's, like, the most important part of making it reliable.
00:17:04.020 | OK, so we've cloned the project.
00:17:06.420 | We now want to create a Dagger module.
00:17:08.340 | So if you've installed Dagger, you'll run this command, dagger init,
00:17:12.660 | with whatever SDK you're using.
00:17:14.300 | So we have these tabs here.
00:17:16.220 | So I'm going to be using Python.
00:17:17.480 | And then the name of the module is going to be HelloDagger.
00:17:20.400 | And that's important because that is basically the name of our object
00:17:25.480 | that gets created.
00:17:26.360 | So if I open this up, and I've run dagger init,
00:17:29.780 | and now I can open my .dagger folder.
00:17:32.520 | I'm sorry, that's really small.
00:17:33.760 | I don't remember how to make that bigger in Zed, but we can--
00:17:38.580 | It's in the-- it's in the preferences.
00:17:41.520 | To zoom the sidebar.
00:17:42.560 | Command comma.
00:17:47.320 | You command comma, and it'll--
00:17:49.620 | Yeah.
00:17:49.920 | And you just put the-- there's a-- the top.
00:17:52.080 | There's a-- there's this.
00:17:55.760 | Well, I guess you don't have your set, but it's a font size.
00:17:59.120 | Font size.
00:18:01.140 | That one.
00:18:02.540 | UI font size.
00:18:03.820 | Changed it to like 25 or something.
00:18:05.820 | Watch it.
00:18:07.200 | There you go.
00:18:07.900 | Save that.
00:18:09.180 | Boom.
00:18:10.140 | So now hopefully we can see the sidebar a little bit better.
00:18:13.020 | So I'm in this .dagger directory.
00:18:15.640 | And apparently I've written Go for this one.
00:18:18.600 | It's just-- oh, because that's the wrong project.
00:18:21.060 | Cool.
00:18:22.200 | So let's go to the correct project.
00:18:23.960 | Let me open that up and close all these things.
00:18:29.340 | So I'm in the correct project.
00:18:31.380 | And in my .dagger, I've got this src/hello_dagger/main.py.
00:18:35.560 | And so that would have been generated when we said dagger init.
00:18:39.440 | It'll have basically these files, but some different content.
00:18:43.340 | So it's going to have the basic generated things to get you started building modules.
00:18:49.280 | But we're going to say, let's see, dagger functions.
00:18:54.580 | It'll show us what's available in this Dagger module that just got created.
00:18:57.660 | And so this is basically how you interact with Dagger.
00:18:59.380 | This is with the Dagger CLI.
00:19:01.000 | And you have this code that are just functions of how to interact with your application.
00:19:07.120 | So for example, let's build one.
00:19:09.300 | We have a container.
00:19:11.600 | If we go down to this function, and you see we're just building building blocks.
00:19:15.460 | We have a function that gives us a Dagger container that is from this base.
00:19:21.940 | And we put these files in it.
00:19:23.560 | And we run this command.
00:19:25.120 | And so in that container, when we want to do a build of our app,
00:19:28.720 | we can call that other function to get that container with our code in it,
00:19:34.780 | run another command, and then get a directory from that.
00:19:37.960 | And so this is like really basic Dagger stuff of how you create your dev tools using Dagger.
00:19:43.040 | This is like good to call out here.
00:19:45.580 | So originally we had this example from Kyle where he showed us running like a container.
00:19:51.520 | And then we said, give me a scratch container.
00:19:53.800 | Oh wait, give me an image from Alpine or from Node or from whatever.
00:19:58.720 | And then you can layer on more things like add a directory, a source code to that, run exec a test
00:20:04.780 | command, whatever, right, chaining these things together.
00:20:07.340 | So you notice I'm using this builder pattern here in code instead of in like a CLI.
00:20:13.140 | So it's all the same API under the hood.
00:20:15.620 | It's just that in this case he's using the Python SDK on top of that same API.
00:20:21.180 | But the same things are happening either way, same one unified cache where all that stuff is being,
00:20:27.240 | all those cached operations are at, and one API.
00:20:30.240 | So that's why it becomes really easy to use different languages, different language SDKs,
00:20:35.740 | because it's ultimately all one API under the hood.
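The chaining and caching idea described here can be sketched in plain Python. This is a toy model, not the Dagger SDK: the `Pipeline` class, the `_CACHE` dictionary, and the `EXECUTIONS` list are hypothetical names made up for illustration, and only the method names are chosen to resemble the builder style shown on screen. Each call returns a new immutable object, and an identical chain of steps is only evaluated once.

```python
from dataclasses import dataclass

# Toy shared cache: maps a full chain of steps to its (simulated) result.
_CACHE: dict[tuple[str, ...], str] = {}
# Records every chain that actually had to be evaluated.
EXECUTIONS: list[tuple[str, ...]] = []


@dataclass(frozen=True)
class Pipeline:
    """Immutable chain of steps; every method returns a new Pipeline."""

    steps: tuple[str, ...] = ()

    def from_(self, image: str) -> "Pipeline":
        return Pipeline(self.steps + (f"from {image}",))

    def with_directory(self, path: str, source: str) -> "Pipeline":
        return Pipeline(self.steps + (f"copy {source} -> {path}",))

    def with_exec(self, args: list[str]) -> "Pipeline":
        return Pipeline(self.steps + ("exec " + " ".join(args),))

    def run(self) -> str:
        """Evaluate the chain, reusing the cached result for identical chains."""
        if self.steps not in _CACHE:
            EXECUTIONS.append(self.steps)
            _CACHE[self.steps] = " | ".join(self.steps)
        return _CACHE[self.steps]


base = Pipeline().from_("node")
tested = base.with_directory("/src", "./app").with_exec(["npm", "test"])
print(tested.run())
print(tested.run())  # identical chain: served from the cache
print(len(EXECUTIONS))
```

Because `base` is never mutated, it can be reused as the starting point for other chains, which is roughly why the same steps can be shared between a CLI session and SDK code.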
00:20:39.240 | And so we got this code in this, the next step of this, where it says construct a pipeline.
00:20:43.300 | We've copied this code into that main file.
00:20:45.800 | And that has all of those functions like publish, build, test, and that build_env one we looked at,
00:20:52.800 | and build_env as in like your build environment.
00:20:56.080 | And so when we run dagger functions, we'll have those shown up here with their descriptions and everything from the code.
00:21:02.180 | So now we've, at this point, like we, we've got the project that we want to build the agent in.
00:21:07.140 | We've got some dagger functions that let us build and test the project.
00:21:11.240 | We've got the project itself.
00:21:12.320 | So now let's actually get the agent started.
00:21:15.240 | And so now I'll zoom out again so you can see because I jumped to the next page here,
00:21:19.620 | which is add an AI agent to an existing project.
00:21:23.420 | And so we're starting from exactly where we just left off there with that previous guide where we pasted in that code.
00:21:30.180 | We have our build, build_env, publish, test, and our dagger functions.
00:21:36.240 | Lots of useful functions, but the expectation was the human was probably running those, right?
00:21:42.740 | Or you were having them run in CI.
00:21:45.480 | You'd kind of set that up, but nothing really agentic yet.
00:21:48.820 | Right. So we have, you know, we're just running our unit tests or our build and creating a production container.
00:21:54.180 | And it's you as a developer, or your CI environment, running these functions.
00:22:00.240 | But now we want to create an agent for developers to interact with or, you know, to run anywhere.
00:22:06.300 | But also our agents should be able to use these functions as well.
00:22:10.360 | And so we're in this next guide.
00:22:12.360 | And we're going to now create a sub module.
00:22:15.360 | Because I mentioned, like, our agents want these refined environments where we give them access to exactly what tools they need to complete their tasks.
00:22:24.360 | And nothing more than that.
00:22:25.420 | No, no, wait, I thought you were going to give agents, like, every possible tool.
00:22:28.420 | You want to let them have, like, a thousand functions that do very powerful things and just let them run crazy.
00:22:35.420 | Is that not the best practice?
00:22:37.420 | Well, yeah, maybe not based on the smiles across the room.
00:22:41.420 | Okay.
00:22:43.420 | What happens if the tool needed changes at runtime?
00:22:47.480 | Oh, yeah.
00:22:48.480 | So the question is, what if the tool needed changes at runtime?
00:22:50.480 | How about dynamic kind of tools, right?
00:22:52.480 | So in a lot of cases, we're working with MCPs.
00:22:54.480 | We might have a lot of static tool kind of experience.
00:22:57.480 | You know, what if things change?
00:22:59.480 | What does happen, Kyle?
00:23:00.480 | Well, so the main thing is, like, you want the right amount of tools for that agent to solve its task, whatever that task is.
00:23:07.420 | Like, it needs the flexibility to be able to solve complex problems.
00:23:13.420 | So it's not just going straight down a workflow and saying, okay, I do this, and I do this, and I do this, because you don't really need an AI to do that.
00:23:20.420 | It needs the amount of tools to select, to choose its own path, to solve whatever task you throw at it.
00:23:26.420 | But you don't want so many tools that now this is a generalized agent that does anything, right?
00:23:31.420 | It needs to have some amount of focus so that it can solve a specific set of problems really well.
00:23:37.420 | But we will see, like, in the agent loop that's going to happen, we will see the ability for the LLM to see this, like, menu of tools it has, and for it to select the right tool at the right time, given the context.
00:23:50.420 | Yeah, but yeah, definitely, like, a big part of iterating and building these agents is determining, like, the scope of the tools.
00:23:58.420 | So, like, the kind of the balance between flexibility and reliability, where you want it to be able to solve a breadth of problems.
00:24:07.420 | So it needs a variety of tools that it might need. You don't know exactly what it's going to need ahead of time.
00:24:12.420 | But you don't want to give it so many that now it's getting lost and confused and fails half the time, right?
00:24:17.420 | And so that's what we're going to focus on here with this.
00:24:20.420 | We're going to create a sub-module, basically, that is kind of its playground.
00:24:25.420 | It's a specific set of tools that lets it edit our source code.
00:24:29.420 | And so if you've worked with maybe agent frameworks in the past that have, like, file system tools, we're actually going to build that in our own code right now.
00:24:39.420 | And it's just a few lines of code, so don't let me scare you with that.
00:24:42.420 | But that's the idea, is, like, we're creating these building blocks, and as you scale this up, you can consume these that other people have written.
00:24:50.420 | You don't have to write it all from scratch. But for the practice of building this as a workshop, we're going to write it all.
00:24:56.420 | And so--
00:24:57.420 | All right. What are we going to give the-- what are we putting in this workspace?
00:25:00.420 | What kind of functions?
00:25:01.420 | Yeah. So we do another dagger init here, and we say .dagger/workspace.
00:25:06.420 | So we've created in our file system another subdirectory here, workspace under .dagger.
00:25:12.420 | And so this is another dagger module.
00:25:15.420 | And this one's just going to have-- just the functions that we want the agent to have access to.
00:25:19.420 | So you can imagine it wants to read the files in your source tree.
00:25:23.420 | So we have a function, and again, a file is one of those core components of dagger.
00:25:28.420 | And so we just-- our workspace has a dagger directory, which is our source code.
00:25:33.420 | And so we give it a function to read a file from that.
00:25:36.420 | So it just gets the dot contents of that file.
00:25:40.420 | And that's just the Dagger API to say, this is a path to a file.
00:25:44.420 | I can do lots of things with the file.
00:25:45.420 | One of those is to look at the contents.
00:25:48.420 | Another function it needs is to be able to write files to the workspace, obviously.
00:25:52.420 | And so it's very similar API here where we say, okay, give me the path and also the contents
00:25:58.420 | to write to that file.
00:26:00.420 | And then it needs to be able to know what files are in the workspace.
00:26:03.420 | So it needs to be able to list the files.
00:26:05.420 | And it's just going to literally do a tree in that workspace.
00:26:08.420 | So it can quickly see the file structure of your code.
00:26:12.420 | And so now basically with those three, we have another one that we're going to look at in a second.
00:26:16.420 | But with those three, now it can do all the code editing you might ask it to do within your file system.
00:26:23.420 | And with more complex projects, you might need more advanced capabilities of these.
00:26:29.420 | Like you might need to be able to read specific lines from a file or scan files or insert lines into files.
00:26:36.420 | But with our kind of demo agent that we're building right now, it's like just the most basic
00:26:41.420 | where we can just read and write files and list the files.
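The three file tools described above can be modeled in a few lines of plain Python. This sketch is an assumption-laden stand-in, not the workshop's module: it keeps files in an in-memory dictionary instead of a Dagger directory, and the names `Workspace`, `read_file`, `write_file`, and `list_files` are illustrative.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Workspace:
    """In-memory stand-in for the agent's workspace (path -> file contents)."""

    files: dict[str, str] = field(default_factory=dict)

    def read_file(self, path: str) -> str:
        """Return the contents of the file at the given path."""
        return self.files[path]

    def write_file(self, path: str, contents: str) -> "Workspace":
        """Return a new workspace with the file created or replaced."""
        return Workspace({**self.files, path: contents})

    def list_files(self) -> str:
        """List every path in the workspace, one per line, sorted."""
        return "\n".join(sorted(self.files))


ws = Workspace({"src/App.vue": "<template>Hello</template>"})
ws2 = ws.write_file("README.md", "# Hello Dagger")
print(ws2.list_files())
print(ws2.read_file("README.md"))
```

Returning a new workspace from `write_file` rather than mutating in place is a design choice in this sketch: the original state stays available, so a failed attempt can simply be discarded.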
00:26:44.420 | So if the agent had access to this workspace object, it would see those functions as tools.
00:26:51.420 | read file, write file.
00:26:54.420 | Exactly.
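One way to picture "functions as tools" is to build the tool menu by introspection: every public method becomes a tool, described by its signature and docstring. The sketch below uses only the standard library; `tool_menu` and `call_tool` are hypothetical helpers, and how Dagger actually exposes functions to a model is not shown here.

```python
import inspect


class Workspace:
    """Hypothetical toolset; only public methods become tools."""

    def __init__(self) -> None:
        self._files: dict[str, str] = {}

    def read_file(self, path: str) -> str:
        """Read a file from the workspace."""
        return self._files[path]

    def write_file(self, path: str, contents: str) -> str:
        """Write a file to the workspace."""
        self._files[path] = contents
        return f"wrote {path}"

    def _reset(self) -> None:
        # Private helper: never offered to the model.
        self._files.clear()


def tool_menu(obj: object) -> dict[str, str]:
    """Map each public method name to 'signature: docstring'."""
    menu = {}
    for name, method in inspect.getmembers(obj, inspect.ismethod):
        if not name.startswith("_"):
            menu[name] = f"{inspect.signature(method)}: {inspect.getdoc(method)}"
    return menu


def call_tool(obj: object, name: str, **kwargs: str) -> str:
    """Dispatch a tool call, rejecting anything that is not on the menu."""
    if name not in tool_menu(obj):
        raise ValueError(f"unknown tool: {name}")
    return getattr(obj, name)(**kwargs)


ws = Workspace()
for name, description in tool_menu(ws).items():
    print(name, description)
print(call_tool(ws, "write_file", path="a.txt", contents="hi"))
```

The docstrings double as tool descriptions in this model, which matches the earlier point that improving a tool's description is one of the levers for making an agent more reliable.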
00:26:55.420 | It will in a minute, yeah.
00:26:56.420 | Not yet.
00:26:57.420 | Not yet.
00:26:58.420 | We haven't plugged the brain into the robot body yet.
00:27:00.420 | Right.
00:27:01.420 | So right now we're building, if you think about the agent as a robot body with a brain plugged into it,
00:27:05.420 | we're building the robot body.
00:27:06.420 | And the brain is going to come in just a second here.
00:27:09.420 | Which could be any LLM, kind of a brain in a jar, right?
00:27:12.420 | Okay.
00:27:13.420 | So our last one, finally, that I mentioned earlier, is test.
00:27:17.420 | So when it generates this code in its workspace, it needs to be able to test to make sure that the code it generated is correct.
00:27:25.420 | And if it isn't, it'll get the test failures and iterate until it's producing good code, right?
00:27:31.420 | And so this is kind of the most important part of building this good agent is some sort of validation tool,
00:27:36.420 | whether that's like a test or a lint or just something to check that what it's generated is correct.
00:27:43.420 | Or maybe it's all of these things, right?
00:27:45.420 | There could be different levels of complexity.
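The generate, validate, retry idea can be sketched in a few lines of plain Python. Everything here is hypothetical: `iterate_until_valid`, `fake_model`, and `fake_test` are illustrative names, and the "model" is a toy function rather than an LLM call.

```python
from typing import Callable, Optional


def iterate_until_valid(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> str:
    """Call `generate` until `validate` reports no failure.

    `generate` receives the previous failure message (None on the first try).
    `validate` returns None on success, or a failure message otherwise.
    """
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        candidate = generate(feedback)
        feedback = validate(candidate)
        if feedback is None:
            return candidate
    raise RuntimeError(f"still failing after {max_attempts} attempts: {feedback}")


def fake_model(feedback: Optional[str]) -> str:
    """Toy stand-in for the LLM: it only gets it right after seeing a failure."""
    return "return a + b" if feedback else "return a - b"


def fake_test(code: str) -> Optional[str]:
    """Toy stand-in for the test tool: passes only if the code adds."""
    return None if "+" in code else "expected add(1, 2) == 3"


print(iterate_until_valid(fake_model, fake_test))
```

The attempt cap is an assumption of the sketch; without some bound, a model that never satisfies the validator would loop forever.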
00:27:47.420 | But anyway, here, now we've got this workspace.
00:27:50.420 | So if I go in my workspace, I have this exact code over here.
00:27:56.420 | And if I run, I think I have the function down here.
00:28:00.420 | If I say dagger -m, so now -m points to a specific Dagger module.
00:28:06.420 | And I say functions, remember before we ran dagger functions.
00:28:09.420 | If I run dagger -m .dagger/workspace functions, I'll see exactly those functions that we just created.
00:28:19.420 | Okay.
00:28:20.420 | So the next step is we want our main dagger module to have that as a set of tools I can use.
00:28:26.420 | And so we're going to say dagger install that workspace module.
00:28:31.420 | So now it's installed as a dependency of my main module.
00:28:34.420 | So it has this object available and we'll see why that's really cool in a second.
00:28:40.420 | But basically all your dependencies of dagger, like I mentioned, like we have, you know, I can look at this real quick.
00:28:47.420 | We have a big community of people building things with dagger.
00:28:52.420 | And with that we have the Daggerverse, which is this massive index of like thousands of Dagger modules that do different specialized things.
00:28:59.420 | But whenever you install one of these into a Dagger module, if you look at my dagger.json in this project, we have this list of dependencies.
00:29:11.420 | And so your Dagger module has basically its own Dagger client, which is the core Dagger API in addition to all of your dependencies.
00:29:21.420 | And so when you're writing code, like I mentioned earlier, native in this language, you'll see all of these things available on the main Dagger client.
00:29:31.420 | So you can do all these complex tasks.
00:29:33.420 | So basically we've built two modules already.
00:29:36.420 | We've built this workspace module and the main module where we're doing our tests and builds.
00:29:42.420 | And so we want to create the agent now that can take that workspace and our tests and we can actually ask for new features or modifications or whatever.
00:29:52.420 | So that's the next step in this guide we're looking at where we want to create an agentic function.
00:29:58.420 | Could we have mixed and matched, like, could we have written that workspace in TypeScript or in Go and still installed it into our Python module?
00:30:09.420 | Yeah, exactly.
00:30:10.420 | So like the other modules, any individual module can be written in any language and you can mix and match however you want.
00:30:17.420 | I knew the answer to the question.
00:30:18.420 | I was just checking.
00:30:19.420 | But yeah, we see people do this a lot where they have different teams where like, you know, maybe there's a front end platform team and then a back end platform team.
00:30:28.420 | And maybe these folks are TypeScript, these folks are Go, but they can interop and use their stuff.
00:30:32.420 | So, yeah.
00:30:33.420 | Yeah.
00:30:34.420 | So like everything, every task or workflow or whatever that you do with dagger is a function in your code.
00:30:40.420 | And so an agent is no different, right?
00:30:42.420 | It's just going to be another function.
00:30:43.420 | And we're going to call this one develop because we're going to ask it, we're going to give it an assignment to complete our project.
00:30:49.420 | And it's going to complete that assignment.
00:30:53.420 | So the develop function is our agent.
00:30:56.420 | And so this is going to give us the code to copy and I'll just open it in the editor so that it looks a bit nicer.
00:31:03.420 | Don't worry, it's only like 500 lines.
00:31:05.420 | It's totally fine.
00:31:06.420 | And you know what, it's really short.
00:31:09.420 | Oh wait, it's not 500 lines?
00:31:11.420 | Yeah, this is it right here.
00:31:13.420 | So we have a few lines.
00:31:15.420 | We have it all like spaced out nicely.
00:31:16.420 | Yeah.
00:31:17.420 | So we have a new function called develop and it takes an assignment.
00:31:22.420 | And this annotated thing is just a Python way of getting us these doc strings for the parameters.
00:31:30.420 | But in different languages, like we can see back here, if we look at Go, your arguments just look like this.
00:31:36.420 | Where this little comment is basically the help string when you're using the Dagger CLI and say dagger functions.
00:31:42.420 | It'll say the assignment parameter is assignment to complete, which is really cool.
00:31:48.420 | And we see our source here, which is like our project source.
00:31:51.420 | But of course, we don't want to have to pass that as a parameter when we're calling our agent.
00:31:55.420 | So there's this cool thing with Dagger where you say default path is slash and that's going to be the root of our Git repo.
00:32:01.420 | So if we don't pass in explicitly a source parameter, it's just going to pass in our Git repo as that parameter.
00:32:08.420 | And so now we just have to say develop, build me a cool new feature and it's going to kick off our agent.
00:32:14.420 | So let's look at the components of the agent real quick.
00:32:17.420 | So the environment is like the main thing, right?
00:32:20.420 | And I've used that word a lot today.
00:32:22.420 | And hopefully a lot of people are using the same word in the same way.
00:32:26.420 | But you have your robot body and the brain, like Jeremy said, where your environment is basically not just the tools that it's using to complete the task, but also your inputs and outputs for the agent.
00:32:42.420 | Any objects or state that it's working with, all of this is the environment.
00:32:47.420 | And so we want to construct this environment and then plug in the LLM, which is our brain and say, here's your environment, here's your task slash prompt and complete the task.
00:32:59.420 | And so this is the environment we put together where the assignment is a string input.
00:33:07.420 | So we have this cool kind of way of declaratively building your prompt, right, where our assignment is the assignment to complete, but this workspace input is a workspace with tools to edit and test code.
00:33:19.420 | So now our agent, when we connect these things, will see this as the description of this thing it can use. We're building up this prompt by annotating our code, basically.
00:33:32.420 | And so with this workspace input thing, that's referring to the sub module we just created.
00:33:37.420 | So if--
00:33:38.420 | The workspace.
00:33:39.420 | Yeah.
00:33:40.420 | Exactly.
00:33:41.420 | So if we called that something else like Foo workspace, and we installed that, this would be with Foo workspace input, right?
00:33:47.420 | We're dynamically generating all of these functions for the environment type to say, any objects of my dependencies I can have as an input or an output of my environment.
00:34:01.420 | And so notice that we also have a workspace output, which is the completed task, because all objects in Dagger are immutable.
00:34:11.420 | And so I give it an object, it's going to do a bunch of things and give me back a different object that's the completed task.
00:34:19.420 | And maybe that's like a boring detail, but the main thing is the thing I passed in is still going to be the same, and I'm going to get back a new version called completed.
00:34:30.420 | I mean, I think a lot of people are dealing with this kind of stuff now, right?
00:34:33.420 | With the different APIs and like doing a bunch of JSON parsing and validation, right?
00:34:38.420 | And trying to-- you know, there's different frameworks doing it in different ways.
00:34:41.420 | But you could just think of it as this is our way of saying like, here are the typed inputs.
00:34:45.420 | These are typed inputs.
00:34:47.420 | We're expecting a typed output back in the end.
00:34:51.420 | And this gives us a way to ensure that we're getting what we actually asked for.
00:34:56.420 | That's right.
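One way to picture the "typed output" check is a small guard that refuses to hand back an output unless it has the declared type. This is a plain-Python sketch under stated assumptions: `get_output` and the `outputs` dictionary are hypothetical and are not part of Dagger's API.

```python
from typing import Any, Dict, Type


def get_output(outputs: Dict[str, Any], name: str, expected: Type) -> Any:
    """Fetch a declared output and fail loudly if it is missing or mistyped."""
    if name not in outputs:
        raise KeyError(f"output {name!r} was never set")
    value = outputs[name]
    if not isinstance(value, expected):
        raise TypeError(
            f"output {name!r} should be {expected.__name__}, "
            f"got {type(value).__name__}"
        )
    return value


# A dict of path -> contents stands in for a workspace object here.
outputs = {"completed": {"src/App.vue": "<template>hello</template>"}}
completed = get_output(outputs, "completed", dict)
print(sorted(completed))
```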
00:34:57.420 | And now, next we need our prompt.
00:35:00.420 | So we have the environment and the prompt and we give both of those to the agent basically.
00:35:05.420 | So the prompt, I believe, is just a bit lower here if you're following along here.
00:35:11.420 | So it wants you to create a .dagger/develop_prompt.md.
00:35:16.420 | And it looks like this.
00:35:17.420 | So I'll just open it again over here on my editor.
00:35:20.420 | So this is our prompt.
00:35:23.420 | And so we're saying you're a developer on this project.
00:35:26.420 | You're going to get an assignment and the tools to complete it.
00:35:30.420 | Your assignment is $assignment.
00:35:32.420 | And so this is basically-- it's going to be templated in by the assignment in our environment.
00:35:40.420 | So it's going to drop that right in that prompt so that the agent itself doesn't have to go read this other variable in its environment and knows, OK, my assignment is make this cool new feature.
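The `$assignment` substitution described here behaves much like Python's standard `string.Template`. The sketch below only imitates the idea: the prompt wording paraphrases what is read out in the talk, and `render_prompt` is a hypothetical helper, not how Dagger performs the templating.

```python
from string import Template

# Paraphrase of the opening of the workshop prompt; wording is an assumption.
PROMPT = Template(
    "You are a developer on this project.\n"
    "You will be given an assignment and the tools to complete it.\n"
    "Your assignment is: $assignment\n"
)


def render_prompt(assignment: str) -> str:
    """Drop the assignment text into the prompt in place of $assignment."""
    return PROMPT.substitute(assignment=assignment)


print(render_prompt("make the main page blue"))
```

`substitute` raises `KeyError` if a placeholder has no value, which is a reasonable default when a half-filled prompt would be worse than an error.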
00:35:51.420 | And then we have a bit of prompt structure here, right, where if you've built a lot of these agents, you've probably kind of refined how you build your prompts and what those structures look like.
00:36:01.420 | This is a really simple agent, so it doesn't have a ton of structure.
00:36:04.420 | But we do say before you write code, make sure you analyze the workspace to understand the project structure.
00:36:11.420 | So it's not just going to create some garbage and be like, cool, I made this new file, but I didn't look at the project first.
00:36:17.420 | Don't make unnecessary changes, because sometimes you'll see, especially certain models without the right constraints, they'll go make the change you asked for and then change four other things and be like, cool, it looks good, ship it.
00:36:30.420 | And always run the test.
00:36:32.420 | So we do have to ask it to run the test once it's made those changes.
00:36:37.420 | So it's not just going to see the test function and be like, oh, I should probably call that.
00:36:41.420 | We want to make sure to tell the LLM, like, OK, you have a tool that can validate the code you're writing.
00:36:46.420 | Make sure you use that tool.
00:36:48.420 | And then don't stop until you've completed the assignment and the tests pass.
00:36:52.420 | So this is telling it, you know, keep working until you've satisfied what I asked it to do and the tests pass.
00:36:59.420 | That's the reinforcement.
00:37:00.420 | You can't like totally run the test twice.
00:37:01.420 | Oh, yeah.
00:37:02.420 | You better.
00:37:03.420 | And this comes from experience, right?
00:37:04.420 | Maybe a third time will help too, I'll say.
00:37:06.420 | It doesn't hurt at all.
00:37:07.420 | Because, yeah, and maybe in all caps.
00:37:09.420 | Because it's like, what we find, we end up running evals on these things, right?
00:37:13.420 | Where we'll try different LLMs plugged in and then we'll iterate on the prompts until we're getting the results, the consistency we want across the different ones.
00:37:24.420 | And, yeah, it comes from experience of knowing how they veer off track and et cetera, how we're writing these.
00:37:31.420 | Yeah, and that's like what I mentioned earlier, like using something like Dagger Cloud to be able to see the visualization of all the work the agent's doing.
00:37:40.420 | If I'm frequently seeing like, okay, the agent's just calling write file and then returning, I know that, okay, I have to tell it to look at the code, I have to tell it to test the code.
00:37:49.420 | And that's going to be different for every model.
00:37:51.420 | And especially like the prompt structure is different for different models.
00:37:54.420 | Is it possible to implement things like reflection agents as well to police each other?
00:38:00.420 | Yeah, yeah.
00:38:01.420 | So the question is like, can you implement like reflection agents to police each other?
00:38:04.420 | And that's something, I probably have an example of that I can show at the end if we have time.
00:38:08.420 | But yeah, like, remember, with this, each agent is just a Dagger function.
00:38:14.420 | And so you can create all these agents layered on other agents.
00:38:18.420 | And even in your environment, you can actually put an agent in the environment and say, hey, you have this agent at your disposal if you need it to do something, right?
00:38:29.420 | And I have examples of that too.
00:38:30.420 | But it's like similar to the concept of like Google's A2A where you say, if you're not familiar with that, it's basically this structure where you tell an agent, listen, you can do these things.
00:38:43.420 | But you also can talk to these other agents, and that's what each of these other agents do.
00:38:47.420 | And so if you need to, you can reach out to them and say, hey, other agent, I need you to tell me how to write TypeScript.
00:38:54.420 | And that comes back, right?
00:38:55.420 | So you can put agents in environments.
00:38:57.420 | It's all just piecing functions together, right?
00:39:00.420 | It's just the same code we've always been writing, but now there's an LLM component.
00:39:04.420 | Cool.
00:39:05.420 | So now this line right here, line 94, most important line of the workshop because this is the agent where we've actually taken our Dagger client and .llm.
00:39:14.420 | So this is another type within the Dagger client.
00:39:17.420 | You should make it bigger just for a second, you know?
00:39:19.420 | Sure, yeah.
00:39:20.420 | So it's off the screen, since it's so important.
00:39:22.420 | I feel like it's not even getting that much bigger.
00:39:24.420 | It's just--
00:39:25.420 | So huge.
00:39:26.420 | Yeah, there we go.
00:39:27.420 | Yeah.
00:39:28.420 | Cool.
00:39:29.420 | So like, we've said, all right, from the Dagger client, we need this LLM type.
00:39:34.420 | We give it an environment.
00:39:36.420 | We give it a prompt.
00:39:37.420 | And that's the agent.
00:39:38.420 | So now we've got this thing, work, that is a Dagger LLM with these things.
00:39:43.420 | See, people want pictures of it.
00:39:44.420 | You've got to center it.
00:39:45.420 | Yeah, make it look good.
00:39:46.420 | Here you go.
00:39:47.420 | Boom.
00:39:48.420 | If you need your pictures, you can get one with Kyle and commemorative.
00:39:53.420 | We've got like frames outside.
00:39:55.420 | You can slide it in.
00:39:57.420 | Cool.
00:39:58.420 | I'll autograph it.
00:39:59.420 | So that's the agent.
00:40:00.420 | Like, that's literally, because we've asked it, like, we've said in this prompt.
00:40:04.420 | We didn't really ask.
00:40:05.420 | We told it.
00:40:06.420 | We told it in the prompt.
00:40:07.420 | This is your task.
00:40:09.420 | This is how you work.
00:40:10.420 | Don't stop until it's done.
00:40:12.420 | And so now this work variable in our code is the completed work.
00:40:18.420 | And so from that work, we can look back at the environment in that and say,
00:40:22.420 | I have this output called completed.
00:40:24.420 | Remember, in our environment, we defined a workspace output called completed.
00:40:28.420 | And this thing should be a workspace.
00:40:31.420 | If it's not, somebody screwed up.
00:40:33.420 | That happens sometimes.
00:40:35.420 | A good final check.
00:40:36.420 | Exactly.
00:40:37.420 | A good type check.
00:40:38.420 | Yeah.
00:40:39.420 | And so from that workspace, we want to grab the completed directory,
00:40:41.420 | which is the source.
00:40:43.420 | So if you remember in our workspace object here, it has an attribute called source, which
00:40:50.420 | is the directory.
00:40:51.420 | And so this is all like a few layers of complexity.
00:40:55.420 | But we've said in that workspace, we have a source thing.
00:40:57.420 | That's a directory.
00:40:58.420 | And ignore the node_modules folder because maybe that's going to break on my machine.
00:41:02.420 | Yeah.
00:41:03.420 | And then now that we've got that, just to make triple sure.
00:41:07.420 | Because remember, I mean, we did tell it three times to run tests.
00:41:10.420 | You told it to test.
00:41:11.420 | But now we get this back.
00:41:12.420 | And in our code, we're saying, all right, now run the test.
00:41:15.420 | Because this is all the same code that we're using throughout our project to run tests.
00:41:19.420 | So we can say, okay, completed.
00:41:21.420 | Now manually run the tests.
00:41:22.420 | And if that fails, you could maybe kick it back to the LLM and say, hey, this failed.
00:41:27.420 | Try harder.
00:41:28.420 | That's pretty huge, right?
00:41:29.420 | So that's like trying to put the agents on rails or give them guard rails, whichever metaphor
00:41:34.420 | you like better.
00:41:35.420 | But it's like, you know, that's pretty key because we're trying to like let them do the
00:41:40.420 | creative stuff they do, the generative stuff they do, like write some code for us.
00:41:43.420 | But we need to enforce certain standards, right?
00:41:46.420 | Yeah.
00:41:47.420 | It could be compliance things.
00:41:48.420 | It could be, like you said, linting.
00:41:49.420 | So we don't just dump that garbage back to your machine.
00:41:53.420 | Yeah.
00:41:54.420 | And remember, all these changes that it was making as it's iterating on these things, that
00:41:58.420 | was all done in a container.
00:41:59.420 | It's not just changing your file system as it's doing its work.
00:42:02.420 | And that's a key thing, too, because now maybe you have 10 of these agents running.
00:42:07.420 | They all have their own sandboxed workspace where they're editing these files.
00:42:10.420 | They're not messing up your local state.
00:42:13.420 | And before we do mess up our local state, we triple check that the tests pass.
00:42:18.420 | And then we say, okay, return that completed directory.
00:42:20.420 | And so now this function, and we'll just triple check here on the guide side.
00:42:25.420 | They didn't miss anything.
00:42:27.420 | We say Dagger functions, and we have this develop one that shows here.
00:42:33.420 | So now if I go into Dagger shell, which is hopefully what it asks us to do, it is.
00:42:37.420 | I say hopefully.
00:42:38.420 | I wrote this, so we're just checking myself here.
00:42:42.420 | And I can go in and say Dagger.
00:42:45.420 | Now before I do that, one thing I don't think I called out at the very start here was that we had to configure an LLM provider.
00:42:53.420 | So with Dagger, you bring your own model.
00:42:55.420 | You can use OpenAI, Gemini, Anthropic.
00:42:58.420 | Local models.
00:43:00.420 | Ollama, Docker Model Runner, like literally any.
00:43:02.420 | Bedrock.
00:43:03.420 | Anything you can hook up to.
00:43:04.420 | Bedrock.
00:43:05.420 | Yeah.
00:43:06.420 | So you do have to configure some environment variables for Dagger to be able to make API calls to that.
00:43:13.420 | Right?
00:43:14.420 | Because we're just the agent with the tools.
00:43:17.420 | The model is living somewhere else.
00:43:19.420 | And so this is this configuration page.
00:43:23.420 | Configuration slash LLM shows all the different options on how to configure things.
00:43:28.420 | One really cool thing to call out.
00:43:30.420 | I'm just going to type something really scary.
00:43:33.420 | Oh my gosh.
00:43:38.420 | So Dagger also has cool secrets provider integrations.
00:43:42.420 | So I don't have my actual API key echoed there.
00:43:46.420 | I just have my 1Password reference.
00:43:48.420 | It's just sitting in 1Password somewhere.
00:43:51.420 | And so, let's see.
00:43:54.420 | Yeah.
00:43:55.420 | So it's just pointing at this credential.
00:43:56.420 | Cool.
00:43:57.420 | Yeah.
00:43:58.420 | And then if I reveal in plain text.
00:43:59.420 | Whoa.
00:44:00.420 | So I've configured this in my environment.
00:44:04.420 | So now when I say Dagger, it's going to take a second to spin up.
00:44:10.420 | And this is the part where if you're struggling a bit with Wi-Fi, this might be a bit tough.
00:44:15.420 | But it's okay.
00:44:16.420 | Because if you are following along, we're going to push this to GitHub in a second.
00:44:19.420 | And it's going to run in GitHub.
00:44:21.420 | And it's going to be on GitHub's network.
00:44:22.420 | So we don't have to be beholden to that.
00:44:25.420 | Can you run LLM?
00:44:26.420 | Yeah, exactly.
00:44:27.420 | So now if I say llm | model, for example, you see my little 1Password prompt.
00:44:34.420 | Nice.
00:44:35.420 | So it's got my key.
00:44:36.420 | It's going to take a second to think about it.
00:44:38.420 | And so with each model provider, we have a default model.
00:44:41.420 | But you can also specify one.
00:44:43.420 | We can also specify one in code.
00:44:45.420 | But right now, by default, it's going to use Claude 3.5.
00:44:48.420 | So maybe we're not going to get the best results.
00:44:49.420 | But we'll see.
00:44:50.420 | It's OG.
00:44:51.420 | Classic.
00:44:52.420 | A classic.
00:44:53.420 | Cool.
00:44:54.420 | So now I have that.
00:44:55.420 | And I can say .help.
00:44:56.420 | And we have that new develop function.
00:44:59.420 | So I can say .help develop.
00:45:02.420 | And so this is the thing we just made.
00:45:05.420 | Can you bump that up a little bit?
00:45:06.420 | Bigger?
00:45:07.420 | For sure.
00:45:08.420 | Yeah.
00:45:09.420 | Perfect.
00:45:10.420 | Thank you.
00:45:11.420 | We have an assignment.
00:45:12.420 | And that was our assignment to complete.
00:45:13.420 | We have an optional argument source, which, again, it's just going to be my repo.
00:45:17.420 | And this is going to give us back a directory.
00:45:20.420 | So here's how I use it.
00:45:21.420 | I just say develop and then do the assignment.
00:45:23.420 | So let's say develop.
00:45:25.420 | And then we didn't actually look at the project we're Daggerizing yet.
00:45:28.420 | But I promise it's like a Vue.js website.
00:45:32.420 | So let's ask it to, I think, in here we say the example thing is to make the main page blue.
00:45:41.420 | And I'll say make the main page say hello workshop people.
00:45:47.420 | Whoa.
00:45:48.420 | Doesn't say that right now.
00:45:49.420 | And I've never run this.
00:45:50.420 | Maybe it'll succeed.
00:45:52.420 | And so now we can see this happening.
00:45:54.420 | We see our prompts getting passed in.
00:45:56.420 | We see the little person face.
00:45:58.420 | That's the prompting.
00:45:59.420 | And the little robot head of the model, which is Claude 3.5 Sonnet, saying, cool, let me do these things.
00:46:06.420 | And we can actually see it calling tools, right?
00:46:08.420 | So it's looking at the functions available.
00:46:11.420 | We see that workspace.
00:46:12.420 | Oh, it's the tools you said.
00:46:13.420 | Yeah, list files.
00:46:14.420 | Yeah, the ones that we made.
00:46:15.420 | Yeah.
00:46:16.420 | And so it figured out, okay, I can look at my files.
00:46:19.420 | Now here's this specific file I might need to edit.
00:46:22.420 | So let me read that file.
00:46:24.420 | And so it now sees the contents of this.
00:46:27.420 | And while this is running, let me just open up cloud.
00:46:30.420 | And hopefully this will load.
00:46:32.420 | So we can actually see the cloud visualization of this.
00:46:36.420 | Because it's maybe a bit easier to see.
00:46:39.420 | Because where it's, oh, I sign in.
00:46:41.420 | I'm clicking the button.
00:46:45.420 | I think my Wi-Fi is failing me on this auth flow.
00:46:49.420 | But while it's running, we'll just watch this.
00:46:51.420 | It's the same OpenTelemetry in both places.
00:46:54.420 | So that you're getting streaming to your terminal UI and the web UI.
00:46:58.420 | So we see it call write file with some new file contents.
00:47:01.420 | And now it says, now that we've made the change, let's run the tests.
00:47:05.420 | And this is the part that really might fail on this Wi-Fi.
00:47:07.420 | Because it's doing an npm install and downloading a bunch of npm modules or node modules.
00:47:12.420 | But it should pass in a second.
00:47:15.420 | We'll just let it go and we'll talk through it.
00:47:17.420 | But we can see that our agent is actually, it wrote the files.
00:47:21.420 | And then it's running the tests.
00:47:23.420 | Which is really awesome.
00:47:24.420 | Cool.
00:47:25.420 | So this opened up over here.
00:47:26.420 | Yeah.
00:47:27.420 | So this is, oh yeah.
00:47:34.420 | So we see it's saying like withExec npm install, withExec npm run test:unit.
00:47:40.420 | If we go back to our workspace in our test function, that was part of it.
00:47:48.420 | So this is like the agent just had to call test.
00:47:51.420 | And we've defined what happens when you call test.
00:47:54.420 | And so--
00:47:55.420 | It's not like the random ones.
00:47:57.420 | Sometimes you're like, you know, make sure to test it.
00:47:59.420 | And it's like, I'm going to try a pytest with these crazy options.
00:48:02.420 | And you're like, why did you think that was going to work?
00:48:04.420 | So instead you just give it exactly what it should be.
00:48:06.420 | We could give it more flexibility in how it runs things.
00:48:09.420 | But in this case, we already know this is how you run tests in the project.
00:48:14.420 | So we just give it a test function.
00:48:16.420 | That's probably the biggest thing in creating reliable agents with Dagger is giving flexibility
00:48:23.420 | where it's important for completing tasks and removing it where you know exactly how things are meant to happen.
00:48:28.420 | So you know exactly how tests need to run.
00:48:30.420 | So it doesn't need the freedom to just run any command in a container.
00:48:34.420 | We know, okay, all you need to do is modify files and run this test function.
00:48:39.420 | And for more complex agents, maybe there's some other functions there too.
00:48:44.420 | But for this one, like, this is the amount of freedom we've given it.
00:48:47.420 | Can we, like, open another--
00:48:49.420 | Well, hold on.
00:48:50.420 | So we got cloud.
00:48:51.420 | Oh, okay, okay.
00:48:52.420 | We got cloud.
00:48:54.420 | Yeah.
00:48:55.420 | We'll get back to my pipe dream in a second.
00:48:56.420 | All right.
00:48:57.420 | So let me see if I can expand this.
00:48:59.420 | And so this is like the visibility that we want to see when we're running these agents.
00:49:03.420 | So we saw the prompts.
00:49:04.420 | And we saw the assignment is to make the main page say, "Hello, workshop people."
00:49:08.420 | Cool.
00:49:09.420 | And then--so this is the prompt we gave it.
00:49:11.420 | Now, Claude 3.5 is looking at this and saying, first, let's look at what objects we have and
00:49:16.420 | check out the workspace, make the changes, and then run the tests.
00:49:19.420 | Sounds good.
00:49:20.420 | It runs list objects, which lets it see what it has in its environment, which is, like,
00:49:27.420 | this workspace tool, right?
00:49:29.420 | Cool.
00:49:30.420 | And then it's going to say list method, so it's going to see what it can do with the workspace.
00:49:34.420 | Like, what the heck is a workspace?
00:49:36.420 | It says it has tools to edit and test code.
00:49:39.420 | And then we expand that.
00:49:40.420 | And so this is, like, this kind of visibility into the agent's environment where we say,
00:49:44.420 | "Oh, there's this workspace write file function that gives it back a workspace type."
00:49:49.420 | And these are the arguments.
00:49:50.420 | Oh, you mean, so we didn't have to write any of the JSON kind of, you know, description
00:49:54.420 | of tools.
00:49:55.420 | It just gets generated from the functions.
00:49:56.420 | Yeah.
00:49:57.420 | So we just gave it that Dagger module, and then it all got wired up into the agent's
00:50:02.420 | environment.
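The point that nobody hand-writes the JSON tool descriptions can be illustrated with Python's own introspection. This is only an analogy for how a description can be derived from a function's signature and docstring; `describe_tool` and the returned dictionary shape are assumptions of the sketch, not the format Dagger actually generates.

```python
import inspect
from typing import Any, Callable, Dict


def _type_name(annotation: Any) -> str:
    """Render an annotation as a short name, falling back to str()."""
    return getattr(annotation, "__name__", str(annotation))


def describe_tool(func: Callable[..., Any]) -> Dict[str, Any]:
    """Build a tool description from a function's signature and docstring."""
    sig = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            name: _type_name(param.annotation)
            for name, param in sig.parameters.items()
        },
        "returns": _type_name(sig.return_annotation),
    }


def write_file(path: str, contents: str) -> str:
    """Write a file into the workspace."""
    return path


print(describe_tool(write_file))
```

Because the description is derived from the code, renaming a parameter or editing the docstring changes what the model sees without a second file to keep in sync.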
00:50:03.420 | And so it says, "That's cool.
00:50:04.420 | Let me select these methods."
00:50:05.420 | So now I have these as tools to call.
00:50:07.420 | And then let's see what's in the project.
00:50:09.420 | So it's going to call workspace list files.
00:50:11.420 | And remember, the way that it does that in our workspace code was it creates, like, an Alpine
00:50:16.420 | container and runs tree.
00:50:17.420 | And so we can see the tracing of that, too, which is, like, the underlying actions of the
00:50:23.420 | tools being called.
00:50:24.420 | But we also see the return of that, which is what the agent sees.
00:50:27.420 | And it sees this whole file structure.
00:50:29.420 | Cool.
00:50:30.420 | Cool.
00:50:31.420 | And then we can see...
00:50:32.420 | It says, "Cool."
00:50:33.420 | It sounds...
00:50:34.420 | To make it say that, we should probably modify this one or this one.
00:50:37.420 | So let's see what's in those files.
00:50:39.420 | We can see it read the file.
00:50:41.420 | And that's...
00:50:43.420 | It's going to see this whole file, "HelloWorld.vue."
00:50:49.420 | And it says, "Okay, I don't think that was it.
00:50:51.420 | Let's see the App.vue."
00:50:52.420 | And then it reads that file.
00:50:54.420 | And then eventually it says, "I see that App.vue uses the HelloWorld component."
00:51:01.420 | And passes a message to it.
00:51:03.420 | So now it's going to write the file.
00:51:04.420 | It's going to change App.vue to pass a different message to it.
00:51:08.420 | And let's see.
00:51:09.420 | We can expand this to see the whole thing.
00:51:13.420 | Awesome.
00:51:14.420 | Nice.
00:51:15.420 | So hopefully...
00:51:16.420 | If this ever...
00:51:17.420 | If it doesn't finish, it's fine.
00:51:18.420 | So we're going to push it to GitHub in a second.
00:51:20.420 | And then GitHub can run it for us.
00:51:22.420 | But now it's running those tests.
00:51:24.420 | So this is the part that it's currently at in my shell.
00:51:26.420 | Where it's been running for like five minutes.
00:51:29.420 | So yeah, that's the visibility part I'm talking about.
00:51:31.420 | Where we can see exactly what the agent sees and what's happening under the hood.
00:51:37.420 | To be clear, right?
00:51:38.420 | So this is all running on your laptop.
00:51:41.420 | And yet, it's all inside that Dagger engine in containers.
00:51:46.420 | Totally isolated from your laptop.
00:51:48.420 | Exactly.
00:51:49.420 | So Dagger Cloud is just showing me the visualization.
00:51:52.420 | It's not running anything for me.
00:51:53.420 | This is on my machine, which is why it's still running.
00:51:56.420 | Well, right.
00:51:57.420 | And this is like because of the connection we have.
00:51:59.420 | And because of, you know, whatever the load we're putting on it.
00:52:02.420 | But the other thing to think about is it could be like we're using Python here.
00:52:08.420 | We're using Node.
00:52:09.420 | Right.
00:52:10.420 | We're using a bunch of different tools.
00:52:11.420 | So like the app is Node.
00:52:12.420 | But the workflows that Kyle's writing are in Python.
00:52:17.420 | You could have a laptop, say, or any server that just has Dagger and a connection to the internet.
00:52:25.420 | And you don't need any tools installed.
00:52:27.420 | So that's why the environments, environments is not just for the agent and developer.
00:52:31.420 | I mean, it kind of goes all the way through.
00:52:33.420 | So you could have a brand new laptop with just Dagger.
00:52:36.420 | And it would work, because it's using a Python runtime container for the workflow he wrote in Python.
00:52:43.420 | That's just implicitly there.
00:52:44.420 | So you don't need to install Python.
00:52:46.420 | You don't need to struggle with venvs or any other versions or whatever.
00:52:50.420 | It just, it's done.
00:52:51.420 | And then inside of that somewhere, there's a Node container that happened, right, in order to create this environment.
00:52:57.420 | The build env and the build and all that.
00:52:59.420 | And that, again, it's all just nested inside of there and cached and everything else automatically.
00:53:06.420 | So you can kind of just do this with a very bare bones machine set up and everything will just work.
00:53:12.420 | Yeah.
00:53:13.420 | So we probably won't get to run this part locally; we'll come back to it if it finishes.
00:53:20.420 | But anyway, I'll just describe this flow here where we say, okay, we're in Dagger Shell.
00:53:25.420 | So that happens, like we ran that develop thing and it gave us back something.
00:53:30.420 | But now in Dagger, like I keep saying, we're in Dagger Shell.
00:53:34.420 | When you type dagger and get into this view, it is a shell just like bash, right?
00:53:41.420 | Where we can actually do things like create variables and chain things together.
00:53:46.420 | And so what we could do if this finished is say, okay, let's actually save that, the output of this thing.
00:53:53.420 | Because remember, it returns a directory.
00:53:55.420 | Save that to a variable called completed.
00:53:57.420 | And then we could pass that to our other functions.
00:53:59.420 | Because remember, they default to using our Git source from our machine.
00:54:03.420 | But we could, we could pass in that optional directory to all of our functions to say, use this directory instead.
00:54:09.420 | So now I could actually run the whole thing as like a local, like I could see the results of this before even saving it to my machine.
00:54:18.420 | So let me just go over here.
00:54:19.420 | I don't know why I keep any of this folder, but we'll go to the correct directory.
00:54:28.420 | And we'll open another shell here.
00:54:30.420 | And I'll just type in part of this command.
00:54:33.420 | Because what I can do is run the output from the agent: I can run the whole site.
00:54:40.420 | I can build it and serve it to my machine.
00:54:42.420 | And I can see what it's built before I even save it back to my disk to say, yes, this is a good solution.
00:54:48.420 | So once we get this connection here, just waiting on pipes to connect to each other, and we'll, we'll let that run for a second.
00:55:01.420 | But the main thing is we can pass this around.
00:55:04.420 | We can run all of our functions with that completed directory.
00:55:07.420 | And then finally say, all right, we say export.
00:55:10.420 | That saves it back to your disk and we're done.
00:55:13.420 | So the next step is, all right, we're good with that.
00:55:15.420 | We know how to use this agent locally to ask it to make cool tasks.
00:55:19.420 | That's fine.
00:55:20.420 | But my, my people requesting features on my site, they don't have this installed.
00:55:26.420 | They don't have Docker and Dagger installed on the machine.
00:55:29.420 | They don't want to use Dagger shell.
00:55:31.420 | They just want to go to GitHub and say, make this new feature.
00:55:34.420 | So that's the next step here.
00:55:35.420 | And it sounds ambitious, but it's really quick.
00:55:38.420 | So we've got plenty of time to look at the solution here.
00:55:41.420 | And we'll look at it in Python once again.
00:55:44.420 | And so the first thing we're going to do is actually install another dependency from the Daggerverse.
00:55:49.420 | And this is my module called GitHub issue.
00:55:53.420 | And it's basically if we go to Daggerverse.
00:55:55.420 | But we saw it installed earlier when you showed us that dagger.json.
00:55:58.420 | Exactly.
00:55:59.420 | But that's because I skipped ahead.
00:56:00.420 | Oh, I see.
00:56:01.420 | Yeah.
00:56:02.420 | Nice.
00:56:03.420 | So if we search for that, and we have this module called GitHub issue, it's got a bunch of functions that let us do things with GitHub issues, like we can list GitHub issues in a repo, we can list the comments on a particular issue, we can write comments, we can create pull request comments, all kinds of things with GitHub issues and GitHub pull requests.
00:56:29.420 | So with this module where I've just basically used the GitHub Go SDK in this Go module to connect my Dagger functions to the API calls, I can install this in my Python project.
00:56:43.420 | And now I can have the ability to work with GitHub issues.
00:56:47.420 | And so all it needs is a GitHub token.
00:56:49.420 | And so we create, we add another function to our code called develop issue.
00:56:55.420 | So remember, we created develop, now it's develop issue.
00:56:59.420 | And all this is going to do is say we have a GitHub issue out there with our feature requests.
00:57:03.420 | We want to read that GitHub issue, give it to our agent, the agent's going to do all its things, then give us back a directory.
00:57:09.420 | We're going to take that directory and make a pull request.
00:57:12.420 | Oh, so like really similar to like the assignment that we gave it, instead, it's going to be reading the GitHub issue.
00:57:18.420 | And instead of just getting the directory back ourselves, we put the directory into a PR.
00:57:22.420 | Yep, so we can see the code here.
00:57:23.420 | Yeah.
00:57:24.420 | And so this is the entire thing here where we're not writing a new agent to do this.
00:57:28.420 | We're using our other agent.
00:57:29.420 | We're just, we're wrapping it with some other pieces to say, go here to get the assignment.
00:57:34.420 | Once it's done, put that completed work over here. The "go here" part was, like, read a GitHub issue.
00:57:42.420 | And then we get that assignment and I can open the editor.
00:57:46.420 | So it's probably easier to see.
00:57:48.420 | Okay, so we get that GitHub issue from that issue.
00:57:54.420 | We get the assignment from the issue body.
00:57:56.420 | We pass that to our develop function because this is our agent and say, here's your assignment.
00:58:02.420 | Here's the source, which came from that same defaulted input argument.
00:58:08.420 | And then we get the issue title and URL, which is going to be really cool because then we actually, in GitHub, automatically have the new pull request linked to the GitHub issue.
00:58:22.420 | Just by having the body say "closes" this issue.
00:58:26.420 | And that's going to create a pull request.
00:58:28.420 | And so this whole thing, like you can run this part locally too.
00:58:31.420 | You don't have to run this part in GitHub.
00:58:33.420 | But it just takes the GitHub token and an issue and the repo name so it knows where to put the PR.
00:58:40.420 | And then it does that whole flow.
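The flow just described (read the issue, hand its body to the existing agent, open a pull request whose body closes the issue) can be sketched in plain Python. This is a minimal stand-in, not the workshop's code and not the real GitHub issue module API: Issue, PullRequest, FakeIssueClient, and the develop callback are hypothetical names used only to show the shape of the wrapper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-ins: these are NOT the real Dagger or GitHub module types.
Directory = Dict[str, str]  # maps a file path to its contents


@dataclass
class Issue:
    number: int
    title: str
    body: str
    url: str


@dataclass
class PullRequest:
    title: str
    body: str
    files: Directory


@dataclass
class FakeIssueClient:
    """In-memory replacement for a GitHub client, for illustration only."""
    issues: Dict[int, Issue]
    pull_requests: List[PullRequest] = field(default_factory=list)

    def read(self, issue_id: int) -> Issue:
        return self.issues[issue_id]

    def create_pull_request(self, title: str, body: str, files: Directory) -> PullRequest:
        pr = PullRequest(title, body, files)
        self.pull_requests.append(pr)
        return pr


def develop_issue(
    client: FakeIssueClient,
    issue_id: int,
    source: Directory,
    develop: Callable[[str, Directory], Directory],
) -> PullRequest:
    """Wrap an existing agent: the issue goes in, a pull request comes out."""
    issue = client.read(issue_id)
    # The issue body is the assignment; the agent returns a completed directory.
    completed = develop(issue.body, source)
    # A closing keyword plus the issue URL is what links the PR to the issue.
    body = f"{issue.body}\n\nCloses {issue.url}"
    return client.create_pull_request(issue.title, body, completed)
```

The agent itself is passed in as a callback, which mirrors the point made in the talk: no new agent is written, the existing one is only wrapped with an input source and an output destination.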
00:58:44.420 | But we actually want that to run in GitHub.
00:58:46.420 | And that's super easy.
00:58:47.420 | So we've made that thing.
00:58:49.420 | We just saw the code.
00:58:50.420 | Now we create a GitHub Actions workflow.
00:58:53.420 | The first thing we need to do is, in the repo, create two repo secrets.
00:59:00.420 | One for a Dagger Cloud token.
00:59:01.420 | Again, that part's optional.
00:59:03.420 | But if you want to see all those things happen in Dagger Cloud, you just put that token in the environment.
00:59:07.420 | And then whatever LLM key you're using, so the same one I used locally, is going to be in that repo secret.
00:59:13.420 | So if I go over here in my repo and I say, and I zoom out a bit so I get all the buttons, I say settings.
00:59:22.420 | We wait for the page to load.
00:59:23.420 | And then down here under Secrets and Variables, Actions, I have two repo secrets here that we just saw from that screenshot.
00:59:37.420 | Make it big again.
00:59:38.420 | Sure.
00:59:39.420 | And then there's one more thing, which is, let's see, that's how we get our Dagger Cloud token to paste in there.
00:59:48.420 | There's a little checkbox we have to press over here to let GitHub Actions create PRs, because that's disabled by default.
00:59:57.420 | So if I go under, okay, under Actions, General, and then at the very bottom, there's this checkbox, allow GitHub Actions to create and approve pull requests.
01:00:11.420 | So I've done that.
01:00:12.420 | Now I just need to create a workflow.
01:00:16.420 | And the workflow is super short.
01:00:18.420 | This is a thing you can copy paste, and I'll open it up over here.
01:00:23.420 | Under .github/workflows, we have Develop.
01:00:26.420 | And so now we have this GitHub Actions workflow.
01:00:29.420 | If you haven't ever used GitHub Actions, I'll explain this real quick.
01:00:32.420 | But it's basically a CI platform, and we have, with this configuration, we tell it when events happen.
01:00:41.420 | Go do these things.
01:00:43.420 | So in this case, we say, when a GitHub issue is labeled, and the label is called develop, then run this command.
01:00:52.420 | And this command is dagger call develop-issue with those arguments, like GitHub token, the issue ID, and the repo.
01:00:59.420 | And these things are all coming from GitHub Actions automatically.
01:01:02.420 | So, like, the environment's GitHub token is created here, where we say, this command needs a GitHub token with permissions to write contents.
01:01:14.420 | Contents are, like, commits to your project.
01:01:17.420 | Read the issues and write pull requests.
01:01:20.420 | And so we've put that in the environment.
01:01:23.420 | We've given it the API key for our LLM and the cloud token.
01:01:27.420 | And so now, just by running this Dagger call, that connects the dots where GitHub Actions, whenever we create that label, is going to run that Dagger function.
01:01:36.420 | And that Dagger function has all the capabilities to run the agents and open a PR.
01:01:40.420 | So that's, like, us in the Dagger shell when we call, when we are running, like, the develop function or some other build function or whatever.
01:01:48.420 | This is just having GitHub Actions run the develop issue function for us.
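The trigger logic described above (run only when an issue is labeled, and only when the label is develop) can be sketched as a small Python function over a GitHub issues event payload. This is an illustration, not the workshop's workflow file: the flag names --github-token, --issue-id, and --repository, and the env://GITHUB_TOKEN secret reference, are assumptions about how the function's arguments are exposed on the command line.

```python
from typing import List, Optional


def command_for_event(event: dict, trigger_label: str = "develop") -> Optional[List[str]]:
    """Return the command to run for an issues event, or None to skip it."""
    # Only react to the "labeled" action on an issue.
    if event.get("action") != "labeled":
        return None
    # Only react when the label that was just added is the trigger label.
    if event.get("label", {}).get("name") != trigger_label:
        return None
    issue_number = event["issue"]["number"]
    repo_url = "https://github.com/" + event["repository"]["full_name"]
    return [
        "dagger", "call", "develop-issue",
        "--github-token", "env://GITHUB_TOKEN",  # assumed secret reference syntax
        "--issue-id", str(issue_number),         # assumed flag name
        "--repository", repo_url,                # assumed flag name
    ]
```

In the real workflow the CI platform does this filtering and fills in the issue number and repository; the sketch only makes explicit which fields of the event end up as arguments.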
01:01:53.420 | Why are you having GitHub Actions do it?
01:01:55.420 | So that it can go into the issues?
01:01:57.420 | So we're having GitHub Actions do it because we want this flow to be automated inside GitHub.
01:02:02.420 | So I'll show the flow real quick.
01:02:03.420 | But it can run anywhere.
01:02:04.420 | So you can run it.
01:02:05.420 | It doesn't matter.
01:02:06.420 | It doesn't matter.
01:02:07.420 | It doesn't matter where it runs.
01:02:08.420 | So this just happens to be GitHub Actions because we're already in a GitHub repo.
01:02:13.420 | It's free because this is, like, we're not using any crazy compute to run this thing.
01:02:19.420 | And most of the hard stuff is happening on your LLM that you're paying for somewhere else.
01:02:22.420 | And they have better internet connection at GitHub than we did today.
01:02:24.420 | Exactly.
01:02:25.420 | So let's say -- let's create a new issue.
01:02:29.420 | And we'll say change the greeting.
01:02:32.420 | And we want to -- what did we ask for before?
01:02:35.420 | We asked for, like, make the main page say --
01:02:39.420 | Hello, workshop.
01:02:40.420 | Something like that.
01:02:41.420 | Hello, workshop people.
01:02:44.420 | Okay.
01:02:45.420 | So we'll create this GitHub issue.
01:02:48.420 | And remember, this whole thing kicks off when I add the label develop.
01:02:52.420 | And so I've already run this on this repo and obviously made a typo as well at one point.
01:02:57.420 | But if you don't have it there, you can just say foo.
01:03:00.420 | And you'll have a button to say create a new label develop.
01:03:04.420 | So we want to call it develop.
01:03:06.420 | So I click that.
01:03:09.420 | And now my issue has been labeled.
01:03:12.420 | And so now that kicks off GitHub Actions to call my Dagger thing.
01:03:15.420 | So let's go over here in the Actions tab.
01:03:18.420 | And we should see something running.
01:03:21.420 | And it says change the greeting.
01:03:23.420 | And we can watch this run over here.
01:03:25.420 | We can also pull it up in cloud.
01:03:27.420 | Because remember, I put that cloud token in there.
01:03:29.420 | Because this stuff is all too hard to see flying by my screen in real time.
01:03:33.420 | So let's go back here.
01:03:35.420 | And this is GitHub Actions, right?
01:03:37.420 | But it could be any kind of, you know, orchestration.
01:03:40.420 | A CI orchestration could be Jenkins.
01:03:42.420 | Could be GitLab CI.
01:03:43.420 | It could be anything.
01:03:45.420 | Azure DevOps.
01:03:46.420 | You know, whatever.
01:03:47.420 | Whatever you got.
01:03:48.420 | Yeah, question.
01:03:49.420 | How much, if any, like prompt modification for you guys?
01:03:52.420 | Is it literally just what's in that one markdown file?
01:03:55.420 | Or do you add, like is it aware that it's in Dagger?
01:03:58.420 | It is, yeah.
01:03:59.420 | So we have, the question is like how much prompt modification does the agent have?
01:04:04.420 | Dagger has its own system prompt that it adds.
01:04:06.420 | That kind of guides it towards like how you use tools within Dagger.
01:04:10.420 | So it knows like call the select methods and list functions and those things we saw it doing.
01:04:17.420 | You can add more to the system prompts.
01:04:18.420 | You can get rid of that system prompt if you want to.
01:04:20.420 | But yeah, there is a default one.
01:04:22.420 | Yeah.
01:04:23.420 | You have to make further edits because the agent is not able to develop the right code analogies.
01:04:30.420 | Yeah.
01:04:31.420 | How do we correct after the developer before the rest of the stuff starts?
01:04:35.420 | Yeah.
01:04:36.420 | So if the agent does something, if it calls develop and it runs and it produces something that we say,
01:04:40.420 | okay, that's not right.
01:04:41.420 | How do we go back and say, make these changes?
01:04:44.420 | Can we just edit the completed source?
01:04:49.420 | Oh yeah.
01:04:50.420 | So yeah.
01:04:51.420 | So you can edit the completed source if you want.
01:04:54.420 | If you see the source and say, oh, it needs one more change.
01:04:57.420 | Or I can show you another function where we say, we have an ability to give it more feedback
01:05:02.420 | to say, okay, you've done this so far.
01:05:05.420 | Here's some more changes to make because you didn't get it quite right.
01:05:09.420 | And so we'll see that happening.
01:05:11.420 | Yeah.
01:05:12.420 | Go ahead.
01:05:13.420 | The test directory.
01:05:26.420 | I think it should.
01:05:28.420 | Yeah.
01:05:29.420 | So the question was giving the agent access to the test directory.
01:05:33.420 | I think in tests, it runs that.
01:05:36.420 | And I think in our workspace, we just give it the full source.
01:05:40.420 | We give it the full source of the repo.
01:05:42.420 | So it could get down in there if it wanted to.
01:05:45.420 | Yeah.
01:05:46.420 | I think it's kind of a funny thing, like making sure the tests pass, because sometimes if the agent broke the test,
01:05:52.420 | it'll go change the test.
01:05:54.420 | And sometimes that's correct.
01:05:55.420 | Right?
01:05:56.420 | Sometimes we actually change the behavior and the tests need to be updated.
01:05:59.420 | But maybe more often, that's not correct.
01:06:01.420 | So you might want to maybe have that as part of your prompting or part of your validation.
01:06:05.420 | Say, make sure the agent didn't change the test.
01:06:08.420 | Or it's kind of tough to decide, like, whether that's correct or not.
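One way to make "make sure the agent didn't change the test" part of validation is to compare the source before and after the agent ran and flag anything under the test directory. The sketch below assumes directories are represented as plain path-to-contents dictionaries and that tests live under a tests/ prefix; both are illustrative assumptions, not how the workshop's workspace stores files.

```python
from typing import Dict, List


def changed_test_files(
    before: Dict[str, str],
    after: Dict[str, str],
    test_prefix: str = "tests/",  # assumed location of the test files
) -> List[str]:
    """List test files that were added, removed, or modified."""
    paths = set(before) | set(after)
    return sorted(
        path
        for path in paths
        if path.startswith(test_prefix) and before.get(path) != after.get(path)
    )
```

A non-empty result does not have to be a hard failure. As noted above, sometimes the behavior really did change and the tests need updating, so the list is better used to ask for a human look than to reject the change automatically.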
01:06:13.420 | Yeah.
01:06:14.420 | I noticed that there wasn't a Dagger install step, because I've got behind this little action.
01:06:20.420 | So in --
01:06:21.420 | It's a one-liner.
01:06:22.420 | Yeah, yeah.
01:06:23.420 | So in our workflow, we installed Dagger.
01:06:27.420 | But it's really just -- there's a Dagger for GitHub action.
01:06:31.420 | And so we just said --
01:06:32.420 | What, three-liner in the --
01:06:33.420 | Three-liner, yeah.
01:06:34.420 | So we said this version of Dagger.
01:06:36.420 | But this installs Dagger in your GitHub actions runtime, basically.
01:06:41.420 | So we used Checkout to check out a repo.
01:06:43.420 | And then this to install Dagger.
01:06:46.420 | The dependencies, like when you install the dependencies,
01:06:50.420 | like you start with them.
01:06:54.420 | Oh, yeah.
01:06:55.420 | Does that happen automatically?
01:06:56.420 | Yeah, exactly.
01:06:57.420 | So this is -- in our dagger.json, we have all of our dependencies listed.
01:07:04.420 | And so you don't have to say, like, Dagger install or anything.
01:07:07.420 | When we say Dagger install, it adds it to this.
01:07:10.420 | And then we just run it.
01:07:11.420 | Yeah.
01:07:12.420 | We don't have to do anything like npm install.
01:07:15.420 | It just -- it knows to make sure your client's generated.
01:07:19.420 | But that is the nice thing about having those dependencies, you know,
01:07:24.420 | in the -- in a file saved in Git, you know, alongside the project.
01:07:29.420 | So because, like, what we've done essentially, like,
01:07:32.420 | when we first got this project, this Vue app project,
01:07:35.420 | it didn't have any Dagger.
01:07:37.420 | It didn't have anything, right?
01:07:38.420 | It was just, like, an app that you could run.
01:07:40.420 | And then we said, oh, well, let's Dagger init in this thing.
01:07:43.420 | And then we got that little .dagger
01:07:45.420 | where we started developing our build and test functions, right?
01:07:48.420 | Kind of like our tools for development or for CI, just alongside.
01:07:52.420 | And then in there is where we've been installing more modules,
01:07:56.420 | like the workspace, the GitHub issues module, like anything else you would need.
01:08:02.420 | So now -- and that's all in Git.
01:08:04.420 | So the thing's now like this fully loaded, like, Daggerized project.
01:08:07.420 | So it's kind of carrying around its own tools on its back for working, you know,
01:08:13.420 | just for a developer to use, or platform engineer to use, or for an agent to use.
01:08:19.420 | Yeah, we're just, like, waiting for things to load here.
01:08:22.420 | Yeah, go ahead.
01:08:23.420 | Have you gotten anything, like, Dagger-in-Dagger,
01:08:27.420 | where you have it spinning up, like, agent fleets, maybe different roles?
01:08:31.420 | Yeah.
01:08:32.420 | Yeah.
01:08:33.420 | Yeah.
01:08:34.420 | So that's -- I mean, myself as someone that builds a lot of Dagger code,
01:08:37.420 | I have agents that need to write Dagger code.
01:08:40.420 | And to reliably validate those things, they need basically Dagger inside of Dagger.
01:08:46.420 | So that's exactly, like, a thing that you can do, and I can even pull up if we go to --
01:08:52.420 | we're a bit short on time, but we're basically done with that guide.
01:08:55.420 | Yeah, 11 minutes.
01:08:56.420 | Just waiting for it to run, yeah.
01:08:58.420 | But we have an examples thing here on the docs.
01:09:03.420 | And there's tons of examples here.
01:09:05.420 | But one of the really cool ones that I like the most because --
01:09:08.420 | I wrote it.
01:09:10.420 | Is --
01:09:11.420 | I thought you were going to show mine, but that's fine.
01:09:12.420 | Let's see.
01:09:13.420 | No, it's fine.
01:09:14.420 | It's fine.
01:09:15.420 | The --
01:09:16.420 | Oh, it's not -- okay.
01:09:19.420 | We're going to add to the list of examples.
01:09:20.420 | It's going to be an even cooler list soon.
01:09:22.420 | So we have --
01:09:23.420 | We'll get your question next.
01:09:24.420 | Yeah.
01:09:25.420 | There's this repo under my GitHub, kpenfound/dagger-programmer.
01:09:28.420 | And this thing is something I use to -- like, in the docs, we saw those tabs of all the different
01:09:34.420 | languages.
01:09:35.420 | And so every -- whenever I write a new guide, I have to have it in five languages.
01:09:38.420 | And so this agent can take it in one language and produce all the languages.
01:09:43.420 | And that's just an agent that knows how to write Dagger.
01:09:46.420 | And so to do that, it has a lot of cool things in addition to being able to, like, run Dagger
01:09:52.420 | in Dagger.
01:09:53.420 | So if we look at the code for that, it's just like --
01:09:56.420 | This one's in TypeScript.
01:09:57.420 | Yeah.
01:09:58.420 | This is the TypeScript one.
01:09:59.420 | And when it runs tests, it runs the Dagger thing.
01:10:04.420 | And there's this flag, privileged nesting, so that the inner container can talk to the engine.
01:10:10.420 | And this is writing Dagger code, basically.
01:10:13.420 | Yeah.
01:10:14.420 | Question here.
01:10:15.420 | How does this relate to MCP?
01:10:18.420 | And will you use Dagger to implement MCP servers?
01:10:21.420 | And is there some overlap?
01:10:22.420 | Because you have all these modules, which maybe you could imagine having multiple MCPs as a
01:10:27.420 | different mechanism.
01:10:28.420 | Yeah, absolutely.
01:10:29.420 | So one way to think about it is we were kind of doing this thing with Dagger modules before MCP came
01:10:36.420 | on the scene.
01:10:37.420 | And then obviously we're like, oh, this is super aligned with the way we think things should
01:10:42.420 | be in a lot of ways.
01:10:44.420 | And so you can today even take a Dagger module and you can say dagger -m, the name of
01:10:51.420 | the module, mcp.
01:10:53.420 | And so you can expose a Dagger module as an MCP server, for example.
01:10:58.420 | And yeah, and we've got some more things that we'll be probably sharing soon about that kind
01:11:03.420 | of stuff.
01:11:04.420 | But yes, we think the vision is compatible in that way.
01:11:07.420 | And yeah, you can use the MCP ecosystem as well.
01:11:12.420 | Yeah, so there's a few different layers here, right?
01:11:14.420 | There's within our agent that we just built, we installed modules and that uses basically
01:11:20.420 | our internal implementation of MCP to talk between modules within Dagger.
01:11:24.420 | But you can also take a Dagger module, expose it as an MCP server, and then in, I don't know,
01:11:31.420 | the near future, next week or something, you can connect to external MCP servers to bring
01:11:36.420 | them into Dagger as well.
01:11:37.420 | Yeah.
01:11:38.420 | I mean, I wanted to be clear, like the internal, the internal implementation, it's before MCP.
01:11:44.420 | So it's not MCP per se, but it's very much logically, you can think of it in a similar way.
01:11:50.420 | Yeah.
01:11:51.420 | And because you can expose everything as MCP servers, it ends up being practically, you know,
01:11:56.420 | very, very much the same for users.
01:11:58.420 | Check it out.
01:11:59.420 | We got our PR.
01:12:00.420 | What?
01:12:01.420 | Finally.
01:12:02.420 | Oh, we got a PR?
01:12:03.420 | So we got our PR open, says, make the main page say that, closes that issue we created.
01:12:08.420 | We have that commit pushed up and we see the user is this github-actions bot.
01:12:12.420 | And we have on the welcome.vue, it changed from documentation to, so maybe that's right.
01:12:20.420 | Oh, it deleted this other thing too, because it decided that's not needed.
01:12:23.420 | Yeah.
01:12:24.420 | Cool.
01:12:25.420 | So we have a really cool agent.
01:12:26.420 | Yeah.
01:12:27.420 | It needs lots of vibes.
01:12:28.420 | It's just vibing back there, you know, the agent's just like, yeah.
01:12:31.420 | But yeah, the main thing is we were able to get it to run in GitHub.
01:12:34.420 | So I was able to request that feature, and it ran hands-free.
01:12:38.420 | And now...
01:12:39.420 | Yeah, exactly.
01:12:41.420 | So right now, we only built in the one thing where it says we create an issue that's a feature
01:12:46.420 | request.
01:12:47.420 | But if we look at, I think, on this examples list, we have this one, this Greetings API,
01:12:55.420 | which is my main, like, demo project.
01:12:57.420 | And it has a ton of stuff in it.
01:12:59.420 | There's, like, five different agents in here.
01:13:01.420 | And one of them is I want to give feedback on a PR.
01:13:04.420 | And so we can probably open one of these.
01:13:09.420 | And I say, I give it some feedback.
01:13:12.420 | I say slash agent, add this other...
01:13:16.420 | So this one, the original one is, like, make a new endpoint for my API.
01:13:20.420 | And then it did that.
01:13:21.420 | And then I say, okay, here's some feedback.
01:13:23.420 | The endpoint should be authenticated.
01:13:25.420 | And then it picks up again and pushes some new changes.
01:13:28.420 | And then I have another agent where I say slash review.
01:13:30.420 | And that will create a review for my PR with any other changes that I need.
01:13:34.420 | And then I can say, okay, make those changes.
01:13:36.420 | And then also, please don't delete all the tests.
01:13:39.420 | Very important to add.
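The comment-driven agents described here (slash agent with feedback, slash review) need some way to turn a PR comment into a command plus its text. A minimal sketch of that parsing step, assuming the commands are written as /agent and /review at the start of a comment; the function name and the set of commands are illustrative and are not taken from the Greetings API repo.

```python
from typing import Optional, Tuple

# Assumed command names, based on how they are spoken in the talk.
KNOWN_COMMANDS = {"agent", "review"}


def parse_slash_command(comment: str) -> Optional[Tuple[str, str]]:
    """Return (command, argument) for a known slash command, else None."""
    text = comment.strip()
    if not text.startswith("/"):
        return None
    parts = text[1:].split(None, 1)  # split on the first run of whitespace
    if not parts:
        return None
    command = parts[0].lower()
    if command not in KNOWN_COMMANDS:
        return None
    argument = parts[1].strip() if len(parts) > 1 else ""
    return command, argument
```

Ordinary comments and unknown commands return None, so a CI job built on this would only wake an agent when someone explicitly asks for one.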
01:13:40.420 | And that could be, like, you don't have to be inserting yourself at every one of those points,
01:13:45.420 | right?
01:13:46.420 | Yeah.
01:13:47.420 | Because it's great for when we're showing people...
01:13:48.420 | Yeah.
01:13:49.420 | If you want an example of how you can take what that workshop just built to the next level,
01:13:53.420 | where you have all this feedback and more advanced things, this is a great repo to look at, this
01:13:57.420 | Greetings API.
01:13:58.420 | Because it has all of these different agents doing tons of things.
01:14:01.420 | It even has one where if we look at...
01:14:04.420 | If I, as a human, push up a broken thing, because we still have humans developing stuff sometimes,
01:14:10.420 | right?
01:14:11.420 | So I pushed a broken thing and the test failed, which is super annoying because I, you know,
01:14:17.420 | I skipped running tests because I didn't have a good prompt that told me to run tests three times.
01:14:21.420 | This agent can actually look at the test failure automatically and propose a fix for that,
01:14:27.420 | that I can just click on it and fix that test change, right?
01:14:31.420 | So this is all stuff in this demo repo where you can see, like, how to build all these things yourself.
01:14:37.420 | There's a question over here.
01:14:39.420 | There's a lot I really love here.
01:14:41.420 | I just had a question almost getting at the motivation for some of this stuff.
01:14:45.420 | It feels like there's a world where Dagger could have really prioritized just, like, the containers,
01:14:48.420 | the workflows and let you just bring your own AI agent.
01:14:51.420 | Like, what's the motivation behind making it its own primitive and going down that path?
01:14:55.420 | I think there's a lot of levels to it, right?
01:14:57.420 | Like, if you're already really baked into, like, Pydantic or OpenAI Agents SDK,
01:15:02.420 | you can still use those container workflows in that.
01:15:05.420 | And I'll show it.
01:15:06.420 | Maybe I shouldn't.
01:15:07.420 | But I have...
01:15:08.420 | It's so crazy.
01:15:09.420 | If you've done the OpenAI Agents quick start, if it loads here...
01:15:15.420 | Or, sorry, this is the agent quick start we have, but with the agent SDK, where I've used
01:15:22.420 | the OpenAI Agents SDK that says, like, here's my completions model.
01:15:26.420 | This is actually using Ollama.
01:15:28.420 | This is what their SDK looks like.
01:15:31.420 | But in that SDK, I'm actually still using Dagger.
01:15:34.420 | So I actually recreated that same workspace where you have read file, write file, and build.
01:15:40.420 | But I've created that with Dagger inside of the OpenAI Agents SDK.
01:15:44.420 | So I'm using their agents, but using Dagger code for the containers.
01:15:49.420 | The main thing is, like, this, I had to write all of these tools and how to use them.
01:15:59.420 | If it's all within Dagger, you get that cool thing where we have that whole Daggerverse of
01:16:02.420 | modules.
01:16:03.420 | I can just plug one in, and that's just given to the agent, right?
01:16:06.420 | Yeah, your whole method signature is instantly translated into the right form with tools, right?
01:16:12.420 | You get tools for free, as well as functions.
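The idea that a method signature can be translated into a tool description can be illustrated with the Python standard library. This is a generic sketch of the technique, not Dagger's implementation: tool_schema, the type mapping, and the example write_file function are all made up for illustration.

```python
import inspect
from typing import Any, Dict, get_type_hints

# Small, illustrative mapping from Python types to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(func) -> Dict[str, Any]:
    """Derive a tool description from a function's signature and docstring."""
    hints = get_type_hints(func)
    properties: Dict[str, Any] = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        if name == "self":
            continue
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        # Parameters without a default must be supplied by the caller.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def write_file(path: str, contents: str, overwrite: bool = True) -> str:
    """Write contents to a file in the workspace."""
    return path
```

Calling tool_schema(write_file) yields a dictionary naming the tool, carrying the docstring as its description, and listing path and contents as required parameters. Writing that by hand for every tool is the work the speaker says he had to do when using a separate agent SDK.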
01:16:15.420 | And, yeah, and we do have some people in our community that are using Dagger.
01:16:19.420 | They're, like, with Pydantic and other things where they're just, like, they want this sandbox
01:16:23.420 | capability because they're, like, oh, I don't want to, you know, they don't want to use another
01:16:27.420 | cloud sandbox vendor or whatever.
01:16:29.420 | I want to have it locally, but I don't want it on my computer in my file system either.
01:16:33.420 | I want containers.
01:16:34.420 | I want it easy.
01:16:35.420 | So they're, so, yeah.
01:16:37.420 | But I think, yeah, the sweet spot is kind of doing it all because it just harmonizes really well.
01:16:41.420 | Yeah.
01:16:42.420 | Question there.
01:16:44.420 | Thanks for the great demo.
01:16:45.420 | So I had a question.
01:16:46.420 | Let's say if I want to build an agent for programming HTML games.
01:16:51.420 | Which run in browser.
01:16:52.420 | So for that game building agent, I would need the testing, so the running and testing
01:16:57.420 | environment to be browser.
01:16:59.420 | So does Dagger have those sort of constructs, like, let's say if I want to spin up a browser
01:17:04.420 | environment and then do some kind of automation in that for testing that game which the LLM might
01:17:09.420 | have written.
01:17:10.420 | Yeah.
01:17:11.420 | I mean, you certainly can.
01:17:12.420 | I mean, I've done, I've done some headless browser stuff.
01:17:13.420 | I've also done some browser stuff and then connect over VNC or different.
01:17:18.420 | Yeah.
01:17:19.420 | You can do a lot of, you know, you can do, you can do a lot of stuff with Linux containers.
01:17:25.420 | So, yeah, we should talk about it.
01:17:29.420 | You should come in the community.
01:17:30.420 | Let's, like, do it.
01:17:31.420 | Yeah.
01:17:32.420 | Great demo.
01:17:33.420 | And thanks for compressing a lot of information.
01:17:37.420 | So, is my understanding that you build CI/CD infra and all these things once and then let Dagger
01:17:44.420 | do the asynchronous job with guardrails and, you know, all the things in place?
01:17:49.420 | Like, is my understanding that Dagger is sort of this asynchronous AI agent that does things
01:17:54.420 | on its own but with guardrails, not just leaving Claude Code or something in a trust-all mode
01:18:00.420 | and then let it do its thing?
01:18:02.420 | Is that right?
01:18:04.420 | I think, yeah, so the question is, like, yeah, what is Dagger in a certain sense, too?
01:18:09.420 | But Dagger gives you this platform to create these software engineering workflows that can
01:18:14.420 | be used for shipping software, that can be used for developing software, you know, and the environments
01:18:18.420 | that we saw.
01:18:19.420 | And then you can use them as a platform engineer or as a developer, but then you can also hand
01:18:24.420 | them off to agents.
01:18:26.420 | And so we think that's really powerful, the fact that you can use that same platform to do
01:18:30.420 | all those things and to create those guardrails, like you say.
01:18:33.420 | Got it.
01:18:34.420 | You can.
01:18:35.420 | The one thing I want you to show, and you've got one minute, can you just show your terminal
01:18:38.420 | and just let's get vibey for just one second.
01:18:41.420 | So you're connected to an LLM right now, right?
01:18:44.420 | So go ahead, just like, let's talk to this LLM.
01:18:47.420 | So it turns out that we've been using the shell mode, which lets you, you know, kind of like
01:18:53.420 | very declaratively say, like, I want a container from Alpine with this file, and give me a terminal
01:18:58.420 | into that or whatever.
01:19:00.420 | Now, what we've done is switch modes, so we're chatting now directly with the
01:19:05.420 | connected LLM.
01:19:06.420 | And this LLM can see all the Dagger objects you have.
01:19:10.420 | So another way you can use Dagger is you can just say like, all right, I'm just going to
01:19:14.420 | create this container.
01:19:15.420 | And I'm going to say, hey, LLM, you see that container?
01:19:18.420 | Why don't you write some software in it?
01:19:20.420 | So you can get that kind of, that kind of workflow going too.
01:19:24.420 | So he's, there you go.
01:19:25.420 | So he's actually saying like, hey, give me a Python container.
01:19:28.420 | And so it's going to actually look and see what methods exist in the Dagger API.
01:19:34.420 | And it's like, oh, there's this container method in the API, which we were using earlier.
01:19:38.420 | And then it's going to, you know, decide, oh, I'm going to use container, maybe container from,
01:19:43.420 | container with-exec to execute.
01:19:45.420 | So these are just, it's exploring the Dagger API right now.
01:19:49.420 | And now it's actually pulling a Python 3.11 container, and then it can do things
01:19:54.420 | with that.
01:19:56.420 | So, you know, it's actually using containers, like kind of like computer use or something
01:20:00.420 | like that.
01:20:01.420 | But yeah, we didn't even show that side of it.
01:20:06.420 | Because, you know, we're trying to show the guardrails.
01:20:09.420 | But you can also use it in this kind of a style too.
01:20:12.420 | Got it.
01:20:13.420 | And one follow-up question.
01:20:15.420 | Typically, LLMs are good at small to medium tasks.
01:20:18.420 | And that's what we have seen, like a small to medium task here.
01:20:21.420 | How good is Dagger at orchestrating a large task, which needs design or some user
01:20:28.420 | input or, you know, multi-turn prompts? Not a small or medium task, but a large
01:20:33.420 | task.
01:20:34.420 | How good is Dagger with that?
01:20:35.420 | Yeah, the question is about the size of tasks that Dagger is good for.
01:20:38.420 | I think if you decompose things and you can architect things right, it can
01:20:43.420 | handle a lot of different sizes.
01:20:45.420 | And we should, I know we're at time now.
01:20:46.420 | So we're going to end here.
01:20:48.420 | We'll take some more questions outside the room.
01:20:49.420 | But in the hall, for sure.
01:20:50.420 | Thank you so much to everybody who attended.
01:20:51.420 | Thank you guys.
01:20:52.420 | Thank you very much.