
Claude plays Minecraft!



00:00:00.000 | Today I'm going to show you how I built an agent to play Minecraft, right? So this talk,
00:00:22.440 | in a nutshell, if I could sum it up, is all about agentic workflow. So I'm sure some of you here have
00:00:29.760 | built agents, you're in this track, maybe some of you have not. So I'm going to start
00:00:35.040 | with a bit of a level set on agent workflow and agentic workflow, and then I'll show you
00:00:41.520 | how that maps to the agent we built with Minecraft. So this flow diagram is probably familiar to
00:00:50.720 | some of you, where an agent will take inputs like chat or some unstructured data. It'll
00:00:59.520 | consume that and then it will use a set of tools, one or many tools to satisfy the request
00:01:06.000 | from the chat. So it can fulfill its destiny and its actions. And also it uses an LLM and it may
00:01:15.600 | then orchestrate one or many times through the LLM, the tools, and then return a response.
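The loop described here (chat in, one or more passes through the LLM and the tools, response out) can be sketched roughly as follows. This is a minimal Python sketch, not the code from the demo: `fake_llm`, `run_agent`, and the two tools are hypothetical names, and the "model" is a hard-coded stand-in so the example runs without any service.

```python
from typing import Callable, Dict, List

# Hypothetical tools: each returns a short text observation.
def jump() -> str:
    return "jumped"

def move_to(x: int, y: int, z: int) -> str:
    return f"moved to ({x}, {y}, {z})"

TOOLS: Dict[str, Callable[..., str]] = {"jump": jump, "move_to": move_to}

def fake_llm(chat: str, observations: List[str]) -> dict:
    """Stand-in for a real model call (assumption, not a real API).

    It either asks for a tool or returns a final answer.
    """
    if not observations:
        if "jump" in chat.lower():
            return {"tool": "jump", "args": {}}
        return {"final": "I am not sure how to help with that."}
    return {"final": f"Done: {observations[-1]}"}

def run_agent(chat: str, max_steps: int = 5) -> str:
    """Loop through the model and tools until a final response is produced."""
    observations: List[str] = []
    for _ in range(max_steps):
        decision = fake_llm(chat, observations)
        if "final" in decision:
            return decision["final"]
        tool = TOOLS[decision["tool"]]
        observations.append(tool(**decision["args"]))
    return "Stopped after too many steps."

if __name__ == "__main__":
    print(run_agent("Rocky, please jump"))
```

The `max_steps` cap is a design choice of this sketch: because both the request and the output are indeterminate, a bound keeps the loop from running forever.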
00:01:22.800 | Pretty well understood now, but very cool. So if we take that into a Minecraft world, where does the chat
00:01:32.320 | come from? Well, has anybody here played Minecraft? A few people, a few people. Great. So if you've
00:01:39.280 | played Minecraft, you'll know there's a chat function where you can interact with the world and other
00:01:45.360 | people in the world. And so that's where the chat comes from for this agent. The tools in Minecraft are many,
00:01:53.920 | and that's why we built this. In Minecraft, you can do many things. You can build, you can dig,
00:02:01.040 | you can farm, you can slay a pig if you so desire, but you can do a few things.
00:02:11.840 | And so the agent itself becomes the bot. So we tell it via chat what we want it to do. It uses the tools
00:02:20.800 | and hopefully fulfills our requests. Now, remembering that our requests are probably indeterminate
00:02:28.240 | and the output is also indeterminate. So today I'm going to live demo what you should never do. So they
00:02:35.040 | say you should never live demo with kids, animals, and an LLM. So I'm going to do that today.
00:02:40.640 | The LLM itself, of course, is magic. It is. It's magic. And then the responses, the agent will
00:02:51.120 | do one or many thoughts and then return the response to Minecraft. So that's the flow in a nutshell.
00:03:00.960 | And of course, we have our friendly bot. And so our bot's name is Rocky, short for Bedrock. And we'll
00:03:10.400 | see if Rocky behaves today. And the tools we're going to use with Rocky, and the actions that
00:03:18.160 | those tools will perform, are jump, move to position, locate a player, and a few more actions that we've built.
00:03:28.080 | And at this stage, I'll also mention that this demo that I'm going to show is open source.
00:03:33.520 | And you can all have a play with it. At the end, all you need is a Minecraft client.
00:03:38.080 | So this is Rocky. I'm going to give you a quick recording that I did yesterday at the booth. We're
00:03:45.120 | at the AWS booth. If you want to come and chat to me, I'm not scary. I am from Australia, but I'm not
00:03:51.600 | scary. So Rocky does a number of things. So this is what Rocky will look like. At the minute, it's
00:03:57.920 | raining in Rocky land. Rocky is a very friendly bot. And so you can ask Rocky questions in the chat. I hope
00:04:04.080 | you can all see that. So what's the weather? So Rocky being friendly will know my name and then also tell
00:04:12.720 | you what the weather's like. So it's raining in your area. You might want to take some shelter.
00:04:16.960 | Rocky can also do some actions. So this is how we send the actions to the agent via chat.
00:04:23.200 | So Rocky jumps for us. Also, Rocky can do other things. So I'm going to hide behind this
00:04:30.400 | Rocky constructed building. And I'm going to ask Rocky, can he come and find me? Or can they come and find me?
00:04:36.880 | Actually, Rocky is neither male nor female. Rocky says on my way. And there, there comes Rocky.
00:04:44.160 | And Rocky's found us. But what else can Rocky do? So what else did I do here? Oh, it can find things
00:04:54.160 | in the world. So Rocky here, I've asked it to find a pig. So Rocky knows based on the plugin that we're
00:05:02.560 | using for Minecraft where everything is. And it says, yep, I found a pig. This is the location.
00:05:07.760 | And I then ask it, go find, go, go to that pig. So off Rocky runs to the pig and finds the pig.
00:05:14.640 | So we've made, we've demonstrated this at a lot of conferences. And the biggest thing we've seen
00:05:21.120 | is people try to hit the pig. So I asked it to hit the pig. And there it goes. Rocky hits the pig.
00:05:28.800 | You could do that multiple times. And lots of people, we've observed, do. Don't know why. But hey,
00:05:33.120 | human behavior is even more fascinating than LLMs. So now I'm going to get Rocky to come here. And I'll
00:05:40.880 | probably finish with the dig action. So I'll ask Rocky to dig a hole. Notice when I say dig a hole,
00:05:50.080 | there's a parameter that I add. So I think in this instance, yeah, a two by two hole. So this is a
00:05:56.560 | parameter that I'm sending to the action. I'll show you how all that works when I do a bit of a deep
00:06:01.600 | dive into the back end. So Rocky's digging a hole. And also Rocky can come find us out of that hole and
00:06:11.360 | dig their way out of the hole. This is behavior that we didn't expect. And it just, just works,
00:06:17.680 | which is really fascinating. So that's, that's a bit of a three minute demo of Rocky. I'm going to live
00:06:24.960 | demo Rocky once I go through how I built it. So the architecture. So we built this for a serverless
00:06:31.360 | conference. So our constraints were: it needs to run in a managed environment, a serverless environment,
00:06:39.600 | and be cool. So we started off with running it in a container. So Minecraft, you can run in a
00:06:46.480 | container. And you can also run Mineflayer, which is the bot framework that works with Minecraft. And
00:06:53.040 | those two run side by side nicely in the container. We can't run Mineflayer on Lambda, which is serverless,
00:07:00.240 | because we need state, right? So the state is stored in Mineflayer.
00:07:04.080 | We started off with LangChain on Lambda to build this. Like most people, when they do start building
00:07:10.720 | agents, they probably start with LangChain. I'm not a big Python developer. I am an engineer, but not
00:07:16.720 | really with Python. But we got it working to a state. But then as we added complexity and more and more
00:07:23.360 | actions and tools for Rocky to use, it got really, really complex. And then we also used
00:07:29.920 | Amazon SageMaker to host the LLM, which in this case, we were using the Cohere LLMs.
00:07:36.240 | And then we decided, okay, so that's not serverless enough. Let's use Agents for Amazon Bedrock,
00:07:46.160 | keep the Minecraft server and client, which is Mineflayer, on Amazon ECS, and build that agent up on Amazon Bedrock.
00:07:55.120 | And then for architecture. So what we're doing and what I'm demonstrating today,
00:08:01.440 | it's running on my local machine. So both Minecraft and the Mineflayer client. And then it's calling out to
00:08:10.240 | agents for Amazon Bedrock and may the internet always be favorable over the next couple of minutes.
00:08:17.520 | So if you haven't come across agents for Amazon Bedrock, let me do a quick
00:08:21.440 | overview. Again, AWS booth is there. Come talk to me. I'll do a deep dive for you.
00:08:26.240 | But in a nutshell, Amazon Bedrock is all about facading one or more models. So what it does is it produces a
00:08:35.120 | common API that then you can use as an engineer to build out your application and add features. So
00:08:42.480 | as well as Amazon models, we host Anthropic models, we host Cohere models, as I mentioned,
00:08:48.640 | Llama models, and a whole host of other models. So when building out Rocky now using agents for Amazon
00:08:56.800 | Bedrock, you'll see that I used the Claude models. And I'll go through that as well.
00:09:03.280 | Why I did that, in particular Claude Haiku, because it's fast. And so as well as hosting models, not hosting models,
00:09:11.680 | but providing a facade into models, everything, you can also build additions to that. And one of those additions
00:09:19.760 | is agents, as well as knowledge bases, where you can do RAG, and also guardrails and evals. So all of these things are
00:09:27.280 | baked into Amazon Bedrock. And think of it more as sort of a managed agentic workflow, right? So
00:09:35.840 | you can manage RAG and manage agents as well. So it's all in one spot. So it also helps you with prompt
00:09:43.520 | creation. It obviously, as it's an agent, orchestrates multiple tasks, and also allows you to trace
00:09:49.840 | through the chain of thought of the agent. So you can either do that in the console, or we'll spit it
00:09:54.560 | out to logs. And of course, it also, and what was very key for this demo, it has return of control.
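Return of control means the agent does not run the action itself: it hands the chosen action and its parameters back to the calling application, which executes it (here, inside the Minecraft client) and reports the result. A minimal Python sketch of that client-side dispatch is below. The event shape, `handle_return_control`, and the handlers are simplified assumptions for illustration; this is not the exact Bedrock response schema and not the demo's code.

```python
from typing import Callable, Dict, List

# Hypothetical local handlers that would drive the bot in the game.
def jump() -> str:
    return "jumped"

def dig(depth: str, width: str) -> str:
    return f"dug a {width} by {width} hole, {depth} deep"

HANDLERS: Dict[str, Callable[..., str]] = {"jump": jump, "dig": dig}

def handle_return_control(event: dict,
                          handlers: Dict[str, Callable[..., str]]) -> List[dict]:
    """Run each requested action locally and collect results to send back.

    `event` uses a simplified, assumed shape:
    {"invocations": [{"function": str, "parameters": [{"name", "value"}]}]}
    """
    results = []
    for call in event["invocations"]:
        # Parameter values arrive as strings in this sketch.
        params = {p["name"]: p["value"] for p in call["parameters"]}
        handler = handlers.get(call["function"])
        if handler is None:
            output = f"unknown action: {call['function']}"
        else:
            output = handler(**params)
        results.append({"function": call["function"], "output": output})
    return results

if __name__ == "__main__":
    event = {"invocations": [
        {"function": "jump", "parameters": []},
        {"function": "dig", "parameters": [
            {"name": "depth", "value": "2"},
            {"name": "width", "value": "2"},
        ]},
    ]}
    for result in handle_return_control(event, HANDLERS):
        print(result)
```

Keeping execution on the client side fits the constraint mentioned earlier in the talk: the bot's state lives in the game client, so that is where the actions have to run.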
00:10:02.800 | Okay, so that's the slides. Let's jump into what we've got.
00:10:06.720 | So let me see if I can just, instead of doing that.
00:10:13.040 | Mirror. Nope, not that.
00:10:27.440 | Oh, dear. Still mirroring. Okay, there's always, so.
00:10:37.680 | Me and display. Extend the display. Me and display, I want to mirror. That's not, that's not letting me.
00:10:54.080 | Apologies. You got this. You got this. I got this.
00:10:58.320 | And mirror.
00:11:06.960 | Mirror. There we go. Yay. Okay, so.
00:11:13.280 | Okay, so this is, that's Mineflayer. So it's all open source. Let's first of all start with the good stuff.
00:11:21.760 | So open source, Mineflayer, you can use. And then this is the, what we've built. Now I'll share this
00:11:29.840 | at the end, QR code, everything. Don't worry about it. So this is Amazon Bedrock. So this is the agents,
00:11:35.600 | and this is the, the console page. So what it'll do, you can go down here into agents. If you haven't seen
00:11:44.320 | Bedrock before, it'll show you the model access and what models you've got access to. But I'm really
00:11:48.960 | talking about agents, which is down here in the left. So here I've got a Minecraft agent. Apologies
00:11:53.760 | if you can't see this too well. Let me see if I can make it a little bit better. You've got Minecraft,
00:11:59.280 | the Minecraft agent. So I've defined this in here. Now, for all the real engineers in the room,
00:12:04.240 | you don't have to build it by click ops. So you can build it all as infrastructure as code. And if you
00:12:08.640 | check out the GitHub repo, you will see it all in CloudFormation and CDK. So once I've got into the agent,
00:12:14.800 | you can see that here I can select my model. So at the minute, this agent is using Claude,
00:12:20.000 | Claude 3 Haiku. And here is the prompt. This is the system prompt for that agent. You're a playful,
00:12:25.200 | friendly and creative Minecraft agent called Rocky. And your goal is to entertain players and collaborate
00:12:30.640 | with them in a fun gaming experience. And it goes on. So this prompt we built over time. There's a bit
00:12:36.160 | of prompt engineering going on in this demo. And then if I go into the actions, you can see all of
00:12:42.720 | the actions. So you notice all of the actions that are defined all have return of control because we want
00:12:49.120 | them to return their output. So we've got jump. We've got other actions. Dig. Dig, you'll see this is
00:12:58.000 | where the parameters come in. So I've got a depth and a width. So you remember when I specified a hole,
00:13:03.360 | I said two by two. Also, if I say small hole, it will go as a one by one hole. So it'll infer that
00:13:08.720 | using the model. Also, there's the action, is it raining? So all of these actions are defined in here.
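The dig action takes a depth and a width, and the model fills those in from the chat message. On the receiving side, the parameters still need to be parsed and bounded before the bot starts digging. The Python sketch below is one way that could look, under stated assumptions: `DigParams`, `parse_dig_params`, `blocks_to_dig`, the default of 1, and the `max_size` limit are all hypothetical and not taken from the demo.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class DigParams:
    depth: int
    width: int

def parse_dig_params(raw: Dict[str, str], max_size: int = 8) -> DigParams:
    """Convert string parameters to bounded integers.

    Missing values default to 1 (an assumption of this sketch).
    """
    depth = int(raw.get("depth", 1))
    width = int(raw.get("width", 1))
    for name, value in (("depth", depth), ("width", width)):
        if not 1 <= value <= max_size:
            raise ValueError(f"{name} must be between 1 and {max_size}, got {value}")
    return DigParams(depth=depth, width=width)

def blocks_to_dig(params: DigParams) -> List[Tuple[int, int, int]]:
    """Offsets (dx, dy, dz) from the bot's position, top layer first.

    dy is negative because the hole goes downward.
    """
    return [
        (dx, -dy, dz)
        for dy in range(1, params.depth + 1)
        for dx in range(params.width)
        for dz in range(params.width)
    ]

if __name__ == "__main__":
    params = parse_dig_params({"depth": "2", "width": "2"})
    print(params, len(blocks_to_dig(params)), "blocks")
```

Bounding the values matters because the model's output is indeterminate: a request like "dig a huge hole" should fail cleanly rather than send the bot off digging indefinitely.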
00:13:16.560 | And also, I've got another action set that I'm going to share with you. The one I'm going to demo is called
00:13:27.360 | Minecraft experimental, right? And what this does, this has one action called build, which is a very
00:13:35.200 | complex thing to do in Minecraft, believe it or not, because it's a 3D space. And so the action for build,
00:13:42.080 | all I do is build a structure. And that returns back to the client. And I'll show you how that prompt's
00:13:49.200 | actually created when I do the build. So that's how it's built. And the actual source code itself is here.
00:13:56.960 | And the client is stopped. So I'm going to -- this is Rocky itself. So I'm going to start Rocky.
00:14:03.920 | Going to start. And bot spawned. Going to go over to Minecraft. We've got running here.
00:14:14.400 | Back to game. Rocky's joined the game. There's Rocky. Let me just go set. Time. Time. It's time set.
00:14:23.760 | Time. Set. Noon. So you can see it better.
00:14:26.640 | Okay. Okay. So there's Rocky. So you can -- let's just test if Rocky's working. So T, please come here.
00:14:41.440 | Rocky will make the first request. And there we go. So Rocky is working for us. So let's make sure that
00:14:48.880 | Rocky can do something. So we go, "Rocky, please, please dig a small, small hole." Okay. So Rocky will dig a small hole.
00:15:02.880 | So now I'm going to use the experimental feature, which is build. What do you want Rocky to build?
00:15:08.240 | The Colosseum. What? The Colosseum? Okay. I can't even spell that.
00:15:16.480 | Something that I can spell, please.
00:15:20.080 | A double-decker couch. Okay. I like that.
00:15:27.920 | Please come here. So yesterday I built a rocket ship, which is quite interesting. But a double-decker
00:15:35.200 | couch. So let's get Rocky out of the hole. So let's give it some space. Okay, Rocky.
00:15:42.800 | Please build -- can I just say couch? Double -- let's try it.
00:15:51.360 | Never do live. Double-decker couch. Please build double-decker couch.
00:15:57.200 | A. One. Just one. I'm just going to -- I'm just going to make it -- so what happens, Rocky?
00:16:03.760 | Then that prompt goes off. And so if I look at the code, which I've lost.
00:16:09.040 | Yeah. So there's the prompt. So what we've done, we've said to Rocky, "You're Claude,
00:16:16.080 | an expert Minecraft builder created by Anthropic. When given a structure description,"
00:16:21.120 | which will be the input, "then output valid JSON." So this JSON is how Mineflayer builds objects. So you
00:16:27.760 | give that as part of the prompt so that the model will understand how to build Minecraft objects.
00:16:32.240 | Then it goes, "strictly adhere to the following rules." Because if we didn't do that, it goes bananas
00:16:38.880 | and builds just nonsense. And so all blocks are placed next to each other. These are the blocks you can use.
00:16:44.960 | And it responds with what it thinks a couch looks like. So this is the XY coordinates. And so it started
00:16:51.600 | the build. So let's see what Rocky's doing. And there, Rocky has built a double-decker couch. Thank you
00:17:00.400 | very much. I'm going to sit on the couch. So this obviously is the interpretation of what a double-decker
00:17:12.160 | couch looks like to Claude. And in the 3D space. And then it's been interpreted into XY coordinates.
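Since the model answers with JSON describing block positions, the client can check that output before building anything. The Python sketch below shows one possible check. The JSON layout, the `ALLOWED_BLOCKS` set, and both function names are assumptions made for illustration, and the rule about how blocks relate to each other is interpreted here as "every block touches at least one other block"; the real prompt and format live in the project's repository.

```python
import json
from collections import deque
from typing import Dict, Tuple

Position = Tuple[int, int, int]

# Hypothetical allow-list; the real prompt names its own set of blocks.
ALLOWED_BLOCKS = {"oak_planks", "stone", "white_wool"}

def parse_structure(text: str) -> Dict[Position, str]:
    """Parse model output of the assumed form
    {"blocks": [{"x": 0, "y": 0, "z": 0, "type": "stone"}, ...]}.
    """
    data = json.loads(text)
    blocks: Dict[Position, str] = {}
    for block in data["blocks"]:
        if block["type"] not in ALLOWED_BLOCKS:
            raise ValueError(f"block type not allowed: {block['type']}")
        position = (int(block["x"]), int(block["y"]), int(block["z"]))
        blocks[position] = block["type"]
    return blocks

def is_connected(blocks: Dict[Position, str]) -> bool:
    """True if all blocks form one face-connected group (breadth-first search)."""
    if not blocks:
        return False
    start = next(iter(blocks))
    seen = {start}
    queue = deque([start])
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                           (0, -1, 0), (0, 0, 1), (0, 0, -1)):
            neighbor = (x + dx, y + dy, z + dz)
            if neighbor in blocks and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(blocks)

if __name__ == "__main__":
    reply = ('{"blocks": [{"x": 0, "y": 0, "z": 0, "type": "oak_planks"},'
             ' {"x": 1, "y": 0, "z": 0, "type": "white_wool"}]}')
    structure = parse_structure(reply)
    print(len(structure), "blocks, connected:", is_connected(structure))
```

Validating on the client is a cheap second line of defense after the prompt rules: if the model ignores a rule, the build can be rejected instead of producing floating or unsupported blocks.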
00:17:18.960 | And Rocky has built it. So that is Rocky. Obviously we are here at the AWS booth all day. Could you please,
00:17:27.760 | I also put a QR code, I promised before. Let's fly to that.
00:17:34.800 | So scan the QR code. Please fill in the session feedback form. Also, if you fill it in,
00:17:42.560 | I'll give you AWS credit codes for everybody who fills it in. And a link to the GitHub website. So
00:17:47.920 | please do that. And hopefully you enjoyed the session. Hope you enjoyed hanging out with Rocky. And come ask me any
00:17:54.000 | questions at the booth when you can. Thank you very much. Thank you.