Claude plays Minecraft!


Transcript

Today I'm going to show you how I built an agent to play Minecraft, right? So this talk, in a nutshell, if I could sum it up, is all about agentic workflow. So I'm sure some of you here have built agents, you're in this track, maybe some of you have not.

So I'm going to start with a bit of a level set on agent workflow and agentic workflow, and then I'll show you how that maps to the agent we built with Minecraft. So this flow diagram is probably familiar to some of you, where an agent will take inputs like chat or some unstructured data.

It'll consume that and then it will use a set of tools, one or many tools to satisfy the request from the chat. So it can fulfill its destiny and its actions. And also it uses an LLM and it may then orchestrate one or many times through the LLM, the tools, and then return a response.
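
As a rough illustration of the loop just described (chat in, zero or more tool calls orchestrated through the model, then a response), here is a minimal sketch. Everything in it is an assumption made for illustration: `fake_model` is a stand-in for a real LLM call, and the `TOOLS` registry and its names are hypothetical rather than taken from the demo's code.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> function returning an observation.
TOOLS: dict[str, Callable[..., str]] = {
    "jump": lambda: "jumped",
    "is_raining": lambda: "yes, it is raining",
}


def fake_model(chat: str, observations: list[str]) -> dict:
    """Stand-in for the LLM: choose a tool call or produce a final answer."""
    if observations:
        # A tool has already run, so summarise its result for the player.
        return {"type": "answer", "text": f"Done: {observations[-1]}"}
    if "weather" in chat.lower():
        return {"type": "tool", "name": "is_raining", "args": {}}
    if "jump" in chat.lower():
        return {"type": "tool", "name": "jump", "args": {}}
    return {"type": "answer", "text": "Hello!"}


def run_agent(chat: str, max_steps: int = 5) -> str:
    """Loop between the model and the tools until the model answers."""
    observations: list[str] = []
    for _ in range(max_steps):
        decision = fake_model(chat, observations)
        if decision["type"] == "answer":
            return decision["text"]
        tool = TOOLS[decision["name"]]
        observations.append(tool(**decision["args"]))
    return "Sorry, I could not finish that."


print(run_agent("What's the weather?"))
```

The `max_steps` cap is a design choice in this sketch, not something from the talk: it stops the loop if the model keeps asking for tools and never answers.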

Pretty well understood now, but very cool. So if we take that into a Minecraft world, where does the chat come from? Well, has anybody here played Minecraft? A few people, a few people. Great. So if you've played Minecraft, you'll know there's a chat function where you can interact with the world and other people in the world.

And so that's where the chat comes from for this agent. The tools in Minecraft are many, and that's why we built this. In Minecraft, you can do many things. You can build, you can dig, you can farm, you can slay a pig if you so desire, but you can do a few things.

And so the agent itself becomes the bot. So we tell it via chat what we want it to do. It uses the tools and hopefully fulfills our requests. Now, remembering that our requests are probably indeterminate and the output is also indeterminate. So today I'm going to live demo what you should never do.

So they say you should never live demo with kids, animals, and an LLM. So I'm going to do that today. The LLM itself, of course, is magic. It is. It's magic. And then the responses, the agent will do one or many thoughts and then return the response to Minecraft.

So that's the flow in a nutshell. And of course, we have our friendly bot. And so our bot's name is Rocky, short for Bedrock. And we'll see if Rocky behaves today. And the tools we're going to use with Rocky, and the actions that those tools map to, are jump, move to position, locate a player, and a few more actions that we've built.

And at this stage, I'll also mention that this demo that I'm going to show is open source. And you can all have a play with it. At the end, all you need is a Minecraft client. So this is Rocky. I'm going to give you a quick recording that I did yesterday at the booth.

We're at the AWS booth. If you want to come and chat to me, I'm not scary. I am from Australia, but I'm not scary. So Rocky does a number of things. So this is what Rocky will look like. At the minute, it's raining in Rocky land. Rocky is a very friendly bot.

And so you can ask Rocky questions in the chat. I hope you can all see that. So what's the weather? So Rocky being friendly will know my name and then also tell you what the weather's like. So it's raining in your area. You might want to take some shelter.

Rocky can also do some actions. So this is how we send the actions to the agent via chat. So Rocky jumps for us. Also, Rocky can do other things. So I'm going to hide behind this Rocky constructed building. And I'm going to ask Rocky, can he come and find me?

Or can they come and find me? Actually, Rocky is neither male nor female. Rocky says on my way. And there, there comes Rocky. And Rocky's found us. But what else can Rocky do? So what else did I do here? Oh, it can find things in the world. So Rocky here, I've asked it to find a pig.

So Rocky knows based on the plugin that we're using for Minecraft where everything is. And it says, yep, I found a pig. This is the location. And I then ask it, go find, go, go to that pig. So off Rocky runs to the pig and finds the pig. So we've made, we've demonstrated this at a lot of conferences.

And the biggest thing we've seen is people try to hit the pig. So I asked it to hit the pig. And there it goes. Rocky hits the pig. You could do that multiple times. And lots of people, I've observed, do. Don't know why. But hey, human behavior is even more fascinating than LLMs.

So now I'm going to get Rocky to come here. And I'll probably finish with the dig action. So I'll ask Rocky to dig a hole. Notice when I say dig a hole, there's a parameter that I add. So I think in this instance, yeah, a two by two hole.

So this is a parameter that I'm sending to the action. I'll show you how all that works when I do a bit of a deep dive into the back end. So Rocky's digging a hole. And also Rocky can come find us out of that hole and dig their way out of the hole.

This is behavior that we didn't expect. And it just, just works, which is really fascinating. So that's, that's a bit of a three minute demo of Rocky. I'm going to live demo Rocky once I go through how I built it. So the architecture. So we built this for a serverless conference.

So our constraints were: it needs to run in a managed environment, a serverless environment, and be cool. So we started off with running it in a container. So Minecraft, you can run in a container. And you can also run Mineflayer, which is the bot framework that works with Minecraft.

And those two run side by side nicely in the container. We can't run Mineflayer on Lambda, which is serverless, because we need state, right? So the state is stored in Mineflayer. We started off with LangChain on Lambda to build this. Like most people, when they do start building agents, they probably start with LangChain.

I'm not a big Python developer. I am an engineer, but not really with Python. But we got it working to a state. But then as we added complexity and more and more actions and tools for Rocky to use, it got really, really complex. And then we also used Amazon SageMaker to host the LLM, which in this case, we were using the Cohere LLMs.

And then we decided, okay, so that's not serverless enough. Let's use Agents for Amazon Bedrock, keep the Minecraft server and client, which is Mineflayer, on Amazon ECS, and build that agent up on Amazon Bedrock. And then for architecture. So what we're doing and what I'm demonstrating today, it's running on my local machine.

So both Minecraft and the Mineflayer client. And then it's calling out to Agents for Amazon Bedrock, and may the internet always be favorable over the next couple of minutes. So if you haven't come across Agents for Amazon Bedrock, let me do a quick overview. Again, AWS booth is there.

Come talk to me. I'll do a deep dive for you. But in a nutshell, Amazon Bedrock is all about facading one or more models. So what it does is it produces a common API that then you can use as an engineer to build out your application and add features.

So as well as Amazon models, we host Anthropic models, we host Cohere models, as I mentioned, Llama models, and a whole host of other models. So when building out Rocky now using Agents for Amazon Bedrock, you'll see that I used the Claude models. And I'll go through that as well.

Why I did that, in particular Claude Haiku, because it's fast. And so as well as hosting models, not hosting models, but providing a facade into models, everything, you can also build additions to that. And one of those additions is agents, as well as knowledge bases, where you can do RAG, and also guardrails and evals.

So all of these things are baked into Amazon Bedrock. And think of it more as sort of a managed agentic workflow, right? So you can manage RAG and manage agents as well. So it's all in one spot. So it also helps you with prompt creation. Obviously, as it's an agent, it orchestrates multiple tasks, and it also allows you to trace through the chain of thought of the agent.

So you can either do that in the console, or we'll spit it out to logs. And of course, it also, and what was very key for this demo, it has return of control. Okay, so that's the slides. Let's jump into what we've got. So let me see if I can just, instead of doing that.

Mirror. Nope, not that. Oh, dear. Still mirroring. Okay, there's always, so. Me and display. Extend the display. Me and display, I want to mirror. That's not, that's not letting me. Apologies. You got this. You got this. I got this. And mirror. Mirror. There we go. Yay. Okay, so. Okay, so this is, that's Mineflayer.

So it's all open source. Let's first of all start with the good stuff. So open source, Mineflayer, you can use. And then this is the, what we've built. Now I'll share this at the end, QR code, everything. Don't worry about it. So this is Amazon Bedrock. So this is the agents, and this is the, the console page.

So what it'll do, you can go down here into agents. If you haven't seen Bedrock before, it'll show you the model access and what models you've got access to. But I'm really talking about agents, which is down here in the left. So here I've got a Minecraft agent. Apologies if you can't see this too well.

Let me see if I can make it a little bit better. You've got Minecraft, the Minecraft agent. So I've defined this in here. Now, for all the real engineers in the room, you don't have to build it by click ops. So you can build it all as infrastructure as code.

And if you check out the GitHub repo, you will see it all in CloudFormation and CDK. So once I've got into the agent, you can see that here I can select my model. So at the minute, this agent is using Claude, Claude 3 Haiku. And here is the prompt. This is the system prompt for that agent.

You're a playful, friendly and creative Minecraft agent called Rocky. And your goal is to entertain players and collaborate with them in a fun gaming experience. And it goes on. So this prompt we built over time. There's a bit of prompt engineering going on in this demo. And then if I go into the actions, you can see all of the actions.

So you notice all of the actions that are defined all have return of control because we want them to return their output. So we've got jump. We've got other actions. Dig. Dig, you'll see this is where the parameters come in. So I've got a depth and a width. So you remember when I specified a hole, I said two by two.

Also, if I say small hole, it will go as a one by one hole. So it'll infer that using the model. Also, there's the action. Is it raining? So all of these actions are defined in here. And also, I've got another action set that I'm going to share with you.
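
To make the return-of-control idea a little more concrete: the agent decides which action to call and with which parameters, then hands that decision back to the client to execute. The sketch below assumes a simplified, hypothetical payload shape (a `function` name plus a list of `name`/`value` parameters); the real Bedrock response carries more fields, and the handler names, defaults, and coordinate convention here are illustrative only.

```python
def parse_parameters(params: list[dict]) -> dict[str, int]:
    """Turn [{'name': 'depth', 'value': '2'}, ...] into {'depth': 2, ...}."""
    return {p["name"]: int(p["value"]) for p in params}


def dig(depth: int = 1, width: int = 1) -> list[tuple[int, int, int]]:
    """Return block offsets (x, y, z) to remove, relative to the bot's feet.

    Defaults give the one by one "small hole"; negative y means downwards.
    """
    return [
        (x, -y, z)
        for y in range(1, depth + 1)
        for x in range(width)
        for z in range(width)
    ]


def jump() -> str:
    return "jumped"


# Hypothetical mapping from action names to client-side handlers.
ACTIONS = {"dig": dig, "jump": jump}


def handle_return_of_control(invocation: dict) -> dict:
    """Run the action the agent selected and package the result to send back."""
    handler = ACTIONS[invocation["function"]]
    args = parse_parameters(invocation.get("parameters", []))
    return {"function": invocation["function"], "result": handler(**args)}


request = {
    "function": "dig",
    "parameters": [{"name": "depth", "value": "2"}, {"name": "width", "value": "2"}],
}
print(len(handle_return_of_control(request)["result"]))  # 8 blocks for a 2 x 2 hole, 2 deep
```

In this sketch the model only fills in `depth` and `width`; the digging logic itself stays in ordinary client code, where the bot's state lives.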

I'm going to demo is called Minecraft experimental, right? And what this does, this has one action called build, which is a very complex thing to do in Minecraft, believe it or not, because it's a 3D space. And so the action for build, all I do is build a structure.

And that returns back to the client. And I'll show you how that prompt's actually created when I do the build. So that's how it's built. And the actual source code itself is here. And the client is stopped. So I'm going to -- this is Rocky itself. So I'm going to start Rocky.

Going to start. And bot spawned. Going to go over to Minecraft. We've got running here. Back to game. Rocky's joined the game. There's Rocky. Let me just go set. Time. Time. It's time set. Time. Set. Noon. So you can see it better. Okay. Okay. So there's Rocky. So you can -- let's just test if Rocky's working.

So T, please come here. Rocky will make the first request. And there we go. So Rocky is working for us. So let's make sure that Rocky can do something. So we go, "Rocky, please, please dig a small, small hole." Okay. So Rocky will dig a small hole. So now I'm going to use the experimental feature, which is build.

What do you want Rocky to build? The Colosseum. What? The Colosseum? Okay. I can't even spell that. Something that I can spell, please. A double-decker couch. Okay. I like that. Please come here. So yesterday I built a rocket ship, which is quite interesting. But a double-decker couch. So let's get Rocky out of the hole.

So let's give it some space. Okay, Rocky. Please build -- can I just say couch? Double -- let's try it. Never do live. Double-decker couch. Please build double-decker couch. A. One. Just one. I'm just going to -- I'm just going to make it -- so what happens, Rocky? Then that prompt goes off.

And so if I look at the code, which I've lost. Yeah. So there's the prompt. So what we've done, we've said to Rocky, "You're Claude, an expert Minecraft builder created by Anthropic. When given a structure description, which will be the input, then output valid JSON." So this JSON is how Mineflayer builds objects.

So you give that as part of the prompt so that the model will understand how to build Minecraft objects. Then it goes, strictly adhere to the following rules. Because if we didn't do that, it goes bananas and builds just nonsense. And so all blocks are placed next to each other.

These are the blocks you can use. And it responds with what it thinks a couch looks like. So this is the XY coordinates. And so it started the build. So let's see what Rocky's doing. And there, Rocky has built a double-decker couch. Thank you very much. I'm going to sit on the couch.
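
Because the build prompt asks the model for JSON and then constrains it with rules, it can help to check the output before placing anything. The following is a minimal sketch of such a check, not the demo's code: the JSON shape (a list of objects with `x`, `y`, `z`, and `block`), the `ALLOWED_BLOCKS` set, and the reading of the placement rule as "every block touches the rest of the structure" are all assumptions.

```python
import json

# Hypothetical allow-list; the real prompt supplies its own list of blocks.
ALLOWED_BLOCKS = {"stone", "oak_planks", "glass"}

# The six face-adjacent neighbour offsets of a block.
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def validate_build(raw: str) -> list[dict]:
    """Parse model output and reject unknown blocks or disconnected pieces."""
    blocks = json.loads(raw)
    if not isinstance(blocks, list) or not blocks:
        raise ValueError("expected a non-empty JSON list of blocks")

    positions = set()
    for b in blocks:
        if b["block"] not in ALLOWED_BLOCKS:
            raise ValueError(f"block type not allowed: {b['block']}")
        positions.add((int(b["x"]), int(b["y"]), int(b["z"])))

    # Flood fill from one block; every other block must be reachable.
    start = next(iter(positions))
    seen = {start}
    stack = [start]
    while stack:
        x, y, z = stack.pop()
        for dx, dy, dz in NEIGHBOURS:
            nxt = (x + dx, y + dy, z + dz)
            if nxt in positions and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    if seen != positions:
        raise ValueError("structure has blocks that do not touch the rest")
    return blocks


couch = '[{"x": 0, "y": 0, "z": 0, "block": "oak_planks"}, {"x": 1, "y": 0, "z": 0, "block": "oak_planks"}]'
print(len(validate_build(couch)))
```

A check like this only catches structural problems; whether the result looks like a couch is still up to the model.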

So this obviously is the interpretation of what a double-decker couch looks like to Claude. And in the 3D space. And then it's been interpreted into XY coordinates. And Rocky has built it. So that is Rocky. Obviously we are here at the AWS booth all day. Could you please, I also put a QR code, I promised before.

Let's fly to that. So scan the QR code. Please fill in the session feedback form. Also, if you fill it in, I'll give you AWS credit codes for everybody who fills it in. And a link to the GitHub website. So please do that. And hopefully you enjoyed the session.

Hope you enjoyed hanging out with Rocky. And come ask me any questions at the booth when you can. Thank you very much. Thank you.