back to indexClaude plays Minecraft!

00:00:00.000 |
Today I'm going to show you how I built an agent to play Minecraft, right? So this talk, 00:00:22.440 |
in a nutshell, if I could sum it up, is all about agentic workflow. So I'm sure some of you here have 00:00:29.760 |
built agents, you're in this track, maybe some of you have not. So I'm going to start 00:00:35.040 |
with a bit of a level set on agent workflow and agentic workflow, and then I'll show you 00:00:41.520 |
how that maps to the agent we built with Minecraft. So this flow diagram is probably familiar to 00:00:50.720 |
some of you, where an agent will take inputs like chat or some unstructured data. It'll 00:00:59.520 |
consume that and then it will use a set of tools, one or many tools to satisfy the request 00:01:06.000 |
from the chat. So it can fulfill its destiny and its actions. And also it uses an LLM and it may 00:01:15.600 |
then orchestrate one or many times through the LLM, the tools, and then return a response. 00:01:22.800 |
Pretty well understood now, but very cool. So if we take that into a Minecraft world, where does the chat 00:01:32.320 |
come from? Well, if anybody here played Minecraft? A few people, a few people. Great. So if you've 00:01:39.280 |
played Minecraft, you'll know there's a chat function where you can interact with the world and other 00:01:45.360 |
people in the world. And so that's where the chat comes for this agent. The tools in Minecraft are many, 00:01:53.920 |
and that's why we built this. In Minecraft, you can do many things. You can build, you can dig, 00:02:01.040 |
you can farm, you can slay a pig if you so desire, but you can do a few things. 00:02:11.840 |
And so the agent itself becomes the bot. So we tell it via chat what we want it to do. It uses the tools 00:02:20.800 |
and hopefully fulfills our requests. Now, remembering that our requests are probably indeterminate 00:02:28.240 |
and the output is also indeterminate. So today I'm going to live demo what you should never do. So they 00:02:35.040 |
say you should never live demo with kids, animals, and an LLM. So I'm going to do that today. 00:02:40.640 |
The LLM itself, of course, is magic. It is. It's magic. And then the responses, the agent will 00:02:51.120 |
do one or many thoughts and then return the response to Minecraft. So that's the flow in a nutshell. 00:03:00.960 |
And of course, we have our friendly bot. And so our bot's name is Rocky, in short for Bedrock. And we'll 00:03:10.400 |
see what Rocky, if Rocky behaves today. And the tools we're going to use with Rocky and the actions that 00:03:18.160 |
those tools will jump on is jump, move to position, locate a player, and a few more actions that we've built. 00:03:28.080 |
And at this stage, I'll also mention that this demo that I'm going to show is open source. 00:03:33.520 |
And you can all have a play with it. At the end, all you need is a Minecraft client. 00:03:38.080 |
So this is Rocky. I'm going to give you a quick recording that I did yesterday at the booth. We're 00:03:45.120 |
at the AWS booth. If you want to come and chat to me, I'm not scary. I am from Australia, but I'm not 00:03:51.600 |
scary. So Rocky does a number of things. So this is what Rocky will look like. At the minute, it's 00:03:57.920 |
raining in Rocky land. Rocky is a very friendly bot. And so you can ask Rocky questions in the chat. I hope 00:04:04.080 |
you can all see that. So what's the weather? So Rocky being friendly will know my name and then also tell 00:04:12.720 |
you what the weather's like. So it's really in your area. You might want to take some shelter. 00:04:16.960 |
Rocky can also do some actions. So this is how we send the actions to the agent via chat. 00:04:23.200 |
So Rocky jumps for us. Also, Rocky can do other things. So I'm going to hide behind this 00:04:30.400 |
Rocky constructed building. And I'm going to ask Rocky, can he come and find me? Or can they come and find me? 00:04:36.880 |
Actually, Rocky is neither male nor female. Rocky says on my way. And there, there comes Rocky. 00:04:44.160 |
And Rocky's found us. But what else can Rocky do? So what else did I do here? Oh, it can find things 00:04:54.160 |
in the world. So Rocky here, I've asked it to find a pig. So Rocky knows based on the plugin that we're 00:05:02.560 |
using for Minecraft where everything is. And it says, yep, I found a pig. This is the location. 00:05:07.760 |
And I then ask it, go find, go, go to that pig. So off Rocky runs to the pig and finds the pig. 00:05:14.640 |
So we've made, we've demonstrated this at a lot of conferences. And the biggest thing we've seen 00:05:21.120 |
is people try to hit the pig. So I asked it to hit the pig. And there it goes. Rocky hits the pig. 00:05:28.800 |
You could do that multiple times. And lots of people have observed do. Don't know why. But hey, 00:05:33.120 |
human behavior is even more fascinating than LLMs. So now I'm going to get Rocky to come here. And I'll 00:05:40.880 |
probably finish with the dig action. So I'll ask Rocky to dig a hole. Notice when I say dig a hole, 00:05:50.080 |
there's a parameter that I add. So I think in this instance, yeah, a two by two hole. So this is a 00:05:56.560 |
parameter that I'm sending to the action. I'll show you how all that works when I do a bit of a deep 00:06:01.600 |
dive into the back end. So Rocky's digging a hole. And also Rocky can come find us out of that hole and 00:06:11.360 |
dig their way out of the hole. This is behavior that we didn't expect. And it just, just works, 00:06:17.680 |
which is really fascinating. So that's, that's a bit of a three minute demo of Rocky. I'm going to live 00:06:24.960 |
demo Rocky once I go through how I built it. So the architecture. So we built this for a serverless 00:06:31.360 |
conference. So our constraints where needs to run in a managed environment, a service environment, 00:06:39.600 |
and be cool. So we started off with running it in the container. So Minecraft, you can run in the 00:06:46.480 |
container. And you can also run mine flare, which is the bot framework that works with Minecraft. And 00:06:53.040 |
those two run side by side nicely on the container. We can't run mine flare on Lambda, which is serverless, 00:07:00.240 |
because we need state, right? So the state is stored in mine flare. 00:07:04.080 |
We started off with LangChain on Lambda to build this. Like most people, when they do start building 00:07:10.720 |
agents, they probably start with LangChain. I'm not a big Python developer. I am an engineer, but not 00:07:16.720 |
really with Python. But we got it working to a state. But then as we added complexity and more and more 00:07:23.360 |
actions and tools for Rocky to use, it got really, really complex. And then we also used 00:07:29.920 |
Amazon SageMaker to host the LLM, which in this case, we were using the Cohere LLMs. 00:07:36.240 |
And then we decided, okay, so that's not service enough. Let's use agents for Amazon Bedrock, 00:07:46.160 |
keep the Minecraft server and client, which is MindFlare on Amazon ECS, and build that agent up on Amazon Bedrock. 00:07:55.120 |
And then for architecture. So what we're doing and what I'm demonstrating today, 00:08:01.440 |
it's running on my local machine. So both Minecraft and the MindFlare client. And then it's into calling out to 00:08:10.240 |
agents for Amazon Bedrock and may the internet always be favorable over the next couple of minutes. 00:08:17.520 |
So if you haven't come across agents for Amazon Bedrock, let me do a quick 00:08:21.440 |
overview. Again, AWS booth is there. Come talk to me. I'll go a deep dive for you. 00:08:26.240 |
But in a nutshell, Amazon Bedrock is all about facading one or more models. So what it does is it produces a 00:08:35.120 |
common API that then you can use as an engineer to build out your application and add features. So 00:08:42.480 |
we host as well as Amazon models, we host Anthropic models, we host Cahere models, as I mentioned, 00:08:48.640 |
Lama models, and a whole host of other models. So when building out Rocky now using agents for Amazon 00:08:56.800 |
Bedrock, you'll see that I used the Claude models. And I'll go through that as well. 00:09:03.280 |
Why I did that, in particular Claude Haiku, because it's fast. And so as well as hosting models, not hosting models, 00:09:11.680 |
but providing a facade into models, everything, you can also build additions to that. And one of those additions 00:09:19.760 |
is agents, as well as knowledge bases, where you can do RAG, and also guardrails and evals. So all of these things are 00:09:27.280 |
baked into Amazon Bedrock. And it's think of it more as sort of a managed agentic workflow, right? So 00:09:35.840 |
you can manage RAG and manage agents as well. So it's all in one spot. So it also helps you with prompt 00:09:43.520 |
creation. It's obviously, as it's an agent, orchestrates multiple tasks, and also allows you to trace 00:09:49.840 |
through the chain of thought of the agent. So you can either do that in the console, or we'll spit it 00:09:54.560 |
out to logs. And of course, it also, and what was very key for this demo, it has return of control. 00:10:02.800 |
Okay, so that's the slides. Let's jump into what we've got. 00:10:06.720 |
So let me see if I can just, instead of doing that. 00:10:27.440 |
Oh, dear. Still mirroring. Okay, there's always, so. 00:10:37.680 |
Me and display. Extend the display. Me and display, I want to mirror. That's not, that's not letting me. 00:10:54.080 |
Apologies. You got this. You got this. I got this. 00:11:13.280 |
Okay, so this is, that's MindFlare. So it's all open source. Let's first of all start with the good stuff. 00:11:21.760 |
So open source, MindFlare, you can use. And then this is the, what we've built. Now I'll share this 00:11:29.840 |
at the end, QR code, everything. Don't worry about it. So this is Amazon Bedrock. So this is the agents, 00:11:35.600 |
and this is the, the console page. So what it'll do, you can go down here into agents. If you haven't seen 00:11:44.320 |
Bedrock before, it'll show you the model access and what models you've got access to. But I'm really 00:11:48.960 |
talking about agents, which is down here in the left. So here I've got a Minecraft agent. Apologies 00:11:53.760 |
if you can't see this too well. Let me see if I can make it a little bit better. You've got Minecraft, 00:11:59.280 |
the Minecraft agent. So I've defined this in here. Now, for all the real engineers in the room, 00:12:04.240 |
you don't have to build it by click ops. So you can build it all as infrastructure as code. And if you 00:12:08.640 |
check out the GitHub repo, you will see it all in CloudFormation and CDK. So once I've got into the agent, 00:12:14.800 |
you can see that here I can select my model. So at the minute, this agent is using Cloud, 00:12:20.000 |
Cloud3 Haiku. And here is the prompt. This is the system prompt for that agent. You're a playful, 00:12:25.200 |
friendly and creative Minecraft agent called Rocky. And your goal is to entertain players and collaborate 00:12:30.640 |
with them in a fun gaming experience. And it goes on. So this prompt we built over time. There's a bit 00:12:36.160 |
of prompt engineering going on in this demo. And then if I go into the actions, you can see all of 00:12:42.720 |
the actions. So you notice all of the actions that are defined all have returned control because we want 00:12:49.120 |
them to return their output. So we've got jump. We've got other actions. Dig. Dig, you'll see this is 00:12:58.000 |
where the parameters come in. So I've got a depth and a width. So you remember when I specified a hole, 00:13:03.360 |
I said two by two. Also, if I say small hole, it will go as a one by one hole. So I'll infer that 00:13:08.720 |
using the model. Also, there's the action. Is it raining? So all of these actions are defined in here. 00:13:16.560 |
And also, I've got another action set that I'm going to share with you. I'm going to demo is called 00:13:27.360 |
Minecraft experimental, right? And what this does, this has one action called build, which is a very 00:13:35.200 |
complex thing to do in Minecraft, believe it or not, because it's a 3D space. And so the action for build, 00:13:42.080 |
all I do is build a structure. And that returns back to the client. And I'll show you how that prompts 00:13:49.200 |
actually created when I do the build. So that's how it's built. And the actual source code itself is here. 00:13:56.960 |
And the client is stopped. So I'm going to -- this is Rocky itself. So I'm going to start Rocky. 00:14:03.920 |
Going to start. And bot spawned. Going to go over to Minecraft. We've got running here. 00:14:14.400 |
Back to game. Rocky's joined the game. There's Rocky. Let me just go set. Time. Time. It's time set. 00:14:26.640 |
Okay. Okay. So there's Rocky. So you can -- let's just test if Rocky's working. So T, please come here. 00:14:41.440 |
Rocky will make the first request. And there we go. So Rocky is working for us. So let's make sure that 00:14:48.880 |
Rocky can do something. So we go, "Rocky, please, please dig a small, small hole." Okay. So Rocky will dig a small hole. 00:15:02.880 |
So now I'm going to use the experimental feature, which is build. What do you want Rocky to build? 00:15:08.240 |
The Colosseum. What? The Colosseum? Okay. I can't even spell that. 00:15:27.920 |
Please come here. So yesterday I built a rocket ship, which is quite interesting. But a double-decker 00:15:35.200 |
couch. So let's get Rocky out of the hole. So let's give it some space. Okay, Rocky. 00:15:42.800 |
Please build -- can I just say couch? Double -- let's try it. 00:15:51.360 |
Never do live. Double-decker couch. Please build double-decker couch. 00:15:57.200 |
A. One. Just one. I'm just going to -- I'm just going to make it -- so what happens, Rocky? 00:16:03.760 |
Then that prompt goes off. And so if I look at the code, which I've lost. 00:16:09.040 |
Yeah. So there's the prompt. So what we've done, we've said to Rocky, "You're a Claude, 00:16:16.080 |
an expert Minecraft builder creating by Anthropic. When given a structure description, 00:16:21.120 |
which will be the input, then output valid JSON. So this JSON is how MindFlare builds objects. So you 00:16:27.760 |
give that as part of the prompt so that the model will understand how to build Minecraft objects. 00:16:32.240 |
Then you'll go strictly adhere to the following rules. Because if we didn't do that, it goes bananas 00:16:38.880 |
and builds just nonsense. And so all blocks are placed to each other. These are the blocks you can use. 00:16:44.960 |
And it responds with what it thinks a couch looks like. So this is the XY coordinates. And so it started 00:16:51.600 |
the build. So let's see what Rocky's doing. And there, Rocky has built a double-decker couch. Thank you 00:17:00.400 |
very much. I'm going to sit on the couch. So this obviously is the interpretation of what a double-decker 00:17:12.160 |
couch looks like to Claude. And in the 3D space. And then it's been interpreted into XY coordinates. 00:17:18.960 |
And Rocky has built it. So that is Rocky. Obviously we are here at the AWS booth all day. Could you please, 00:17:27.760 |
I also put a QR code, I promised before. Let's fly to that. 00:17:34.800 |
So scan the QR code. Please fill in the session feedback form. Also, if you fill it in, 00:17:42.560 |
I'll give you AWS credit codes for everybody who fills it in. And a link to the GitHub website. So 00:17:47.920 |
please do that. And hopefully you enjoyed the session. Hope hanging out with Rocky. And come ask me any 00:17:54.000 |
questions at the booth when you can. Thank you very much. Thank you.