back to index

Ship it! Building Production Ready Agents — Mike Chambers, AWS


Whisper Transcript | Transcript Only Page

00:00:00.040 | and yeah so my name is Mike Chambers I'm gonna pick you up a little bit there so I'm from
00:00:18.180 | Queensland in the eastern part of Australia but that's okay yeah very happy to be here so I'm a
00:00:23.940 | developer advocate for Amazon Web Services and I completely and utterly and
00:00:28.080 | only and totally specialize in generative AI used to be machine learning now it's
00:00:33.780 | generative AI I'll be talking about why this slide is up here in a moment any
00:00:37.860 | tabletop RPG players in the room there's got to be at least one or two there are
00:00:42.120 | some Randall you were saying you were before okay it doesn't matter if you
00:00:45.720 | don't get that but I will be showing some code and what I'm going to be showing is
00:00:49.560 | how to get the code and how to get into production so if you're a developer this
00:00:53.400 | is how you can get your code and get it to cloud scale if you're a leader then
00:00:56.940 | this is how you can get your developers to get their code and get it into cloud
00:01:00.660 | scale just a little bit about the kinds of things that I've done in the past and so
00:01:05.160 | I was incredibly fortunate a couple of years ago to work with dr. Dr. Andruin and
00:01:10.740 | some colleagues from AWS on this course this is the fundamentals of LLM's over
00:01:16.680 | 370,000 people have taken this course so far it was the first of its course of its kind I also
00:01:22.860 | got to play Transformers with Andrew which is cool and if you get the reference there
00:01:27.860 | I'm playing with Transformers okay and if you don't know who this is that's Optimus Prime
00:01:32.700 | okay I think I'm holding I can't what's the other one called anybody Megatron thank you yes yes that's
00:01:42.780 | right and so let me jump into it let me jump straight into it and some of you saw that code
00:01:48.600 | was coming and code is coming now everything I'm going to show is pre-recorded because Wi-Fi right
00:01:53.520 | we know that but everything that I'm showing also is in real time so what I want to show
00:01:58.620 | first of all is just something super super simple this is a single Python function a
00:02:03.360 | single Python file which contains an agent right and it's the agent I'm going to use for the
00:02:08.600 | demonstration moving forward this is where we get to dice rolling and TTRPG and the idea here is
00:02:15.480 | that I want to do a simple agent with a simple tool that's not getting the weather because we
00:02:19.740 | know how to do that so in this code here it doesn't use any frameworks or anything at all it's actually
00:02:25.740 | kind of inefficient but I have this one what we'd call a tool here this is my tool it's how to roll
00:02:32.760 | the dice essentially it's a random number generator but I'm calling it a dice roller and you can see
00:02:39.300 | here if you if you're familiar with this it's using a llama and it's using the the llama 3.1 8 billion
00:02:47.980 | parameter model super tiny just running on my laptop this is literally an agent running on my machine and so
00:02:54.960 | if we skip through here I've got some things which I've sort of collapsed down for brevity this is a
00:03:00.300 | simple agent it's an agentic flow we're going to talk more in detail about what agents are in a moment I
00:03:05.220 | have my system prompt where I tell it what tools it's got available I then also actually do include some
00:03:12.240 | examples as well the reason I include examples again it's a tiny little model and it just needs a
00:03:17.100 | little bit of a helping hand to understand how to use these tools for example to roll a d20 a d20 this
00:03:24.600 | is going to come up again in this session so I'll tell you now it's a 20 sided dice and if you want to get one we
00:03:30.120 | have them on the AWS stand come and see us afterwards you can have an IRL d20 but we're going to have an agent which
00:03:36.120 | will roll it for us so let's just open up the terminal I'm just going to run this locally so you can see the kinds of
00:03:42.660 | things that my agent is going to do and so I'm just running my Python code here and you can see it's
00:03:48.540 | waiting for my prompt so I'm going to wait for my input I'm going to tell it and okay this is some
00:03:53.400 | terminology here from gameplay and roll for initiative and add a dexterity modifier of five that means
00:04:00.120 | something to the few people that put their hands up in the room and but it basically means that what
00:04:05.100 | I've done is I passed in my natural language into the agent it's used its large language model and its
00:04:11.520 | language understanding to figure out what I meant by that it's then looked at the tools that it's got
00:04:16.740 | available the tool it has available is to roll the dice and then it's performed that action and then
00:04:21.960 | we've got an outcome here so it's rolled a d20 and it's come back with a random number of 10 it's added
00:04:28.320 | five to it I've got my output of 15 so this is a simple really simple example of what an agent and how an
00:04:36.480 | agent's structured and what an agent can do I know it's just a random number generator but I think
00:04:41.700 | it's pretty exciting so I want to take this code though which is all running on my machine at really low
00:04:47.880 | scale and I want to make this cloud scale I want to get this production ready so that's what I'm going to
00:04:54.540 | look at next so let's yes let's get back into slides just for one moment and I want to talk about the
00:05:01.560 | anatomy of an agent we've just looked at it running there but let's look at the anatomy of an agent and as
00:05:06.360 | we're doing this each one of these things we actually have to get into the cloud running at cloud scale so that this is the
00:05:13.260 | components that we need to scale and so the first thing I think it's probably the obvious thing is the
00:05:18.780 | model this is actually super simple and super easy for us to do as in they exist already so we can just go
00:05:24.600 | ahead and use those but this is our natural language understanding of course we need this to be able to run our
00:05:29.980 | agent and the next thing that I've got on my list this is my list there's probably other things you could add to
00:05:35.340 | this but I think this is a minimum list that you need for an agent you need a prompt so you need
00:05:39.840 | something to explain to the agent why it exists in the world and what kinds of things it can do give
00:05:45.300 | it a personality for example next a loop so this is the agentic loop so this is the ability for the agent to be
00:05:54.300 | able to think essentially so it it looks at the input and it can then process that input it then needs to go
00:06:01.020 | and use a tool but then it needs to evaluate whether that tool actually answered the question or not
00:06:05.280 | it needs to figure out if it needs to go and run another tool it needs to loop around it really is nothing much more than a
00:06:11.260 | while statement if while whatever with some strings flowing around a lot of this stuff is just string manipulation
00:06:17.260 | but we need that to be hosted and and cloud scale next up history this is a really important one actually and
00:06:25.440 | I've had a couple of deep conversations about this recently and when I say history and I talk about conversational history
00:06:31.120 | people think of that as like okay well I'm asking my agent to do something today and
00:06:35.220 | and tomorrow I want it to remember what I asked it yesterday it's actually a little bit more deeper than that the actual
00:06:41.200 | conversational history inside of an agent is is crucial to the running of the agent so when we ask it that question
00:06:47.200 | and such as rolling you know rolling for initiative or whatever it might be and that goes in and then it does some
00:06:54.200 | reasoning steps right it decides what it wants to do and then it goes and calls a tool and the results the stuff that we were talking about before in the loop
00:07:02.180 | but it needs to remember those things that it's done so that it's each next step it can do that within the context of what it's done before
00:07:11.180 | so history is and conversational history at a low level is actually really important
00:07:17.160 | and then finally I'm gonna have tools here so these are of course how our agent has agency to be able to go and perform something in the outside world
00:07:27.140 | so another important part are there other components of agents probably yes yes there absolutely are but this I think is the base fundamental
00:07:35.120 | is you need these things in order to be able to have a minimal viable product I guess of an agent so I'm here from AWS let me tell you about what AWS can do to host these agents at cloud scale
00:07:48.120 | the first one I mentioned it before as being easy obviously anybody who works for a model of a laboratory would say actually making agents making models is actually kind of tricky
00:07:58.100 | but they've done it for us right so the models exist we've got anthropic we've got Amazon Nova model and this icon here represents Amazon bedrock
00:08:07.100 | so bedrock is a suite of capabilities that allow us to be able to build generative components into the applications that we're building
00:08:15.100 | so we can just take these components slot them in and build something we have models from a number of different leading providers Amazon like I said anthropic
00:08:24.380 | meta mistrail and AI 21 labs we have a number of different models available you can plug them into your system so that's the kind of the easy bit
00:08:32.720 | let's now build up the actual agent itself and so for that I have Amazon bedrock agents and Amazon bedrock agencies fully managed there's no infrastructure to manage with this
00:08:43.920 | you can just put your configuration in and it will be cloud scale
00:08:47.200 | so the answer to the question I'm here to answer is actually really straightforward we're just going to go and use this service
00:08:52.460 | now inside of that we have the configuration that we're going to have for our agent
00:08:57.540 | now I have put here instruction instruction is the terminology we use this is again the personality of the agent it's a bit like the prompt
00:09:06.380 | it's not quite the prompt because it gets combined with a prompt template to be the actual prompt but this essentially for all intents and purposes is our prompt
00:09:14.920 | that prompt template by the way if you're interested you can absolutely go and edit that if you want but you get delivered a default prompt template with the model that you've selected
00:09:24.460 | and then we need to have all of those things I talked about within the loop and the history of conversational history it's all taken care of so that's all just inside of the service but then from our configuration standpoint the next thing is the action group
00:09:38.860 | so this is where we connect into tools so an action group is a collection of tools that's what you can see that as I've got a couple of different action groups here just to show that you can have more than one
00:09:48.860 | we only need one we only need one because I only have one dice rolling tool and and you see in the middle here we've got lambda so hopefully you're familiar with lambda it's a function as a service it scales super well it handles that scaling for you it's just the way the service works and it's a perfect place to host these kinds of tools and therefore your tools can do anything that lambda can do which is I mean I'd be interested if you can find a use case that lambda can't do there are some but any
00:10:18.840 | anything anything anything that your code can do in there your tool can do including reaching out to the outside world reaching to other AWS services sending an email launching a rocket whatever it is that you want to do actually I'd really like to see launching a rocket so I say that from time to time we need to go do that we need to build that so I've whistle stop toward through those components the important things to remember there are the action groups really because that's maybe terminology you haven't come across before and so now I'm going
00:10:48.820 | we're going to dive into the console and we're going to build this again this is all real time I'm advancing it forward with my clicker right but but in between those things it's all real time there are some people in a room that are now saying well hang on a minute you were saying ship this to production I'm not going to ship to production with click ops that's awful and you're right it is and so all the things that I'm going to show you are all possible as well with Terraform or Pulumi or CloudFormation or SDK or SAM or whatever infrastructure is code framework that you want to use
00:11:18.800 | so bear that in mind but I'm showing you in the console because it's easier to look at all right so what I need to do is just I'm at the Amazon bedrock part of the AWS console page I've gone down to agents on the bottom of the left or the down on the left hand side and this next little part is going to be very bouncy ball kind of easy stuff I'm going to click create an agent
00:11:41.800 | okay so then I get to name it and I get to give it a description so that six months from now when I go into my console I'm like what was that agent for is that oh that's what that agents for so it's always a good idea to use descriptions in anything you do inside of AWS so I'm saying it's an agent to help me play tabletop RPG
00:11:59.800 | if those are the questions you have by the way we definitely go to the booth afterwards we can talk a lot about this kind of stuff so I've got that set up and I scroll down you remember one of the pieces of configuration that we needed was to connect it to our model and so I get to choose models here I've got the Amazon Nova models
00:12:17.800 | models I'm going to select the anthropic haiku 3.5 3.5 haiku it's a small it's a capable model it's fast it's probably more than we need for this very simplistic agent but it's a good one and so here I'm adding in those instructions if you remember before the instructions get combined with the template to become the actual prompt so you're a games master who can help me play tabletop RPG games that's probably what you're going to take away from this right everybody's going to go away and start playing
00:12:47.780 | these games and so I've defined what I want my agent to be so I click Save so that I've got it ready to go and the next thing I have to do is action groups so once we've done that yes we go we scroll down to action groups and this is where I'm going to connect in my logic with the code execution that we were talking about before so let's click add guess what it wants a name and a description for familiar territory here and so we can give it this description now these descriptions are red
00:13:17.760 | by the large language model so that we can figure out what this tool is for and this is sort of this new paradigm of programming right so in terms of definitions and schemas for how these things work we actually write in natural language in here so that LLM can read it and figure out if that's what they want to do or what that's what they want to use so we've got our action group as a whole now let's go and add in the actual code
00:13:40.760 | code we have a few different options here and this option here is recommended one which is to use lambda which we saw in that diagram before also with this quick start here it'll set up the lambda function for you it'll do all the permissions for you and it will hook it up to the agent for you so it's a really quick way to get started definitely recommend using that
00:14:01.760 | and now we come down to define the actual tool itself that we want this is my dice roll or my roll dice I flip between the two this one's going to be called roll dice and again we're giving
00:14:13.760 | it a description this description is important because it tells the LLM what what this does and whether it wants
00:14:19.760 | to use it or not and so it's got roll the dice with a certain number of sides so we need a way to pass in those certain number of sides this was all in the code before as well
00:14:29.760 | and so if I scroll down here I get the parameters and I get to again just describe these parameters that I want just expanding this out because I'm zoomed in for the
00:14:37.760 | screen's sake and I get to add in my parameter and guess what I give it a name and I give it a description and again the description is something which is read by the large language
00:14:47.760 | model we're following the pattern here right so it knows what this is for and I'm going to set this up as an integer value because I only have a
00:14:55.760 | an integer number of sides on any dice and I'm going to say it's required I need this to be there I need to tell the model that if you're using this
00:15:03.760 | tool you must provide me with an integer for that okay I know I'm going fast but I'm respectful of your time and I want to show you this
00:15:11.760 | working so let's click create again this is real time it's going to go ahead and build that lambda function for me integrate the permissions for me and have everything set up and so once I save this and scroll down
00:15:23.760 | I can get back to my action group dive into my action group again and what I have inside of there is now a link to the lambda function
00:15:31.760 | so codes coming back and we just need to put this logic into our lambda function so it can work for us now I'm going to do a little bit of quick and dirty here I'm going to take the boilerplate that it provided for me and I'm just going to whack in some function to allow us to do this dice roll
00:15:45.760 | and so here's my Python inside my lambda function so this is just inside the console you'll notice that as I'm coding here Amazon Q developer is actually built into the console into this code editor and it'll be making suggestions for me in terms of what code I might want to put in so I'm going to import random which of course is necessary if you're familiar with lambda already then you'll know that you get this event that comes in and that event sort of triggers how this thing is going to work so I'm going to import random which of course is necessary if you're familiar with lambda already then you'll know that you get this event that comes in and that event sort of triggers how this thing is going to work
00:16:13.760 | and we have here a function which is passed in so it's telling us which function we want to use and it sends in some parameters as well so let's rattle through putting some code in I'm going to say okay well if the function that we're calling is roll dice then I want to basically go and grab the number of sides generate my random number and off I go so with it being real time you have to wait for me to actually type which is the way it is and you also have to wait for me
00:16:43.740 | so I need to paste this one because I'm less good at doing this one by hand but then it comes and we're nearly there so the next thing I need to do is just format my response so my response body here is going to have my random number generated in this particular case Q developer inside of this IDE knows exactly what I want to do by this point because I know you you want to play RPG so it's written the code for me so I can just tab select
00:17:13.720 | tab complete on that and then with a little bit of tidy up just so that it works with the boilerplate that was there just going to indent that and then I can go ahead and click the deploy button
00:17:23.920 | and that's all you need to do so that's all you need to do everything should be ready to go so if I go back to my agent I'm going to have to prepare my agent I just want to talk through this just for one moment and this is a production ready environment right so we have the agent and then we have alias IDs in here we've got the whole software development lifecycle
00:17:53.900 | thing going on so you can have different aliases as you're publishing out your agents but once we've prepared that we should be able to test it over on the right hand side and so let's ask it the question I'm going to be a little bit more flowery language in this case because we want to stretch the model a tiny bit we're not really stretching anything at all and as soon as we've got this and I figured out how to spell initiative
00:18:18.580 | it will roll the dice and it will roll the dice for us and as you might expect it will roll the dice for us and as you might expect it's going to work but what's happening here is that this agent is now fully hosted in a fully managed environment it's going to work for us at cloud scale and our answer comes back with a quick 15.
00:18:34.580 | I just want to go and wrap this up then and if you're interested in learning more about this so I mentioned before about the courses we have on deep learning AI these courses are totally free you get a free AWS environment to be able to play around with Amazon bedrock agents completely risk free I'll put this QR code back up in a moment as well thank you so much so please come and talk to me following my prompt template if you want to talk about anything more than just this at cloud scale MCP servers
00:19:04.560 | at cloud scale I definitely want to talk to you if you want to talk about our new open source SDK for developing model first agents this is the stuff I hadn't got time to fit into this presentation if you want an in real life D20 then come and join me on the AWS stand in the expo hall if you want to talk about anything else then thank you so much for your time and I will let you get to lunch
00:19:34.540 | Thank you.