
New LangChain XML Agents


Chapters

0:00 LangChain v1 XML Agents
0:36 LangChain Agent Types
2:13 LangChain Python Setup
2:43 AI ArXiv 2 Dataset
4:34 Building Index with Cohere and Pinecone
8:37 Building a LangChain XML Agent
15:19 Giving our Agent Conversation Memory
19:21 XML Agents Conclusion

Transcript

Today we're going to be taking a look at one of the alternative agent types you can use in LangChain — specifically one you'd want to use with Anthropic LLMs. It's called the XML agent, and we're going to see how to use it with some simple tools.
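As a preview, here's a sketch of the XML tool-calling format this agent type emits. The tag names come from the LangChain docs; the tool name and contents are illustrative, not from the notebook:

```xml
<tool>arxiv_search</tool>
<tool_input>llama 2 red teaming</tool_input>
<observation>...text returned by the tool goes here...</observation>
<final_answer>Yes, red teaming was performed on Llama 2 ...</final_answer>
```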

We're going to be adding in a RAG pipeline using Pinecone serverless, and we're also going to be using Cohere's Embed v3 embeddings — so three relatively new models and services, which should be pretty interesting, and we'll see what we get.

We'll start on the agent types page in the LangChain docs. If we come down here we can see some information: you have the OpenAI agent types, and then we have XML, which is the one we're going to focus on. It literally says to use it if you're using Anthropic models, or other models good at XML. The docs have an example of the format, and you can see that the XML agent literally uses XML — HTML-like tags. You have the tool name (that's like the action you'd get in the JSON you'd pass through a ReAct agent), you have the tool input (that's the action input), and then the observation, which is the response the agent gets back. Then it must answer by emitting the final answer inside final answer tags. It's a little more compact than the ReAct approach, and with models that have been trained to use this format it's going to work better and be more reliable, which is important. So it's a good thing to use, especially if you are using Anthropic models.

For the example notebook, I'm in the Pinecone examples repo, under learn/generation/langchain. I've started a new directory for the v1 stuff, because we're using LangChain v1 here — another new thing. We go to xml-agents and open it in Colab. Once connected, we can start going through it. These are the versions we're using: langchain, langchain-community, langchainhub (where we'll get the prompt for this model), anthropic (as mentioned), cohere, the Pinecone client, and Hugging Face Datasets. A few libraries, but all pretty lightweight. Okay, cool — I think this is probably fine.

And here we are — another new thing: the AI ArXiv 2 dataset. If you've been watching a few of my videos, you'll have seen me using the original AI ArXiv dataset; this is a new one. If you come over here you can see what it looks like — there's a lot more data in there now, and it's still growing; I literally have a process running right now pulling in more arXiv papers. The text is also a lot cleaner: it's been processed using the YOLOX model via the unstructured library, which is why it's taking so long — it's been processing for about a week — but the quality is much better, which is great.

So let's start. We'll get a little warning here that we can ignore: Hugging Face basically just wants us to use an access token, but we don't need one for this dataset (sorry, Hugging Face). While we're waiting for the download, we can go get our first API key — we need a couple. Go over to Cohere at dashboard.cohere.com, go to API Keys, create a key, and copy it; we'll get to where you enter it in a moment.

We've downloaded just 20,000 rows here — we don't need tons for this example, and downloading the full dataset would cost more and take a while, so I'm just taking the first 20,000. Then we take the Cohere API key we just got and enter it in this nice little text box. Then we can initialize the Cohere embeddings we'll be using — the embed-english-v3.0 model — and run that.

Next, the Pinecone API key. We go to app.pinecone.io and land on the default project. I already have my xml-agents example index there, and I'm going to leave it so I don't need to wait for everything to be recreated, but mainly I want to go to API Keys, copy the key, run this cell, and enter it. Cool, that looks good.

I am using Pinecone serverless here, and I'd recommend doing the same. As I recall, you currently get $100 of free credit — obviously I don't know when you're watching this — and I know that a free tier for serverless is coming very soon. And if you ever do end up paying (with $100 of credit, probably not for a while), it's crazy cheap anyway.

This is how we create an embedding — maybe I should have put this further up, but fine. Using the Cohere model, we embed documents (or here, I'm just embedding "hello"), and you get a vector out whose length is the dimensionality of the Cohere embedding model. The reason I'm showing you that is because we need that dimension just here when we're initializing our index. We pass in our serverless spec — if you wanted to use pods, you'd swap that for a pod spec — and this is the index name you saw in my dashboard a moment ago, so we can run that, with the metric.

Interestingly, with the Cohere models — the Embed v3 models, anyway — you can use euclidean, cosine, or dot product, and apparently they all give the same similarity, which is kind of cool. I don't know exactly how that's possible, but it's interesting.

Now, when you run this you should probably see zero for your total vector count — mine isn't zero because I already have the index. After that, you'd populate your index like this. For the IDs we can actually use the unique IDs the dataset now includes — so yeah, just a little quick fix there.
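On that point about euclidean, cosine, and dot product all agreeing: the likely explanation (my assumption, not stated in the video) is that the Embed v3 vectors are unit-normalized. For unit vectors, dot product equals cosine similarity, and squared euclidean distance is 2 − 2·(dot product), so all three metrics produce the same ranking. A quick stdlib-only sketch with random vectors standing in for real embeddings:

```python
import math
import random

def cosine(a, b):
    dot_ab = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot_ab / (norm_a * norm_b)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def unit(v):
    # Normalize to length 1, as the embedding model (presumably) does
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

random.seed(42)
query = unit([random.gauss(0, 1) for _ in range(16)])
docs = [unit([random.gauss(0, 1) for _ in range(16)]) for _ in range(10)]

# Rank documents under each metric (higher similarity / lower distance first)
by_cos = sorted(range(len(docs)), key=lambda i: -cosine(query, docs[i]))
by_dot = sorted(range(len(docs)), key=lambda i: -dot(query, docs[i]))
by_euc = sorted(range(len(docs)), key=lambda i: euclidean(query, docs[i]))

assert by_cos == by_dot == by_euc  # identical rankings for unit vectors
```

So the choice of metric shouldn't matter for retrieval quality here, only (possibly) for how the raw scores are reported.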
Run that — last time I did it, it took about 11 minutes, so it takes a little time, but nothing too significant. While that's running, let's jump over to grabbing our Anthropic API key. This one's always a little hard to find, at least for me: you have to go to console.anthropic.com and create an account if you don't already have one. Once you're logged in, go to Get API Keys, create a new key, and copy it. We're not actually going to use the Anthropic key for a little while, but I wanted to initialize it now anyway.

Next we set up everything our agent needs, which is actually quite a few things: we need our tool (which is going to be our search), we need our prompt, we need some form of memory (because we're going to make a conversational agent here), and of course the LLM — Anthropic. So let's start with our tool.

The syntax here is slightly different to what I've shown in the past: we're now using the tool decorator. When we use it, we need to make sure we pass a description — this description is how the LLM decides whether to use this tool, another tool, or no tool, so we need something good here: descriptive but concise. Inside the tool, we take a string query, embed it using Cohere, search with Pinecone — making sure we return the metadata, because that contains the actual plain text — and then return a single string containing all of the results. We run that, pack the tool into a tools list, and then we'll need a few different formats of that tools list.

When our agent actually uses the tool, it will run it with an input query — let's say our query is "can you tell me about llama 2", since we'll be asking those questions again. And we get a good response. This is the output from our tool that our agent may see, depending on the question it asks.

Now we can define our XML agent. Coming down here, I describe a little of what I already covered about how the XML format works, and then we download a prompt — an XML agent conversational prompt. You can see it starts "you are a helpful assistant" and then tells the model about the different XML tags it should use, and so on — it's what I showed you before. You can also see that it allows a few inputs: the agent scratchpad (the agent's internal thoughts), the input (our query), and the tools. Another one we may use is the chat history, which would end up somewhere around here, so we'll need to add that as well.

Now we get to our Anthropic chat LLM. We initialize it and enter the API key we copied before. So we now have our tools, our LLM, and our prompt; there are a few more steps we need. One is a way of converting our intermediate steps into text in the correct format — this goes into the scratchpad, i.e. the internal thoughts of the model. It basically takes the tool that was chosen, the input to that tool, and what came back from that tool — all from the intermediate steps — and formats them into a nice string for the agent's LLM. We have another formatter here for the initial prompt, for how the agent decides between different tools: each tool name maps to a particular tool description, so we need that format too.

With that, I think we have pretty much everything. You can see the agent logic itself: the input going into the agent, then the tool descriptions being passed in, and then this part, which tells our LLM that when it sees the closing tool input tag or the closing final answer tag, it should stop. Then we use the XML agent output parser, which parses whatever the agent has generated into something usable. One thing I should note: you could technically remove the chat history here, but then — as you'd see later — the agent, when deciding which tools to use and what information to pass to them, would have no context of what happened before, so it wouldn't be a very good conversational agent. You basically do want it in there, otherwise you're going to run into issues.

So that's our agent logic defined, and now we need to define our agent executor. There are a few steps to this, I know — we're nearly there. We define the agent executor, pass in the agent logic we just defined, pass in the tools, and set verbose to true so we can see what's happening when we're running everything. Then we invoke the agent executor, passing our input and chat history — we don't have any chat history right now; we'll handle that soon. So we're just passing in the input "can you tell me about llama 2", and we'll see what happens.

We can see it uses the arxiv search tool with the input "llama 2" — it's a little weird because we're dropping that end token, but that's fine — and then this blue text is what's returned from the tool: the observation. Then it decides to generate a final answer, and that's what gets returned to us — you can see the output: "based on the information provided", and so on. It does get the answer — admittedly not a hard one to get — so that's good.

Now we'd like to add some conversational memory; right now we just have an empty chat history. We'll use a conversation buffer window memory — the super basic one. There are obviously many other ways of implementing memory, but this one's nice and easy. To start with, we'll create some chat history: we pass a message in, again starting with no chat history, and we just get a final answer straight away — it doesn't need to use the tool here. Now we need to extract that exchange and create some chat history with it, because right now that conversational memory isn't connected to our agent whatsoever. And we don't connect it directly: instead we use the add user message and add AI message methods to add everything manually — we'll wrap it all up into a nice little function soon. After we do that, we can see that our conversational memory now does have some history in it.

That's great, but what our XML agent actually needs is conversational memory as a string in this particular format — and we're not far off; it's not exactly hard to parse. So let's create a helper function. The input is the conversation buffer window memory object; we extract the messages, build a list of "Human" and "AI" entries depending on whether each is a human message or an AI message, and join them together into a single string in the format we need. If we print that, we see we get the format we need.

Cool — so let's wrap all that into yet another helper function called chat, which will help us maintain state in our agent. Run that, and let's continue the conversation: "can you tell me about llama 2". We see the typical steps here, and it outputs what's actually a pretty nice summary. Then I want to continue with "was any red teaming done". The reason I'm asking this is that it's a hard question — at least it has been in the past with the old dataset — so we should hopefully get something better now that the data is cleaner, and we're also using these different agents.

One thing you'll notice here is that I'm not saying "llama 2" — the context for this question relies on our conversational memory. And you can see that it works: the agent decides to use the arxiv search tool, and look at the query — it's "llama 2 red team". I didn't mention Llama 2 here; I mentioned it in the previous interaction, so the agent is looking at the previous conversational memory and pulling that into the query. And that's good, because we actually get some relevant context — you can see "risk score output by Llama 2 safety reward model on prompts", and so on. Coming down here, it says yes, red teaming was done on the Llama 2 models to evaluate risks such as generating malicious code, and so on — which is the best response I've had on this question so far, by a long shot; usually it's pretty bad. Maybe that was a dataset issue, but we're also using a good agent here.

So that's it for this little tutorial on XML agents with LangChain v1. We've gone through a few things using a few different models, which has been interesting and cool: we used Pinecone serverless, which is obviously kind of new and interesting; we used Anthropic for the LLM; and we used Cohere embeddings — and all of those together made something that works pretty well, in my opinion. So yeah, that's it for this video. I hope it's been useful and interesting, but for now I'll leave it there. Thank you very much for watching, and I will see you again in the next one. Bye!
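As a quick recap of the retrieval tool described in the walkthrough above, here's its rough shape as a stdlib-only sketch. The fake_embed_query and fake_index_query functions are stubs I've made up to stand in for the real Cohere embed_query and Pinecone index.query calls; in the notebook, the function is wrapped with LangChain's tool decorator, and the docstring doubles as the description the LLM uses for tool selection:

```python
def fake_embed_query(query: str) -> list:
    # Stub standing in for embed.embed_query(query),
    # which returns a 1024-dim Cohere Embed v3 vector
    return [0.0] * 1024

def fake_index_query(vector: list, top_k: int, include_metadata: bool) -> dict:
    # Stub standing in for index.query(...); real matches carry
    # the plain-text chunk in each match's metadata
    return {
        "matches": [
            {"metadata": {"text": "Llama 2 is a collection of pretrained LLMs..."}},
            {"metadata": {"text": "Red teaming probed for unsafe generations..."}},
        ][:top_k]
    }

def arxiv_search(query: str) -> str:
    """Use this tool when answering questions about AI, ML, or LLM research."""
    xq = fake_embed_query(query)
    res = fake_index_query(vector=xq, top_k=5, include_metadata=True)
    # Join the plain-text chunks into the single string the agent
    # receives as its observation
    return "\n---\n".join(m["metadata"]["text"] for m in res["matches"])
```

The key design point from the video: the tool returns one flat string, because that's what gets dropped into the observation tags of the agent's scratchpad.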
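And to recap the glue code: the walkthrough needed three small formatters — intermediate steps into the XML scratchpad, tools into a name-description listing, and memory into the chat-history string. A stdlib-only sketch follows; the AgentAction class and the tuple-based tools and messages are simplified stand-ins for LangChain's own objects, not the real API:

```python
from typing import List, Tuple

class AgentAction:
    # Minimal stand-in for LangChain's AgentAction
    def __init__(self, tool: str, tool_input: str):
        self.tool = tool
        self.tool_input = tool_input

def convert_intermediate_steps(steps: List[Tuple[AgentAction, str]]) -> str:
    """Format (action, observation) pairs into the XML scratchpad string."""
    log = ""
    for action, observation in steps:
        log += (
            f"<tool>{action.tool}</tool>"
            f"<tool_input>{action.tool_input}</tool_input>"
            f"<observation>{observation}</observation>"
        )
    return log

def convert_tools(tools: List[Tuple[str, str]]) -> str:
    """Render (name, description) pairs for the tools slot in the prompt."""
    return "\n".join(f"{name}: {description}" for name, description in tools)

def memory_to_str(messages: List[Tuple[str, str]]) -> str:
    """Flatten ("human"/"ai", text) pairs into the chat-history string."""
    return "\n".join(
        f"{'Human' if role == 'human' else 'AI'}: {text}"
        for role, text in messages
    )

step = (AgentAction("arxiv_search", "llama 2"), "Llama 2 is a family of LLMs...")
scratchpad = convert_intermediate_steps([step])
history = memory_to_str([("human", "hi"), ("ai", "Hello! How can I help?")])
```

Everything the agent "remembers" — past tool calls and past turns — is just strings spliced into the prompt in this XML-ish shape, which is why these little formatters are all the memory plumbing the agent needs.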