Local LangGraph Agents with Llama 3.1 + Ollama

Today we're going to take a look at how we can build our own local agents using LangGraph with Ollama. LangGraph, if you don't know it, is an open source library from LangChain that allows us to build agents within a graph-like structure, and it is currently my preferred way of building agents, whether they run on OpenAI, locally, or wherever. Ollama is another open source project that allows us to run LLMs locally very easily. We're going to be running Llama 3.1, the 8 billion parameter model, which is tiny, and yet I get reasonable responses and reasonable actions coming from my agent, which is actually pretty cool.

So with that, let's jump straight into it. We're going to be running everything locally here. This is slightly different to what I usually do: we usually go through a Colab notebook, and I'm sure you probably could run this in a Colab notebook, but I want to show you how to run this locally on Mac. Generally speaking, Ollama works well with Mac because we have the unified memory, which means we can run fairly large models quite easily and actually really fast as well.

So the first thing we want to do is install Ollama. Rather than going to the URL, just search for "install Ollama". Okay, and we have macOS here: Download for macOS, and we just download it. I already have mine downloaded.

You can see the little icon up here, so I'm not going to download it again. But once you have downloaded it, you want to run this command in your terminal: ollama pull llama... actually, 3.1 is the one I'm using, not this one. So llama3.1:8b.

So I'm going to paste that in here: ollama pull llama3.1:8b. Right, so that has downloaded the model. It will take a little while; I think it just updated the model on my side here, as I already had it installed. Then we're going to want to set up a Python environment.

I mean, you don't have to do this; it's up to you, but I would recommend it. The reason I say that is because we're going to be working from this repository here. This is the examples repo; there'll be a link to that in the comments below. So I would, one, just git clone that. You can copy the URL here, come here, and just run git clone with it. That will download the entire repo for you; there's quite a lot in there, to be fair, but we only actually need this one bit. Then you want to navigate into this directory, so let me show you the full path to where I am from the root of the repo.

So I want to go into learn/generation/langchain/langgraph/02-ollama-langgraph-agent. Okay, there we are. Now in here, do ls and you can see that we have this poetry.lock and pyproject.toml file. All the prerequisites that we need to run this little project are contained within here. The way that we install these is, if we go to our README... well, actually, we need to set up our Python environment first, so I'm going to do that.

I'm using conda here. I know people have opinions on which package manager they like to use; I've been using this forever, so I just stick with it. So that is exactly how I would create my environment there. Okay, you can now see that I'm in my new Python environment. Then I'm going to go ahead and install Poetry.

So pip install poetry, then we're going to do poetry install. Okay, you probably will not see this when you run it, but just in case, you can run poetry lock --no-update to fix that and then run poetry install again. Okay, so you can see everything has just been installed, and with that we can go over and start running the notebook.

Now the notebook is here, again in the same little directory. You can ignore this; if you are going with Colab you could install everything with this, I think, but we are not. What we need to do first in VS Code (actually, let me close the sidebar and zoom in) is Select Kernel, then select another Python environment. If you don't see your environment on this list, you might have to restart VS Code; I'm actually in Cursor, but same thing. Now I'm going to try again, and it's picked up the environment that I would like to use.

So now we can go ahead and start running everything. The first thing is that we're going to be using the Reddit API to get some suggestions for our agent. What our agent is going to be doing is recommending pizza for me in Rome, and the way it's going to do that is by searching for something; in my case, it's going to be searching for good pizza in Rome. If you ask something like "What is the best pizza in Rome?", it's going to decide, okay, I'm going to go use Reddit to search (or the search tool; that's what we tell it, we don't necessarily tell it Reddit). It's going to return that information from a few different submissions on Reddit, find what people are saying and what their recommendations are, and then it's going to tell me where I should go.

So to use the Reddit API we do need to sign up for it. You get a certain amount of free calls every hour, I think, so I don't need to pay for anything, which is great. You can follow this video; I will again leave a link to it in the comments. Or you don't necessarily need to follow that. Instead, this is what I always do when I've done something in the past and I've kind of forgotten how to do it: I search "Reddit API" and then I just add my name on to the end, and because I created this in the past, I'm like, okay, this seems to make sense to me, which I suppose is a positive of doing all these videos and articles.

So come down here. This is quite old, already three and a half years ago now, so it's been a while, but I think everything is still up to date. You can go to app preferences here, so it's reddit.com/prefs/apps.

I copy that, put it in here, and we go here. I have these two; I don't even remember what they were, so that's interesting. And then I created this one... I think... no, sorry, I created this one just now for this video. So if I go into here, I see all this information, which I'm going to use. I have my secret key; you can try and use this as well if you like, see what happens.

And I have my app key or ID, I'm not sure exactly what it's called... client ID, okay. And you just put them into here. So here we're initializing our Reddit instance or client. For the user agent, I think you just put the name of your app here. It has been quite a while.

I should rewatch my video on it, but in any case, that's what you do. Then if I scroll down a little bit more, I think here we get into the actual code. Now, the code part is the part that I'm not going to follow, because there I wanted to do this using requests rather than using the PRAW library, which is what I would probably recommend doing.

It's a lot easier, and it's what we're going to use here. So you see we're importing PRAW, which is the Python Reddit API Wrapper. What we're going to do is basically gather a load of data. For each submission that we search for and find for a particular question, we're going to get the title of that submission, the description (so I think that's like a first comment), and then the more top-rated comments that were within the thread of that submission. Then I'm going to define this string method here, which is basically going to get called when, let's say this is my object, we just do str() on it, and that will output a more LLM-friendly representation of the information there.
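As a rough illustration of that container, here is a minimal sketch. It uses a standard-library dataclass so it runs on its own; the class name `Rec`, the field names, and the exact text layout are assumptions, not necessarily what the notebook uses.

```python
from dataclasses import dataclass, field


@dataclass
class Rec:
    """One Reddit submission plus a few of its comments (hypothetical shape)."""
    title: str
    description: str
    comments: list[str] = field(default_factory=list)

    def __str__(self) -> str:
        # Render a compact, LLM-friendly text block for use in a prompt.
        comment_lines = "\n".join(f"- {c}" for c in self.comments)
        return (
            f"Title: {self.title}\n"
            f"Description: {self.description}\n"
            f"Comments:\n{comment_lines}"
        )


rec = Rec(
    title="Best pizza in Rome?",
    description="Looking for recommendations.",
    comments=["Try the place near the station (upvotes: 25)"],
)
print(str(rec))
```

Calling `str()` on a list of these objects and joining the results gives one text block that can be handed straight to the model.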

Okay, so I'm just defining that so that it keeps things a little cleaner when we're building everything here. Otherwise I'm using dictionaries and whatever else, and that's fine, nothing wrong with that, but it's a little bit... not clean, or not organized is probably the right way to put it. So let me go through this a little bit. I don't want to focus too much on the Reddit API stuff here, because it's not really what we're here for, but just very quickly: with the subreddits, we're looking through all of them, and then we're searching across Reddit for "best pizza in Rome".

All right, so we initialize that list, we go through, and we get the relevant information. We get all the comments, and we include the upvotes; I thought this was important. We also filter by the number of upvotes that the comments have. The logic here could be better. We could say, okay, these are the top-rated comments here, but I don't actually go with the top comments; I'm just going for three of them that have at least 20 upvotes, which works, but it could be better. Anyway, it doesn't matter; this is just a quick example. So you can see we run this and it goes through, like, okay, I found these submissions.

This one is Rome, yep. "Since pizza is an American food"... yeah, there are a lot of Italians here that would not be very happy with that. We can then see what we got out from that, so I'm using the string method here to get the recommendations.
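The comment filter described above can be sketched in a few lines. This is an assumption-laden illustration: comments are modelled as plain `(body, upvotes)` tuples rather than PRAW objects, and both function names are hypothetical. The first function takes the first three comments that clear the threshold; the second is the "could be better" variant that actually ranks by upvotes.

```python
def pick_comments(comments, min_upvotes=20, limit=3):
    """Return the first `limit` comments with at least `min_upvotes`, in thread order."""
    picked = []
    for body, ups in comments:
        if ups >= min_upvotes:
            picked.append(f"{body} (upvotes: {ups})")
        if len(picked) == limit:
            break
    return picked


def pick_top_comments(comments, min_upvotes=20, limit=3):
    """Alternative: the `limit` highest-voted comments that clear the threshold."""
    eligible = [c for c in comments if c[1] >= min_upvotes]
    eligible.sort(key=lambda c: c[1], reverse=True)
    return [f"{body} (upvotes: {ups})" for body, ups in eligible[:limit]]


sample = [("a", 5), ("b", 21), ("c", 300), ("d", 20), ("e", 99)]
print(pick_comments(sample))
print(pick_top_comments(sample))
```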

You can see title, description: "I was a little disappointed"... no... "after pasta and gelato", nice. And yeah, so we have some recommendations here that they are coming up with. This guy, "Sisyphus rock", cool. We have another one here: "go to Naples", angering a lot of Romans right now.

And yeah, we have some other ones as well. Cool. Okay, so that's what we're getting out from that tool. What we're going to do is just wrap all of that in a function here; that's all I'm doing. So given a particular query, I'm going to rerun all of what we just did. Again, I don't want to go through all this too much.

This is all, you know, setup API stuff; it's not really the agent itself. Cool, so we have all of that. Now, our graph: it might help if I visualize this a little bit. Okay, so we have our Reddit search API tool here; I'm just going to call it search. Then we're also going to have... if you've watched previous videos from me on LangGraph you will have seen this pattern quite a lot: rather than allowing the agent to return an answer directly, I like to use a structured output format, and the way that I do that is using a tool. Well, usually using a tool; here we're not technically using a tool, we're using the JSON output format.

I'll talk a bit more about the difference there soon, but you can think of this as basically using a tool. So we have these two tools (technically kind of not really, but it's fine). And then we have our LLM, which I usually call the Oracle, thanks to... I think they called it the oracle in some LangChain documentation a long time ago, and I just like the name, so I'm sticking with it. The Oracle is the decision maker, right? It makes a decision based on what is going on.

So we're going to have our query coming in. Our user query comes in, then the Oracle is like, okay, based on this query, what do I need to do? Could I just answer the user directly? If I'm just saying "Hello, how are you?" or whatever else, that's just small talk. Why would you use the search tool? There's no need. So instead the Oracle should be able to go directly to the final answer tool. We do give it that option, and if it goes to the final answer tool, it's going to output a structured output format, which is going to look a little bit like this: the answer, which is a natural language answer, and then some other parameters.

I think it's phone number, so a phone number for the restaurant if the agent has seen that within the data. Honestly, using Reddit comments, it probably isn't going to come up with that, but we can try. And address, so a street address. So that will be the output format, but of course the phone number and address don't need to be output every time; they're more like optional parameters.
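To make that output format concrete, here is a minimal sketch of a final answer function with one required field and two optional ones. The parameter names follow the description above; the `None` defaults and the dictionary return type are assumptions.

```python
from typing import Optional


def final_answer(
    answer: str,
    phone_number: Optional[str] = None,
    address: Optional[str] = None,
) -> dict:
    """Return the structured answer; only `answer` is required."""
    return {
        "answer": answer,
        "phone_number": phone_number,
        "address": address,
    }


# Small talk: no restaurant details to report, so the optional fields stay empty.
out = final_answer(answer="Hello! What kind of cuisine are you in the mood for?")
print(out)
```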

I think in the prompting we just tell the agent to keep those empty if it doesn't know, but the answer it should provide every time. Now, on the other hand, if I ask, "Okay, tell me where to find the best pizza in Rome", in that case the Oracle will hopefully use the search tool.

So when it uses the search tool, we go here. It will get some information, and then it will go back with that new information to the Oracle. It has to go back, so that's why I'm making this line a solid line, whereas these lines are dotted, because these are optional: it could go to final answer or it could go to search. Once it goes to search, it has to go back to the Oracle component. Then the Oracle component, given this new information, is like, okay, now what do I need to do?

Ideally, it should just go to the final answer every single time. This is, I think, quite a simple agent; it is really not complex whatsoever. But again, we're using a tiny LLM for this. Llama 3.1 is very good, but we're using the 8 billion parameter version of that, which is tiny.

So honestly, the fact that this works at all is kind of surprising to me, but pretty cool. And it's relatively reliable as well. There's the odd hallucination, there's the odd going straight to final answer when it should go to search, but I don't see those issues all that often.

So it is really not too bad. Okay, cool. Now that I've explained that, we can jump back into this. We have our final answer tool that we are going to be using. We initialize that; all this is doing is formatting the output there.

It's not actually doing anything: the input to this will literally be the same as the output, so there's nothing going on there, really. Then we have our search function and we have our final answer function. Note that I'm not using LangChain tools here, and there is a reason for that: basically, we're not using all of the LangGraph or LangChain functions directly here.

We're actually going direct to Ollama. That's because, honestly, I found the Ollama implementation via LangChain to be lacking in some places, particularly with the tool calling (which we're not actually using anymore); tool calling I couldn't get to work whatsoever. So because of that I switched to using Ollama directly and just stuck with it, because honestly there's not really much need to use the wrapper from LangChain for Ollama, in my opinion.

I kind of prefer doing it this way. So we initialize the agent state. I'm not going to go into too much detail on what all of these parts are, because I covered all this, I think, pretty well in my previous video on LangGraph, so I would just recommend having a look at that.

Again, I'll make sure there's a link to that in the description and in the comments if you are interested in more. But basically this is an object that is persisted across every step of our agent graph. The LLM, as I mentioned before, is the Oracle.

It's our decision maker, so we're just setting it up. We have our system prompt here: "You are the Oracle, the AI decision maker. You are to provide the user the best possible restaurant recommendation, including key information about why they should consider visiting or ordering", and so on. I mention here returning the address, phone number, and website. Now this bit is important, because the Ollama tool calling at the moment... it works, but you can't force tool calling. And I think because you can't force it, I found the tool calling to be really hit and miss, especially when you start adding multiple tools, and even more so when you start adding multiple steps to the agent, where it can use one tool and then another tool and another tool.

So for example, in that agentic flow that I showed you, where it uses a search tool and then a final answer tool, I could not get Ollama working where it would use both one after the other. So in the end I had to switch back to JSON formatting, and with JSON formatting it works really well. You just need to make sure that you prompt it within your system prompt to use the JSON format. So that's what I'm doing here: "When using a tool, you provide the tool name and the arguments to use in JSON format. You must only use one tool, and the response format must always be in the pattern", and then you give it the JSON output pattern for that right here.

I said don't use the search tool more than three times. Actually, I try to get it not to use it more than once, but I wanted to leave a little bit of flexibility there. We tell it that if it uses it more than three times, then we threaten it with nuclear annihilation, and that seems to work some of the time.

It's quite... yeah, it's a daring LLM for sure. Then what I add is: "After using the search tool, you must summarize your findings with the final answer tool." This is what I wanted it to do. So I'm just giving it as much context as I possibly can. Cool, okay, so that is set up.

One other thing I wanted to get in here is the function schemas, or tool schemas. I'm using some utils from semantic router for this, so you can see what we're doing. Hopefully you won't need to do this part; it's, I think, a bug in the library at the moment, so you should not need it soon, but for the moment it's there. So yeah, you have this. You can use this with Ollama tool calling, that's what it's built for, but you can also just use it to provide the JSON schema of the tools that you would like the agent to use when you're using JSON mode.

So basically you take your function (this is a function we described earlier), we create this FunctionSchema object with it, and then we just use the to_ollama method and we output this. Okay, that's it. It's taking your docstring and taking your parameters, and I think that's basically all we need from that, to be honest.

All right, cool. And then we also do the same for the final answer. Okay, and that's it. So we have our tools set up for JSON mode, and we can go ahead and actually try using our model. Again, one thing I mentioned earlier is that you do need to have run this: ollama pull llama3.1:8b. So that's 8 billion parameters.

I don't remember what the largest size is, but you can just modify it. I'm pretty sure it's not this size, but if it was, say, 38 billion parameters, you just put that in there. Pretty simple. Also the quantization stuff: you can put that in there as well, around here.

So we have model, we have messages, and we have format. This is important; this is what I mentioned before. We always want the LLM to be outputting in JSON format, so we have that structured output that we can then process. That's how we get the tools and everything to work. Then we have this get_system_tools_prompt function. I'm basically combining the system prompt that we defined earlier with the tools that we have defined here, putting them together in this little function. So here we take the system prompt and the tools, which is a list of dictionaries. We create a tools string, and then we have the system prompt, a few newline characters, "You can use the following tools", and then we describe those tools there.
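A minimal sketch of that prompt-building step might look like the following. The function name comes from the walkthrough; the body, the exact wording of the joining sentence, and the simplified tool dictionaries are assumptions (real schemas produced by a schema utility would carry more fields).

```python
import json


def get_system_tools_prompt(system_prompt: str, tools: list[dict]) -> str:
    """Append a JSON description of each tool to the base system prompt."""
    tools_str = "\n".join(json.dumps(tool) for tool in tools)
    return f"{system_prompt}\n\nYou can use the following tools:\n{tools_str}"


# Hypothetical, simplified tool schemas for illustration only.
tools = [
    {"name": "search", "description": "Search Reddit for a query.",
     "parameters": {"query": "string"}},
    {"name": "final_answer", "description": "Return the answer to the user.",
     "parameters": {"answer": "string", "phone_number": "string", "address": "string"}},
]

prompt = get_system_tools_prompt("You are the oracle, the AI decision maker.", tools)
print(prompt)
```

The resulting string would go in as the system message, with the chat request's format set to JSON as described above.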

So that is our sort of tool-augmented system prompt; we're just passing our tools in there. Then this is a simplified version here. I'm going to say "hello there", and what we should see is that when I say "hello there", it should not use the search tool; it should just use the final answer tool. I'm missing something: system prompt. Okay, I need to run this, and run this again. Okay, cool.

Let's see what we got. So yep, it went straight to final answer. You can see the final answer outputs everything in a string, but we just parse it. So we have message, content. Ah, okay, perfect. We have the name, final answer, which is the tool that we'd like to use, and then the parameters that we want to feed in there: "Hello, I'm here to help you find a great restaurant. What kind of cuisine are you in the mood for?" And then of course phone number and address.

It doesn't need to answer those, so it just left them as None. Okay, cool. Now let's see if we can get it to use the... I put web search here, but it's actually a Reddit search tool, I suppose. "Hi, I'm looking for the best pizzeria in..." a particular area of Rome; I'm actually not going to go with that, because it's a very specific place and I think there's no one on Reddit talking about pizza there.

So let's just go with Rome. Okay, so the agent, based on this (you see we have the chat, we pass all this stuff in, and we're asking for that), said, okay, I'm going to use the search tool, and I'm going to use it with this query: "best pizzeria in Rome". So that worked; it decided to use the right tool, which is pretty cool, especially given the model size. Now, we're going to use a Pydantic BaseModel again, and we're going to be using this for the agent actions. The agent actions are, well, actually what we just saw here: that's an agent action.

The agent is deciding it's going to use the tool name of search, the tool input is going to be this dictionary here, and then tool output we don't have yet, because we need to run the tool to get that. We handle that later in another little chunk of code.

So, from Ollama: we have the Ollama response, and again, the Ollama response is never going to include the tool output, so we just include tool name and tool input. Basically what I have here is happening here: we're just parsing the Ollama response into this agent action object.

Then what we are doing here is getting the text: what tool was used (the tool name), the input (the parameters), and, if we have the tool output, because we add that later, we also return the output. Then we return that text. So we can see, when I create that, we now have an agent action object: tool name search, tool input this, and tool output None, because we haven't set that yet. So that is good.
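Here is a minimal sketch of such an agent action object. To keep it runnable with the standard library alone it uses a dataclass in place of the Pydantic model from the walkthrough; the method name `from_ollama`, the `name`/`parameters` keys, and the fake response dictionary are assumptions modelled on the output described above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentAction:
    tool_name: str
    tool_input: dict
    tool_output: Optional[str] = None  # filled in later, once the tool has run

    @classmethod
    def from_ollama(cls, ollama_response: dict) -> "AgentAction":
        # In JSON mode the tool choice arrives as a JSON string in message.content.
        output = json.loads(ollama_response["message"]["content"])
        return cls(tool_name=output["name"], tool_input=output["parameters"])

    def __str__(self) -> str:
        text = f"Tool: {self.tool_name}\nInput: {self.tool_input}"
        if self.tool_output is not None:
            text += f"\nOutput: {self.tool_output}"
        return text


# Hypothetical response shaped like the one shown in the walkthrough.
fake_response = {
    "message": {
        "content": '{"name": "search", "parameters": {"query": "best pizzeria in Rome"}}'
    }
}
action = AgentAction.from_ollama(fake_response)
print(action)
```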

And why do I care about doing that? Again, I just want to keep things organized, and it also helps when it comes to passing along multiple steps where an agent might be doing different things. It may use search and then the final answer tool, or maybe it's going to use search three times (hopefully not any more than that) and then use the final answer tool.

We want to keep a log of what is happening. I don't know if this is the best way of doing it with Llama 3.1, but the way that I've set it up here is that it's going to take these agent actions, and for a single agent action it's going to format it into two messages, which makes it appear like a conversation happening between the assistant and the user. So it's like the assistant is providing the function call, and then the user is answering with the output of that function call.

Let's see if I can give you an example: action to message, or messages. We have our agent action here, and then we have the action_to_message function. This is just an example: fake tool name, fake query, fake output from the function call. And we will get this: an assistant message with the inputs and a user message representing the output. Then we're going to feed that into our agent as it's going through this process of using tools. So that's what we're doing here.

So create_scratchpad is basically handling this conversion for us for multiple actions, and then that scratchpad gets inserted into here, after the current user input. We then add a little bit of additional logic around that as well. If the scratchpad has been used at least once, so there's at least one tool use... you know, it's a small LLM, so it needs a little bit of extra guidance, and that is what I've done here.

I've basically added another user message that I append on to the scratchpad messages, saying, "Okay, please continue. As a reminder, my query was" and then the original query. The reason I added this is that I would find that the agent would go off and start searching about... it would start with the best food in Rome, and then it would be like, okay, what's the best food in LA, and then what is the best food in Austin. It would just start asking what the best food is in all these different places.

And of course, I don't want it to do that, so I'm just reminding it again: look, this is my original query, this is what I want to know about. I found that to be relatively important for this model. And then "only answer the original query and nothing else", so I'm trying to encourage it not to view those other messages as something that it should respond to, the kind of fake messages I created with the scratchpad. Again, there's probably a better way of doing that, but this is just for this example. Then another thing that I found is that it would be quite loose on details in the answer field. I wanted it to give me more of a human-sounding description, like, "Oh, you should try this because of X, Y, and Z, and you should try this other place because", and so on. What I would find is it would be like, "Hey, you should try X", and that would be it.

And so it was not very interesting. So I added a little bit of prompting here, and that seemed to improve things. Then I just asked it to remember to give the contact details for the restaurant if possible. Now, another thing that I still found it doing, even after adding all of this: it would maybe not search for what the best food in LA or the best food in New York is, and so on, but it might start by asking, okay, what is the best food in Rome? Cool. What is the most recommended food in Rome? Or it would even just repeat the exact same query again.

So I added another little bit to the scratchpad, as soon as it has used the search tool, to say "You must now use the final answer tool", to kind of be like, okay, just use the final answer tool and stop using the search tool. So yeah, that helps. It does limit the agent a little bit, in that it can't use the search tool a few times to try different search terms, but I found it didn't really need that anyway, so this was fine.

Then, yeah, we put everything together as we did before. We have the system prompt as before, the chat history, the user's query, and then the scratchpad, and then we just make the query (once I remove this) and we return the agent action. Run this. Cool, so we're going to try the call_llm function, which is actually this one that we just went through.

And yeah, we create some fake chat history: "Hi there, how are you?", "I'm currently in Rome". One important thing here: look, I mention that I'm currently in Rome, and then I'm like, "Hey, I'm looking for the best pizzeria near me". So I'm mentioning my current location in the history, and then I'm asking for the pizzeria. I'm just testing here that the chat history is actually considered. It's important.

Okay, so you see that sometimes it's not perfect. This time it decides to go with final answer straight away, and we can see that the chat history was considered: "Considering your location (Rome) and your desire to try a local pizzeria, I would recommend trying out Pizza La Monte Carlo". Now this doesn't exist, or I think it's a pizzeria in Switzerland, because it kept recommending me this all the time. I've been through many iterations of getting this to work. So yeah, that was a hallucination.

But then we run it one more time and it worked. So it could do with a little bit of work in some places, but the second time it works: "best pizza in Rome", that's what I'm looking for, and it's using the search tool. That is our core LLM function.

I'm going to be getting into the graph stuff in a moment, but let's try taking this and feeding it into our search function and seeing what we get. Cool, so that looks pretty relevant; I think very similar results to what we would have seen earlier as well. Yeah: "it's Rome, best pizza of my life", from "Bubble farts rock" with 202 upvotes.

"I recently traveled to Italy as well; the week after I came back, hardcore cravings for pizza from there." "Decent-sized city", and yeah, "Rome, Italy has good pizza". I agree. So we have those results. Now, what we've just done is set up all the core logic of the different components of our graph-based agent, but we need to put everything together. We need to connect everything, which we haven't started to do yet.

So to do that, we are going to set up a few components that are almost like wrappers for our functions and that will be used within the graph itself. The reason we need these wrappers is that we're actually using this state object, the agent state from earlier, which gets passed directly into here. It's not true that this is a list.

My typing is wrong there. Let me just check what it actually is. Okay, it's a TypedDict, so let me take that and fix it quickly: TypedDict here, here, and here. Okay, cool. So let's just have a quick look at these. run_oracle: this is just running our LLM.

Okay, so call_llm with the chat history and the state. It is what it is; we already went through the call_llm function. We then have our router. If you remember, the Oracle can go one of two ways: it can go to final answer or it can go to search. The way that this is handled is that there's actually an intermediate step here, which is the router, and that is like, okay, based on what the Oracle outputs, I will send you to one of these two places. So that is what this is doing. We also include some error handling in here, so if we see a tool name that we don't recognize, we go directly to the final answer.
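A minimal sketch of that routing step, assuming the state holds a list of steps stored as dictionaries with a `tool_name` key (the storage format is an assumption; the fallback to the final answer follows the description above):

```python
def router(state: dict) -> str:
    """Pick the next node from the most recent action chosen by the oracle."""
    steps = state.get("intermediate_steps", [])
    if steps:
        tool_name = steps[-1]["tool_name"]
        if tool_name in ("search", "final_answer"):
            return tool_name
    # Unrecognized tool name (or no action at all): fall back to the final answer.
    return "final_answer"


print(router({"intermediate_steps": [{"tool_name": "search"}]}))
print(router({"intermediate_steps": [{"tool_name": "web_browse"}]}))
```

Returning the node name as a plain string is what lets a conditional edge use this function to choose where the graph goes next.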

Okay, this might not be the best error handling. Actually, I'm not sure if it would work; I don't know if I handle it or not, but it doesn't really matter. I haven't seen it fail, so it's okay. Then we have this dictionary here: if we see the term search, we know we need to use the search function; if we see the term final answer, we know we need to use the final answer function. And we use that in here, so we have this tool_str_to_func, which is given the tool name from the output of the Oracle, and based on that we're going to get the function.

So if it passes in the search string from the Oracle here, what I'm highlighting right here is going to become the search function, and then into the search function we're going to pass the tool arguments from the Oracle. From that we're going to get everything we need to construct an agent action. Then, if the tool name is final answer, we output in this format. I'm not too sure why we need to do that.

I will leave it, but maybe just question that; I'm not sure if it's needed. Otherwise, we're going to add our action out to the intermediate steps. So if we're using the search tool, we're going to add that to our intermediate steps. The way that the state is set up here, this single item doesn't replace the entirety of the intermediate steps; it actually gets added to the intermediate steps. It's a little bit weird in terms of syntax, in my opinion, but that is what is happening, just to make you aware. So we have that; I need to run this. Then we have, you know, our components for the graph.
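The "added to, not replaced" behaviour comes from annotating the state field with a reducer. The sketch below shows a state of that shape plus a tool-running wrapper; the field names, the stub `search`, and the helper `apply_update` (which only mimics how such an annotated field is merged) are assumptions for illustration.

```python
import operator
from typing import Annotated, TypedDict


class AgentState(TypedDict):
    input: str
    chat_history: list
    # Annotating with operator.add marks this field as append-on-update.
    intermediate_steps: Annotated[list, operator.add]
    output: dict


def search(query: str) -> str:
    return f"(stub) results for: {query}"  # stand-in for the Reddit search


def final_answer(answer: str, phone_number: str = "", address: str = "") -> dict:
    return {"answer": answer, "phone_number": phone_number, "address": address}


tool_str_to_func = {"search": search, "final_answer": final_answer}


def run_tool(state: dict) -> dict:
    """Run the tool named in the latest action and return a state update."""
    last = state["intermediate_steps"][-1]
    out = tool_str_to_func[last["tool_name"]](**last["tool_input"])
    if last["tool_name"] == "final_answer":
        return {"output": out}
    # A one-item list: merged into the existing steps rather than replacing them.
    return {"intermediate_steps": [{**last, "tool_output": str(out)}]}


def apply_update(state: dict, update: dict) -> dict:
    """Mimic the merge: the annotated list is concatenated, other keys overwritten."""
    merged = dict(state)
    for key, value in update.items():
        if key == "intermediate_steps":
            merged[key] = operator.add(state.get(key, []), value)
        else:
            merged[key] = value
    return merged


state = {"input": "pizza", "chat_history": [], "intermediate_steps": [
    {"tool_name": "search", "tool_input": {"query": "pizza"}, "tool_output": None}]}
new_state = apply_update(state, run_tool(state))
print(len(new_state["intermediate_steps"]))
```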

They're ready. We can construct the graph. So we initialize the graph with our agent state, that earlier state object, which is a TypedDict, not a list. Then we add some nodes to our graph: our Oracle, our search, our final answer. What that will look like is literally this here, ignoring the little router in the middle. The router does exist, it just isn't included in the nodes, and the reason is that it exists more within this conditional edge object here.

So the conditional edge is basically like the dotted line, whereas the actual edges down here, from the Excalidraw drawing I did, are actual lines. It kind of looks like a dotted line as well, but I don't know how to do it.

How can I do it? Oh, there we go. Perfect. So that is what that would look like, versus that. Okay, so that's a conditional edge. That's going from the Oracle based on the router logic, which is: go to search or go to final answer. The other thing that I'm missing here is the entry point of our graph, which is the Oracle. That's the starting point, so that's where we insert the query. Then we create our actual edges. Here we're only adding the edge from the search schema back to the Oracle. And in reality, I don't even need this bit here.

I don't believe so. Yeah, because I can leave this here, but then we check if the tool name is not equal to final answer, and if so we add the edge. So honestly, I don't even know why I have that. I could just remove it, and I could even remove that, but hey, it's fine.

Whatever. I don't want to break it now. So once something does go to the final answer tool, the final answer tool, as we see in our graph here, has one line coming out of it, which actually goes to the end block. The end block is kind of like here, and the output from that is this, okay?

And that's what we're doing there. Once all that is done we can compile our graph, and if everything is set up in a functional way it will compile and we will not get an error. So yeah, that is our graph. I have this little bit of code here that we can use to visualize our graph.
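Putting the pieces together, the graph construction walked through above might look roughly like the following. This is a sketch under assumptions, not the code from the video: the node functions are stubs that do not call an LLM or a search API, the dict-shaped steps are invented for illustration, and `build_graph` is a hypothetical helper. The LangGraph calls themselves (`StateGraph`, `add_node`, `set_entry_point`, `add_conditional_edges`, `add_edge`, `compile`) are from the library's public API.

```python
import operator
from typing import Annotated, TypedDict


class AgentState(TypedDict):
    input: str
    chat_history: list
    # operator.add makes node outputs append to this list instead of replacing it.
    intermediate_steps: Annotated[list, operator.add]
    output: dict


def run_oracle(state: dict) -> dict:
    """Stub Oracle: search once, then move on to the final answer."""
    searched = any("tool_output" in s for s in state.get("intermediate_steps", []))
    tool = "final_answer" if searched else "search"
    return {"intermediate_steps": [{"tool_name": tool}]}


def run_search(state: dict) -> dict:
    """Stub search node: records a placeholder result."""
    return {"intermediate_steps": [{"tool_name": "search", "tool_output": "(stub)"}]}


def run_final_answer(state: dict) -> dict:
    """Stub final answer node."""
    return {"output": {"answer": "(stub) answer"}}


def router(state: dict) -> str:
    """Pick the next node from the Oracle's latest step; default to final_answer."""
    name = state["intermediate_steps"][-1]["tool_name"]
    return name if name in {"search", "final_answer"} else "final_answer"


def build_graph():
    # Imported lazily so the stubs above can be run without LangGraph installed.
    from langgraph.graph import END, StateGraph

    graph = StateGraph(AgentState)
    graph.add_node("oracle", run_oracle)
    graph.add_node("search", run_search)
    graph.add_node("final_answer", run_final_answer)
    graph.set_entry_point("oracle")                # the query enters at the Oracle
    graph.add_conditional_edges("oracle", router)  # the router lives on this edge
    graph.add_edge("search", "oracle")             # search always returns to the Oracle
    graph.add_edge("final_answer", END)            # final answer ends the run
    return graph.compile()
```

With LangGraph installed, calling `build_graph().invoke({"input": "...", "chat_history": [], "intermediate_steps": []})` should walk oracle, search, oracle, final_answer with these stubs.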

So I'm going to take that, let's pull it in, and see what we get. So we have this, basically. From the Oracle, I think LangGraph is always adding this extra line here, this optional line, because it is basically allowing the LLM, if it decides to, to return a direct answer. I think that's what that is. But in reality, we've prompted it not to do that and we try to set things up so it doesn't. But this is what we get. So there's our entry point, it goes to the Oracle, and the Oracle can go to final answer or search. If it goes to search, it goes back to the Oracle, and then it would go to final answer and we end. And that is our graph. Super simple, but again, we're using a tiny model, so something overly complicated probably won't work.

So I'm going to ask our graph agent, "Where is the best pizza in Rome?" and let's see what it comes up with. Okay, so we see it invoked the search tool here. The query was "best pizza in Rome".

It got these three submissions from Reddit. It then went back to the Oracle, which went to the router. The router identified that it should go to the final answer, or sorry, the Oracle decided this, the router identified that, and we went to the final answer tool. And then we have our outputs.

Okay: "Based on your question, I would recommend trying..." I keep getting this recommendation. I will actually try it and see how it is. It looks kind of interesting. It's not what I would actually expect, and it also isn't a Roman pizza, it's a Neapolitan pizza. I'm sure the Romans would not be happy that the top recommended pizza in Rome seems to be a Naples-style pizza. And I also tried this.

So, "Where is the best gluten-free pizza in Rome?" I'm not gluten-free but my girlfriend is, so I thought, okay, let's see. Let's see what we get. I didn't get a good response before. Yeah, so here it's like, "Unfortunately, I was unable to find specific recommendations for gluten-free pizza in Rome. It seems like Italy is generally considered a good destination for gluten-free options." Generally, actually, it is. Oh, Pizza in Trevi, maybe we can try that.

Oh, there we go. So we did have some options here. I don't know why. Expensive. So Pizza in Trevi, let's have a look. Let's see where you are. Okay, looks interesting. So near the center, maybe it's a good option. Gluten-free beer, gluten-free pizza.

There we go, so that is an option. The other one we can also try is Pizza Bonatti, which is really interesting looking. It's here, so a little further out, just south of Trastevere and the center over here. But if you look, they have some kind of cool-looking pizza, like what I would not usually expect to find in Italy. Unique for sure, but it looks good.

So it seems like our fully local pizza-recommending agent does work. It's not perfect, but it does work, and it has some good recommendations there. Again, I will just point out that this is a tiny LLM, 8 billion parameters, running fully locally on my, I think, M1 MacBook. So I don't have any sort of powerful machine running this, and because of the memory limitations on my MacBook I can only run the smaller models, like the 8 billion parameter model. But you could also see how quickly that was responding. It did not take long to go through multiple agent steps, perform a search, read everything from that search, and produce an answer for us, and the answers were generally pretty good. So honestly, I think it's actually quite impressive that you can do that.

JSON mode seems to be the best approach for agents, at least for now. I know that the Ollama team are planning to add forced function calling, which I think will make a difference, but for now JSON mode works perfectly.
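To make the JSON mode point concrete, here is a minimal sketch of turning the model's JSON text into a tool call. The `{"name": ..., "parameters": ...}` layout, the `parse_tool_call` helper, and the fallback to the final answer are all assumptions for illustration, not confirmed from the video's code. With the `ollama` Python package, JSON mode is typically requested by passing `format="json"` to `ollama.chat`.

```python
import json


def parse_tool_call(raw: str) -> tuple[str, dict]:
    """Parse JSON-mode output into (tool_name, tool_args).

    Assumes the model was prompted to answer with
    {"name": "<tool>", "parameters": {...}}; anything else falls back
    to the final answer with the raw text as the answer.
    """
    try:
        data = json.loads(raw)
        name = data["name"]
        args = data.get("parameters", {})
        if not isinstance(name, str) or not isinstance(args, dict):
            raise ValueError("unexpected types in tool call")
    except (KeyError, TypeError, ValueError, AttributeError):
        # json.JSONDecodeError is a ValueError, so bad JSON lands here too.
        return "final_answer", {"answer": raw}
    return name, args
```

Because JSON mode only constrains the output to be valid JSON, not to follow a particular schema, a small validation step like this is a reasonable guard when working with an 8 billion parameter model.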

And yeah, so that's how we would build a local agent using LangGraph and Ollama. So that's it for this video. I hope this has been useful and interesting, but I'll leave it there for now. So thank you very much for watching, and I will see you again in the next one.

Bye.
