End-to-end AI Agent Project with LangChain | Full Walkthrough

Chapters
0:00 End-to-End LangChain Agent
1:39 Setup the AI App
4:37 API Setup
11:28 API Token Generator
15:57 Agent Executor in API
34:03 Async SerpAPI Tool
40:08 Running the App
We're going to be taking everything that we've learned so far and using it to build a complete agent-powered chat application.
Now the chat application is what you can see right now, and we can go into this and ask some pretty interesting questions.
And because it's an agent, because it has access to these tools, it will be able to answer them.
So we'll see inside our application that we can ask questions that require tool use.
And because of the streaming that we've implemented, we can see all of this happening live.
So we can see that the SerpAPI tool is being used, and that these are the queries.
So each one of those tools was being used in parallel; we've modified the code a little bit to enable that.
We can also see the structured output being used here, so we can see our answer followed by the tools used.
And then we can ask follow-up questions as well.
So we say, how is the weather in each of those cities?
We are, of course, going to be focusing on the API, the back end.
I'm not a front end engineer, so I can't take you through that, but the code is there.
So for those of you that do want to go through the front end code, you can, of course, go and explore it.
But we'll be focusing on how we build the API that powers all of this using, of course, LangChain.
The first thing I'm going to want to do is clone this repo.
This is the repo: Aurelio Labs, LangChain course.
I've already done this, so I'm not going to do it again.
Instead, I'll just navigate to the LangChain course repo.
Now, there's a few setup things that you do need to do.
If you were running this locally already, you will have seen this or you will have already done it, but for those of you that haven't, we'll go through it quickly now.
So this is uv, which is how we manage our Python environment and our packages.
If you're on Windows or Linux, just double check how you'd install it over here.
Once you have installed uv, you would then use it to install Python.
Then we want to create our venv, our virtual environment, using that version of Python.
Then, as we can see here, we need to activate that virtual environment, which I did miss from here, but if you're using bash or zsh, I think you can just run that directly.
And then, finally, we need to sync, i.e. install all of our packages, using uv sync, and you'll see that will install everything for you.
So we have that, and we can go ahead and actually open Cursor or VS Code, and then we should find ourselves within Cursor or VS Code.
So in here, you'll find a few things that we will need.
So we can come over to here, and we have the OpenAI API key, LangChain API key, and SerpAPI API key.
Create a copy of this and make it your .env file.
Or if you want to run it with source, well, I like to use mac.env when I'm on Mac, and I just add export onto the start there and then enter my API keys.
Now, I actually already have these in this local mac.env file, which, over in my terminal, I would just activate with source again, like that.
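As a rough sketch, an exported env file might look like the following. The exact variable names are assumptions based on the keys mentioned; check the repo's example env file for the names it actually expects:

```shell
# mac.env -- hypothetical layout; the variable names are assumptions
export OPENAI_API_KEY="sk-..."
export LANGCHAIN_API_KEY="ls-..."
export SERPAPI_API_KEY="..."
```

You would then load these into your shell with `source mac.env` before starting the API.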
We'll need those when we are running our API and application later, but for now, let's just focus on understanding what the API actually looks like.
So, navigating into the 09 capstone chapter, we'll find a few things.
What we're going to focus on is the API here.
And we have a couple of notebooks that help us understand, okay, what are we actually doing here?
So let me give you a quick overview of the API first.
Okay, so this is our POST endpoint for invoke, and this essentially sends something to our LLM and begins a streaming response.
So we can go ahead and actually start the API, and we can just see what this looks like.
So we'll go into the 09 capstone API directory after setting our environment variables here.
And we just want to do uv run uvicorn main:app --reload.
We don't need --reload, but if we're modifying the code, that can be useful.
Okay, and we can see that our API is now running on localhost port 8000.
And if we go to our browser, we can actually open the docs for our API.
Again, we just see that we have that single invoke method, and it gives us a small amount of information there.
Now, the thing that we're missing here is that this is actually being streamed back to us.
To see that, we're going to navigate over to this streaming test notebook.
We are not just doing the standard POST request, because we want to stream the output and then print the tokens as we are receiving them.
So that's why this looks a little more complicated than just a typical requests.post or requests.get.
What we're doing here is starting our session, which makes our POST request, and then we're just iterating through the content as we receive it from that request.
When we receive a token, because sometimes this might be None, we print it.
And we have that flush=True, as we have used in the past.
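A minimal sketch of that streaming client could look like this. The endpoint URL and the way `content` is passed (here as a query parameter) are assumptions based on what we've seen of the invoke endpoint, not the notebook's exact code:

```python
import requests  # assumes the requests library is installed

def stream_invoke(content: str, url: str = "http://localhost:8000/invoke") -> None:
    """Stream tokens from the (assumed) invoke endpoint, printing as they arrive."""
    # stream=True stops requests from buffering the whole body before returning
    with requests.post(url, params={"content": content}, stream=True) as resp:
        resp.raise_for_status()
        for token in resp.iter_content(chunk_size=None, decode_unicode=True):
            if token:  # tokens can occasionally be empty or None
                print(token, end="", flush=True)
```

If the real endpoint takes the content in a JSON body instead of a query parameter, you would swap `params=` for `json=` accordingly.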
And we saw that that was pretty quick.
So it generated this response first, and then it went ahead and continued streaming the rest.
And we can see that there are these special tokens being provided.
This is to help the front end decide, okay, what should go where.
So here, where we're showing these multiple steps of tool use and the parameters, the way the front end decides how to display those is that it's being provided a single stream, but it has the step tokens: it has a step, it has a step name, then it has the parameters, followed by the end-of-step token.
And the one step name that it treats differently is the final answer step name.
When it sees the final answer step name, rather than displaying this tool-use interface, it instead begins streaming the tokens directly, like a typical chat interface.
And if we look at what we actually get in our final answer, it's not just the answer itself, right?
The answer is streamed into that typical chat output, but then we also have tools used.
And that is added into the little boxes that we have below the chat here.
So there's quite a lot going on just within this little stream.
Now we can try some other questions here.
So we're going to say, okay, tell me about the latest news in the world.
You can see that there's a little bit of a wait here whilst it's waiting to get the response, and then, yeah, it's streaming a lot of stuff quite quickly.
And then we can ask other questions, like this one here: how cold is it in Oslo right now?
So these two are going to be executed in parallel, and then, after it has the answers for those, the agent will use the multiply tool to multiply those two values together.
And then, as we saw earlier, we have the "what is the current date and time in these places?" question.
So there are three questions here, one of them being: what's the current date and time in Berlin?
Those three questions get executed in parallel against the SerpAPI search tool, and then all answers get returned within that final answer.
Now let's dive a little bit into the code and understand how it's working.
There's some complexity, but at the same time, we've tried to make this as simple as possible.
So it's just FastAPI syntax here, with the app.post invoke decorator.
To start, the invoke endpoint consumes some content, which is a string.
And then, if you remember from the agent executor deep dive, which is what we've implemented here, or a modified version of it, we have to initialize our asyncio queue and our streamer, which is the queue callback handler; I believe it's exactly the same as what we defined in that chapter.
And then we return this streaming response object.
This is so that we are streaming a response.
That streaming response has a few attributes here, which again are FastAPI things: some headers giving instructions to the client, and then the media type here, which is text/event-stream.
You could possibly also use text/plain as well, but I believe text/event-stream would be the standard here.
And then the more important part for us is this token generator; well, it is this function that we defined up here.
Now, again, if you remember that earlier chapter, at the end of it we saw a for loop where we were printing out different tokens in various formats.
So we're kind of post-processing them before deciding how to display them.
So in this block here, we're looping through every token that we're receiving from our streamer.
We're looping through and we're just saying, okay, if this is the end of a step, we're going to yield this end-of-step token, which we saw here.
So it's this end-of-step token there.
So again, we've got that walrus operator here.
What we're doing is saying, okay, get the tool calls out from our current message.
So if this is not None, we're going to execute what's inside here.
And what is being executed inside here is: we're checking for the tool name.
So we have the start-of-step token, the start-of-step-name token, then the tool name, or step name, whichever of those you want to call it.
And then this, of course, comes through to the front end like that.
Otherwise: we should only be seeing the tool name returned as part of the first token for every step.
After that, it should just be tool arguments.
So in this case, we say, okay, if we have those tool or function arguments, we're going to just return them directly.
So that is the part that would stream all of this here.
These would be individual tokens, right?
So we might have the open curly bracket, and then "query" could be a token, and so on.
The final else should not get executed, but we just handle it in case we have any issues with tokens being returned there.
We're just going to print the error and continue with the streaming, but that should not really be happening.
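The loop described above can be sketched roughly like this. The token markers (`<step>`, `<step_name>`) and the chunk attributes are assumptions standing in for the real implementation:

```python
import asyncio

# Hypothetical stand-in for a streamed message chunk carrying tool-call deltas.
class Chunk:
    def __init__(self, name=None, args=None, end=False):
        self.name, self.args, self.end = name, args, end

async def token_generator(streamer):
    """Post-process streamed chunks into front-end-friendly step tokens."""
    async for token in streamer:
        if token.end:
            yield "</step>"  # end-of-step marker
        elif token.name:     # the first chunk of a step carries the tool name
            yield f"<step><step_name>{token.name}</step_name>"
        elif token.args:     # subsequent chunks carry argument fragments
            yield token.args

async def fake_streamer():
    # Simulate one tool-use step streamed as chunks.
    for c in [Chunk(name="serpapi"), Chunk(args='{"query":'),
              Chunk(args=' "news"}'), Chunk(end=True)]:
        yield c

async def main():
    return [t async for t in token_generator(fake_streamer())]

tokens = asyncio.run(main())
print(tokens)
```

Running this yields the step-name token first, then the raw argument fragments, then the closing marker, which is exactly the shape the front end parses.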
Now, the way that we are picking up tokens from our streamer object here is, of course, through our agent execution logic, which is happening concurrently.
So what has happened here is that we have created a task, which is the agent executor's invoke.
And we're passing our content, and we're passing that streamer, which we're going to be pulling tokens from.
We can actually remove that, but it allows us to see additional output in our terminal window if we want it.
I don't think there's anything particularly interesting to look at in there, but particularly if you are debugging, that can be useful.
So we create our task here, but this does not begin running it immediately.
It's asyncio.create_task, which schedules the coroutine on the event loop; it starts running once the loop gets control, and we await its completion down here.
So what is happening is essentially that this code here is still being run, we're in an asynchronous loop here, but then we await this task.
Once the task is running, tokens will start being placed into our queue, which then get picked up by the streamer object here.
I know async code is always a little bit more confusing given the strange order of things, but that is essentially what is happening.
You can imagine all of this essentially being executed at the same time.
It's all sort of boilerplate stuff for FastAPI, rather than the actual AI code itself.
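The core pattern, schedule the agent as a task and stream from a queue while it runs, can be sketched in plain asyncio. Everything here is a simplified stand-in, not the repo's actual code:

```python
import asyncio

async def fake_agent(queue: asyncio.Queue):
    """Stand-in for agent_executor.invoke: pushes tokens, then a done sentinel."""
    for tok in ["Hello", " ", "world"]:
        await queue.put(tok)
    await queue.put(None)  # sentinel: no more tokens

async def consume(queue: asyncio.Queue):
    out = []
    while (tok := await queue.get()) is not None:
        out.append(tok)
    return out

async def main():
    queue: asyncio.Queue = asyncio.Queue()
    task = asyncio.create_task(fake_agent(queue))  # scheduled; runs once the loop yields
    tokens = await consume(queue)                  # pull tokens as they arrive
    await task                                     # make sure the producer finished
    return tokens

print(asyncio.run(main()))
```

The consumer and producer interleave on the same event loop, which is why the API can stream tokens while the agent is still mid-execution.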
Now let's have a look at the agent code itself.
So we're using this agent executor's invoke, and we're importing it from the agent file.
Now, you can see straight away we're pulling in our API keys here.
This is what we've seen before in that agent executor deep dive chapter.
We've set those configurable fields as we did in the earlier chapters; that configurable field is for our callbacks.
Then the prompt is essentially just telling the agent, okay, make sure you use the tools provided.
We say you must use the final answer tool to provide a final answer to the user.
And one thing that I added, after something I noticed every now and again: I have explicitly said to use tools to answer the user's current question, not previous questions.
I found with this setup that occasionally, if I just have a little bit of small talk with the agent and beforehand I was asking questions about, okay, what was the weather in this place or that place, the agent will kind of hang on to those previous questions and try to answer them again.
And that is just something that you can more or less prompt out of it.
So we have our chat history, to make this conversational; we have our human message; and then our agent scratchpad, so that our agent can think through multiple tool-use messages.
I will talk about that a little more in a moment, because this is also a little bit different.
What we covered before with SerpAPI, if you remember, was synchronous, because we were using the SerpAPI client directly, or the SerpAPI tool directly from LangChain.
And because we want everything to be asynchronous, we have had to recreate that tool in an asynchronous fashion, which we'll talk about a little bit later.
So this is, I think, the exact same thing we defined before, probably in that deep dive chapter again, where we have just the answer and the tools that have been used.
One thing that is a little different here is how we are defining our name-to-tool mapping.
So this takes a tool name and maps it to a tool function.
When we have synchronous tools, we actually use tool.func here; so rather than tool.coroutine, it would be tool.func.
And this is why, if you come up here, I've made every single tool asynchronous.
Now, that is not really necessary for a tool like final answer, because there's no API call there.
An API call is the very typical scenario where you do want to use async.
Because if you make an API call with a synchronous function, your code is just going to be waiting for the response from the API while the API is processing and doing whatever it's doing.
So that is an ideal scenario where you would want to use async: rather than your code just waiting for the response from the API, it can instead go and do something else whilst it waits.
That's why we would use it, for example, for the SerpAPI tool.
But for final answer, and for all of these calculator tools that we've built, there's actually no need to have these as async, because our code is just running straight through, executing this code locally.
So it doesn't necessarily make sense to have these asynchronous.
However, by making them all asynchronous, it means that I can use tool.coroutine for all of them, rather than saying, oh, if this tool is synchronous, use tool.func, whereas if this one is async, use tool.coroutine.
So it just simplifies the code for us a lot.
Not directly necessary, but it does help us write cleaner code here.
This is also true later on, because we actually have to await our tool calls, which we can see further down.
That would get messier if we were using some sync tools and some async tools.
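A small stand-in for that pattern: every tool is async, so a single mapping and a single await path works for all of them. The tool names and the little Tool class here are illustrative, not the repo's actual definitions:

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

@dataclass
class Tool:
    name: str
    coroutine: Callable[..., Awaitable]  # async callable, like a tool's .coroutine

async def add(x: int, y: int) -> int:
    # No real need for async here, but it keeps the call path uniform.
    return x + y

async def final_answer(answer: str) -> dict:
    return {"answer": answer, "tools_used": []}

tools = [Tool("add", add), Tool("final_answer", final_answer)]
name2tool = {t.name: t.coroutine for t in tools}  # one lookup table for everything

result = asyncio.run(name2tool["add"](x=2, y=3))
print(result)  # 5
```

Because every entry is a coroutine function, the caller never has to branch on sync versus async before awaiting.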
We covered that in the earlier deep dive chapter.
This just helps us clean up the code a little bit.
In the deep dive chapter, I think we had this placed directly within our agent executor function, and you can do that; it's just a bit cleaner to pull it out.
And we can also add more type annotations here, which I like.
So execute tool expects us to provide an AI message, which includes a tool call within it.
And we're actually not even using verbose here, so we could fully remove it, but I will leave it.
Of course, if you would like to use that, you can just add verbose handling and then log or print some stuff where you need it.
So this is what actually calls our agent, right?
And we could even make this a little clearer: for example, this could be named configured agent, because this is not the response; we are configuring our agent with our callbacks.
Then we're iterating through the tokens returned by our agent, using astream here.
And as we iterate through this, because we pass our streamer to the callbacks here, every single token that our agent returns is going to get processed by the callback handler.
So these on_llm_new_token methods are going to get executed, and all of those tokens, as you can see here, are passed to our queue.
Then we come up here and we have this __aiter__ method.
This __aiter__ method here is used by our generator over in our API: it's used by that token generator to pick up, from the queue, the tokens that have been put there by these other methods.
So one side is putting tokens into the queue, and the other is pulling them out with this.
And that is happening concurrently with this code running here.
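The shape of that handler, callbacks pushing into a queue and `__aiter__` draining it, can be sketched like this. It's a simplified stand-in for the real queue callback handler, not its actual code:

```python
import asyncio

class QueueStreamer:
    """Minimal queue-backed async iterator; a stand-in for the queue callback handler."""

    def __init__(self):
        self.queue: asyncio.Queue = asyncio.Queue()

    async def on_llm_new_token(self, token: str) -> None:
        await self.queue.put(token)  # called by the LLM callback machinery

    async def done(self) -> None:
        await self.queue.put(None)   # sentinel marking the end of the stream

    def __aiter__(self):
        return self._drain()

    async def _drain(self):
        while (tok := await self.queue.get()) is not None:
            yield tok

async def produce(s: QueueStreamer):
    for t in ["a", "b", "c"]:
        await s.on_llm_new_token(t)
    await s.done()

async def main():
    s = QueueStreamer()
    producer = asyncio.create_task(produce(s))
    tokens = [t async for t in s]  # __aiter__ pulls tokens out of the queue
    await producer
    return tokens

print(asyncio.run(main()))
```

The producer and the `async for` consumer share the same queue, which is the "putting tokens in, pulling them out" pattern described above.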
Now, the reason that we extract the tokens out here is that we want to pull out our tool-call outputs.
Those outputs become a list of AI messages, which are essentially the AI telling us what tool to use and what parameters to pass to each one of those tools.
This is very similar to what we covered in that deep dive chapter, but the one thing that I have modified here is that I've enabled us to use parallel tool calls.
That is what we see here with these few lines of code.
We're saying, okay, if our tool call includes an ID, that means we have a new tool call.
So what we do is append that AI message, which is the AI message chunk, to our outputs.
And then, following that, if we don't get an ID, that means we're getting the tool arguments.
So we're just adding our AI message chunk to the most recent AI message chunk in outputs.
What that will do is create that list of AI messages: we'd have, say, AI message one, and this will just append everything to AI message one until we get a complete AI message, then AI message two, and so on.
So here, we've collected all our AI message chunk objects.
Then, finally, what we do is transform all those AI message chunk objects into actual AI message objects, and then return them from our function, which we then receive over here.
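As a simplified sketch of that accumulation logic, using plain dicts in place of LangChain's AIMessageChunk objects:

```python
# Each streamed chunk either starts a new tool call (it has an "id")
# or carries an argument fragment for the current one (id is None).
chunks = [
    {"id": "call_1", "name": "serpapi", "args": ""},
    {"id": None, "name": None, "args": '{"query": '},
    {"id": None, "name": None, "args": '"oslo weather"}'},
    {"id": "call_2", "name": "add", "args": ""},
    {"id": None, "name": None, "args": '{"x": 1, "y": 2}'},
]

outputs = []
for chunk in chunks:
    if chunk["id"] is not None:
        outputs.append(dict(chunk))           # new tool call: start a new message
    else:
        outputs[-1]["args"] += chunk["args"]  # continuation: merge into the latest one

print([(o["name"], o["args"]) for o in outputs])
```

With parallel tool calls, this is what turns one interleaved stream of chunks into one complete message per tool call.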
Now, this is very similar to the deep dive chapter.
Again, we're going through that loop with a count, where we have a max iterations, at which point we break out.
But until then, we continue iterating through, making more tool calls and executing those tools.
This is going to be a list of AI message objects.
Then, what we do with those AI message objects is pass them to this execute tool function.
So we pass each AI message individually to this function, and it will execute the tool for us and then return that observation from the tool.
So typically, what you'd have to do is await execute tool, one call at a time.
Okay, let me make this a little bigger for us.
What we could do, for example, which might be a bit clearer, is build a tool observations list: for each tool call in tool calls, we append the result of awaiting execute tool on that tool call.
And what this would do is actually the exact same thing as what we're doing here, the difference being that we're doing it tool by tool.
So we're executing async code here, but we're running the calls sequentially.
Whereas what we can do, which is better, is use asyncio.gather.
What this does is gather all those coroutines, and then we await them all at the same time.
They all begin at the same time, or almost exactly at the same time, and we get those responses effectively in parallel.
Of course it's async, so it's not truly parallel, but practically it is.
And then, okay, we get all of our tool observations from that.
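The sequential-versus-gather difference can be sketched with a stand-in execute_tool; the sleep simulates an API call, and the tool names are illustrative:

```python
import asyncio

async def execute_tool(name: str) -> str:
    await asyncio.sleep(0.05)  # pretend this is a network round trip
    return f"observation from {name}"

async def sequential(names):
    # Awaits one call at a time: total time is roughly the sum of the sleeps.
    return [await execute_tool(n) for n in names]

async def parallel(names):
    # gather schedules all coroutines together: total time is roughly the longest sleep.
    return await asyncio.gather(*(execute_tool(n) for n in names))

names = ["serpapi", "add", "multiply"]
obs = asyncio.run(parallel(names))
print(obs)
```

Both produce the same observations in the same order; gather just overlaps the waiting.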
And then one interesting thing here: let's say we have all of our AI messages with all of our tool calls, and we just append all of those to our agent scratchpad.
So let's say here, we just take our tool calls and then do agent scratchpad extend with the tool calls, followed by extend with the tool observations.
What is happening there is that this would essentially give us something that looks like this: we'd have our AI message, and I'm just going to put tool call IDs in here, so AI message with tool call ID A, then AI message with tool call ID B, and only then the tool messages for A and B.
So the tool message is not directly following its AI message, and you would think, okay, we have this tool call ID, that's probably fine.
But actually, when we're running this, if you add these to your agent scratchpad in this order, what you'll see is that your response just hangs, like nothing happens, when you come through to your second iteration.
So actually, what you need to do is sort these so that they are actually in order.
And it doesn't necessarily matter which order in terms of A or B or C or whatever you use; you could invert it, for example.
It's essentially just that you need each AI message followed by its tool message, with both of those sharing that tool call ID.
So that, of course, would not happen if we do the naive extend; instead, what we need to do is something like this.
We do it for every tool call and tool observation within a zip of those.
So what we're saying is: for each tool call within our tool calls, we extend our agent scratchpad with that tool call followed by the tool observation message, which is the tool message.
So in each pair, the first is the AI message and the second is the tool message.
That is why it was hanging, and that is how we get this correct order, which will run.
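Sketching the two orderings with plain dicts standing in for LangChain's AIMessage and ToolMessage objects:

```python
# Stand-ins for AIMessage (one per tool call) and ToolMessage (its observation).
tool_calls = [{"type": "ai", "tool_call_id": "A"}, {"type": "ai", "tool_call_id": "B"}]
tool_obs   = [{"type": "tool", "tool_call_id": "A"}, {"type": "tool", "tool_call_id": "B"}]

# Naive: all AI messages, then all tool messages -> ai A, ai B, tool A, tool B (hangs).
naive = [*tool_calls, *tool_obs]

# Correct: each AI message immediately followed by its matching tool message.
scratchpad = []
for call, obs in zip(tool_calls, tool_obs):
    scratchpad.extend([call, obs])

print([(m["type"], m["tool_call_id"]) for m in scratchpad])
```

This assumes each observation sits at the same index as its tool call, which is what the zip relies on.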
I know we've just been through quite a lot.
So we continue: we increment our count as we were doing before, and then we need to check for the final answer tool.
And because we're running these tools in parallel, because we're allowing multiple tool calls in one step, we can't just look at the most recent tool call and check whether it has the name final answer.
Instead, we need to iterate through all of our tool calls and check if any of them have the name final answer.
If they do, we extract that final answer call and say, okay, we have found the final answer.
But if our agent gets stuck in a loop of calling multiple tools, this might not happen before we break based on the max iterations here.
So we might end up breaking on max iterations rather than on having found a final answer.
So anyway, if we find that final answer, we break out of this for loop here.
And then, of course, we also need to break out of our while loop, which is here.
So we say: if we found the final answer, break.
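In simplified form, with dicts standing in for the LLM's tool calls and a hard-coded "step" in place of a real model response:

```python
def find_final_answer(tool_calls: list[dict]):
    """Return the final_answer call if any call in this step is it, else None."""
    for call in tool_calls:
        if call["name"] == "final_answer":
            return call
    return None

max_iterations, count = 3, 0
final_answer_call = None
while count < max_iterations:
    # One "step" of parallel tool calls; in the real agent these come from the LLM.
    step = [{"name": "serpapi", "args": {"query": "news"}},
            {"name": "final_answer", "args": {"answer": "done", "tools_used": ["serpapi"]}}]
    final_answer_call = find_final_answer(step)
    count += 1
    if final_answer_call is not None:
        break  # break the while loop once any call is the final answer

print(final_answer_call["args"]["answer"])
```

If no step ever contains a final answer, the loop instead exits when count reaches max_iterations, which is the fallback case described below.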
So at this point we've executed our tools, and our agent's steps and iterations have been processed.
Finally, we come down to here, where we say, okay, we're going to add that final output to our chat history; this is just going to be the text content.
But then what we do is return the full final answer call.
The full final answer call is basically this here: the answer and the tools used, but of course populated.
So we're saying here that if we have a final answer, we're going to return the final answer call which was generated by our LLM.
Otherwise, this is the scenario where maybe the agent got caught in a loop and just kept iterating.
If that happens, it will come back with, okay, no answer found, and no tools used.
Which is not technically true, but this is an exception-handling event; it ideally shouldn't happen, and it's not really a big deal, in my opinion, to say there were no tools used.
So we have all of that, and then we just initialize our agent executor, and that is our agent execution code.
The one last thing we want to go through is the SerpAPI tool, which we will do in a moment.
So SerpAPI: let's see how we build our SerpAPI tool.
We'll start with the synchronous SerpAPI version.
Now, the reason we're starting with this is that it's just a bit simpler, so I'll show you this quickly before we move on to the async implementation, which is what we're using within our app.
So I'll run that, and we just enter our API key at the top there.
So we're going to use the SerpAPI SDK first.
We're importing GoogleSearch, and these are the input parameters.
So we have the API key we're using, we set the engine to Google, and our question is our query.
We're searching for the latest news in the world, and it will return quite a lot of stuff; you can see there's a ton of stuff in there, right?
Now, what we want is contained within this organic results key.
So we can run that and we'll see, okay, it's talking about various things, so we can tell that it is in fact working.
So what I would like to do first is just clean that up a little bit.
So we define this Article base model, which is Pydantic.
And we're saying, okay, from a set of results, we're going to iterate through each of these and extract the title, source, link, and snippet; so you can see title, source, link, and snippet here.
And what we do is go through each of the results in organic results and just load them into our Article using this class method here.
And then we can have a look at what those look like.
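A rough stand-in for that model, using a stdlib dataclass in place of the Pydantic base model; the field names match what's described above, but the method name and the sample result are assumptions:

```python
from dataclasses import dataclass

@dataclass
class Article:
    title: str
    source: str
    link: str
    snippet: str

    @classmethod
    def from_serpapi_result(cls, result: dict) -> "Article":
        # Pull just the fields we care about out of one organic result.
        return cls(
            title=result["title"],
            source=result["source"],
            link=result["link"],
            snippet=result["snippet"],
        )

# Abridged shape of one entry under SerpAPI's "organic_results" key (illustrative).
raw = {"title": "Example headline", "source": "Example News",
       "link": "https://example.com/story", "snippet": "A short summary...", "position": 1}

article = Article.from_serpapi_result(raw)
print(article.title)
```

In the real code you would map `Article.from_serpapi_result` over every entry in `organic_results` to get a clean list of articles.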
So this is using SerpAPI's SDK, which is great, super easy to use.
The problem is that they don't offer an async SDK, which is a shame, but it's not that hard for us to set it up ourselves.
So typically, with asynchronous requests, what we can use is the aiohttp library.
Well, you can see what we're doing here.
So this is the equivalent of the requests version that we were running: we're using aiohttp's ClientSession and then session.get, with this async with here.
So this is what we do, rather than that, to make our code async.
And then the output that we get is exactly the same, right?
So that means, of course, that we can use that articles method in the exact same way.
There's no need to make the article-from-SerpAPI-result method async, because, again, this bit of code here is fully local.
And we can see that we get literally the exact same result there.
So with that, we have everything we would need to build a fully asynchronous SerpAPI tool, which is exactly what we do here for LangChain.
And I mean, there's nothing, is there anything different here? 00:38:18.800 |
But I will run this because I would like to show you very quickly this. 00:38:24.840 |
So this is how we were initially calling our tools in previous chapters because we were okay 00:38:31.400 |
mostly with using the, the synchronous tools. 00:38:34.680 |
However, you can see that the func here is just empty. 00:38:41.400 |
So if I do type, just a non-type, that is because, well, this is an async function. 00:38:54.760 |
And what happens when you do that is you get this coroutine object. 00:39:00.140 |
So rather than func, which is, it isn't here, you get that coroutine. 00:39:06.340 |
If we then modified this, okay, let's just remove all the asyncs here. 00:39:14.540 |
If we modify that like so, and then look at the SerpAPI structured tool again, 00:39:26.200 |
that is just the difference between an async structured tool versus a sync structured tool. 00:39:41.440 |
And of course, we run using the SerpAPI coroutine. 00:39:48.240 |
So that is how we build the SerpAPI tool. 00:39:54.640 |
And that is exactly what we did here, 00:39:56.980 |
so I don't think we need to go through it any further. 00:40:00.020 |
So yeah, I think that is basically all of our code behind this API. 00:40:11.380 |
Let's go ahead and actually run also our front end. 00:40:16.000 |
So we're going to go to Documents, Aurelio, LangChain course. 00:40:20.700 |
And then we want to go to chapter 09, the capstone app, and you will need to have npm installed. 00:40:32.480 |
On Mac, for example, this is probably what I would recommend: 00:40:36.060 |
I would run brew install node followed by brew install npm. 00:40:41.660 |
If you're on Linux or Windows, use your platform's equivalent. Once you have those, you can do npm install. 00:40:46.120 |
And this will just install all of the Node packages that we need. 00:41:00.520 |
And now we have our app running on localhost:3000. 00:41:04.100 |
So we can come over to here, open that up, and we have our application. 00:41:10.960 |
So in here, we can begin just asking questions. 00:41:18.700 |
And we see, so we have our streaming happening here. 00:41:24.720 |
It said the agent wants to use the add tool, and these are the input parameters to that add tool. 00:41:31.480 |
So this is the final answer tool where we're outputting that answer key and value. 00:41:37.160 |
And then here we're outputting that tools used key and value, which is just an array of the tools being used, which just functions add. 00:41:49.100 |
This time we'll trigger SerpAPI with "tell me about the latest news in the world". 00:41:56.080 |
So we can see it using SerpAPI, and the query is latest world news; then it comes down here and we actually get some citations, which is kind of cool. 00:42:32.100 |
Now let's continue with the next question from our notebook, which is how cold is it in Oslo right now? 00:42:40.260 |
And what do you get when multiplying those two numbers together? 00:42:43.500 |
I'm just going to modify that to say in Celsius so that I can understand it without thinking. 00:42:53.240 |
So, current temperature in Oslo, and we got multiply five by five, which is our second question. 00:43:01.980 |
Interesting, I don't know why it did that. 00:43:20.560 |
So the conversion from Fahrenheit to Celsius is, say, to subtract 32. 00:43:28.540 |
So to go from Fahrenheit to Celsius, you are basically doing Fahrenheit minus 32, 00:43:36.660 |
and then multiplying by this number here, five ninths, which I assume the AI approximated. 00:43:44.220 |
So subtracting 32 from 36 would have given us four, and it gave us approximately two. 00:43:49.220 |
So if you think about it, multiplying by five ninths is practically multiplying by 0.5, 00:43:54.780 |
so halving the value, and that would give us roughly two degrees. 00:44:03.260 |
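That back-of-the-envelope check can be written out directly (the function name here is just for illustration):

```python
def fahrenheit_to_celsius(f: float) -> float:
    # C = (F - 32) * 5/9; since 5/9 is roughly 0.5, subtracting 32 and
    # then halving gives a quick mental estimate
    return (f - 32) * 5 / 9


# 36 F: (36 - 32) * 5/9 = 4 * 0.555..., which is roughly 2 C
print(round(fahrenheit_to_celsius(36), 1))  # 2.2
```

So the agent's "approximately two degrees" for 36 Fahrenheit checks out.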
So we've gone through and seen how to build a fully fledged chat application using what we've learned. 00:44:16.360 |
If you think about this application, you're getting the real time updates on what tools 00:44:22.820 |
are being used, the parameters being input to those tools. 00:44:25.280 |
And then that is all being returned in a streamed output and even in a structured output for your 00:44:31.980 |
final answer, including the answer and the tools that we use. 00:44:35.040 |
So of course, you know, what we built here is fairly limited, but it's super easy to extend 00:44:42.020 |
Maybe something that you might want to go and do is take what we've built 00:44:46.780 |
here, fork this application, and just go and add different tools to it and see what you can do. 00:44:54.960 |
You can do a lot with it, but yeah, that is the end of the course. 00:44:59.540 |
Of course, this is just the beginning of whatever it is you're wanting to learn or build with 00:45:07.140 |
AI; treat this as the beginning and just go out and find all the other cool, interesting things out there. 00:45:14.880 |
So I hope this course has been useful, informative, and gives you an advantage in whatever it is you're building. 00:45:25.300 |
So thank you very much for watching and taking the course and sticking through right to the end. 00:45:30.820 |
I know it's pretty long, so I appreciate it a lot and I hope you get a lot out of it.