LangChain Agents Deep Dive with GPT 3.5 — LangChain #7
Chapters
0:00 Why LLMs need tools
2:35 What are agents?
3:33 LangChain agents in Python
4:25 Initializing a calculator tool
5:57 Initializing a LangChain agent
8:01 Asking our agent some questions
12:39 Adding more tools to agents
14:29 Custom and prebuilt tools
16:40 Francisco's definition of agents
17:52 Creating a SQL DB tool
19:49 Zero shot ReAct agents in LangChain
24:18 Conversational ReAct agent in LangChain
26:57 ReAct docstore agent in LangChain
28:31 Self-ask with search agent
30:33 Final thoughts on LangChain agents
Large language models are incredibly powerful, 00:00:40.360 |
that a simple calculator program can do this, 00:00:42.560 |
but what is probably the most sophisticated AI program 00:01:03.080 |
It's true that LangChain was a blockchain project. 00:01:06.280 |
Yet, there didn't seem to be any LLM chain component 00:01:32.180 |
that GPT-4 has seen is what it saw during its training. 00:01:50.600 |
One suite of potential solutions to these problems 00:01:57.500 |
These agents don't just solve many of the problems 00:02:00.540 |
we saw above, but actually many others as well. 00:02:17.220 |
and how we can use them within the LangChain library 00:02:30.380 |
for more of a deep dive into agents in LangChain. 00:02:46.480 |
or I might perform a Google search for information. 00:03:04.260 |
Let's start with a very simple example of this. 00:03:08.620 |
What we're going to do is build a calculator agent 00:03:12.880 |
that can also handle some general knowledge queries. 00:03:43.020 |
somewhere near the top of the video right now. 00:03:45.020 |
And we do a pip install of LangChain and OpenAI, 00:03:54.220 |
You can use Cohere, you can use Hugging Face models. 00:04:03.540 |
there'll probably be Google PaLM models in here as well. 00:04:19.260 |
Okay, I've already run this with my API key in there. 00:04:33.460 |
if you've watched previous videos in this series, 00:04:54.140 |
is we're formatting it into what we call a tool. 00:05:06.180 |
plus a name for that chain, for that function, 00:05:25.820 |
look, this is a tool that you can use, right? 00:05:30.100 |
and you should use it when you think this is relevant. 00:05:36.340 |
for when you need to answer questions about math, right? 00:05:50.780 |
So for now, there's just one item in that list, 00:05:53.860 |
but we can add multiple, as we'll see very soon. 00:05:59.620 |
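As a rough sketch, the calculator tool described here can be put together like this (assuming an older LangChain 0.0.x-style API and an OpenAI API key in the environment; exact imports may differ in newer releases):

```python
from langchain.llms import OpenAI
from langchain.chains import LLMMathChain
from langchain.agents import Tool

# base LLM (expects OPENAI_API_KEY in the environment)
llm = OpenAI(temperature=0)

# LLMMathChain turns a natural-language math question into a computed answer
llm_math = LLMMathChain(llm=llm)

# wrap the chain as a Tool: a name, the function the agent will call,
# and a description the agent uses to decide when the tool is relevant
math_tool = Tool(
    name="Calculator",
    func=llm_math.run,
    description="Useful for when you need to answer questions about math.",
)

tools = [math_tool]  # just one tool for now
```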
we need to initialize an agent that will contain them. 00:06:03.900 |
It's gonna contain like a base large language model, 00:06:13.140 |
And there are also these different types of agents, okay? 00:06:17.940 |
So this one is a zero-shot ReAct description agent. 00:06:35.860 |
you probably wouldn't necessarily use this one, 00:06:41.500 |
So for it being zero-shot, it doesn't have that memory 00:06:45.300 |
and it's focusing on the current action only. 00:07:04.580 |
and then it will pass some information to that action item. 00:07:08.660 |
The action item is the tools that we're using here. 00:07:12.020 |
And then it will get a response from the action item, 00:07:20.500 |
We're not gonna go too into detail on that here. 00:07:23.860 |
That kind of deserves its own video, I think. 00:07:39.420 |
One other important thing here is the max iterations parameter. 00:07:44.700 |
it can go through a loop of reason and action. 00:07:48.620 |
We add max iterations so that it doesn't just go 00:07:51.540 |
into an infinite loop of reasoning and making actions 00:07:58.260 |
you can do three of these loops and that cuts off. 00:08:07.820 |
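Putting that together, a minimal sketch of the agent initialization (the expression in the final call is just an illustrative query, not the exact one used in the video):

```python
from langchain.agents import initialize_agent

zero_shot_agent = initialize_agent(
    agent="zero-shot-react-description",  # the agent type described above
    tools=tools,                          # the list containing our calculator tool
    llm=llm,
    verbose=True,       # print the thought / action / observation loop
    max_iterations=3,   # stop after three reason-and-act loops
)

# illustrative math query
zero_shot_agent("what is (4.5 * 2.1) ** 2.2?")
```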
we can see the thought process of the agent here, right? 00:08:12.260 |
So saying, huh, okay, I need to calculate this expression. 00:08:20.460 |
What's the input I'm going to pass to this tool? 00:08:26.020 |
This is what we've asked it to calculate here. 00:08:37.860 |
Okay, and it says the next thought it has is, 00:08:49.420 |
So the input was this and the output we get is this, right? 00:08:55.060 |
Now the end user doesn't need to see all of this. 00:09:02.340 |
Set this to false, we'll just get the output, right? 00:09:11.660 |
Now, what you can see, I've already run this. 00:09:23.020 |
than when we're using large language models typically, 00:09:26.900 |
and they just give us like a completely wrong answer. 00:09:33.540 |
Now, here we're kind of using natural language 00:09:37.140 |
to kind of define the calculation that needs to happen. 00:09:52.060 |
It's just kind of simple logic and calculation here. 00:09:57.220 |
But naturally it needs to use both the calculator chain, 00:10:01.900 |
the LLM math chain, and also its own sort of LLM ability 00:10:24.700 |
And it says I need to add the apples Mary has 00:10:29.380 |
Okay, we do calculate again, it's four plus 20, 24. 00:10:33.980 |
And now it's like, I now have the final answer. 00:10:39.180 |
Cool, so we see the input and we see the output. 00:10:53.740 |
And there's probably a good chance it could do this. 00:10:57.140 |
It wouldn't go through the multiple reasoning steps, 00:10:59.980 |
but it might be able to do this in a single step. 00:11:08.420 |
Well, what if we ask this, what is the capital of Norway? 00:11:18.580 |
because it's going to go to the calculator, right? 00:11:29.300 |
That's because it's seeing what is the capital of Norway 00:11:41.420 |
but the only tool that it's been given is a calculator. 00:11:51.300 |
and trying to find another tool that it believes would help. 00:11:58.300 |
but it doesn't actually have access to a lookup tool. 00:12:06.540 |
the only tool I actually have here is a calculator. 00:12:09.860 |
And obviously you pass this to the calculator tool 00:12:11.900 |
and the answer is it's not going to work, right? 00:12:27.820 |
we only have a calculator for this agent to use. 00:12:31.140 |
It's not going to be able to answer this question. 00:12:32.900 |
But what if we do want it to be able to answer 00:12:58.540 |
So all we're going to do is create this LLM chain. 00:13:02.260 |
We're not really doing anything here. 00:13:11.460 |
Okay, and we're going to call this one the language model. 00:13:21.780 |
And we just add the new tool to our tools list, like this. 00:13:25.940 |
Right, so now we've got two tools in that tool list. 00:13:29.980 |
And we just reinitialize the agent with our two tools, 00:13:35.900 |
Okay, now we say, what is the capital of Norway? 00:13:48.940 |
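A sketch of that extra general-purpose LLM tool and the re-initialized agent (the prompt here simply passes the query straight through to the model; the tool name and description are illustrative):

```python
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.agents import Tool, initialize_agent

# a pass-through prompt: the tool just hands the query to the base LLM
prompt = PromptTemplate(input_variables=["query"], template="{query}")
llm_chain = LLMChain(llm=llm, prompt=prompt)

llm_tool = Tool(
    name="Language Model",
    func=llm_chain.run,
    description="Use this tool for general purpose queries and logic.",
)

tools.append(llm_tool)  # now two tools: Calculator and Language Model

zero_shot_agent = initialize_agent(
    agent="zero-shot-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
)

zero_shot_agent("What is the capital of Norway?")
```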
Okay, so we get the correct answer this time. 00:13:58.580 |
Okay, so it can answer math questions as well now. 00:14:08.180 |
that we would need separate LLM chains or chains for, 00:14:17.340 |
Francisco is in a moment going to go through, 00:14:32.980 |
because I kind of missed it for the sake of simplicity, 00:14:35.620 |
but it wouldn't be fair for me to not mention it, 00:14:38.220 |
is up here we defined our math tool, right? 00:14:48.940 |
and a set of pre-built tools that come with LangChain. 00:14:53.940 |
And to use those, we would write something like 00:15:01.940 |
And from there, we would just do tools, load tools. 00:15:11.620 |
And then here we actually pass a list of the tools 00:15:22.740 |
So we also need to pass LLM in there as well, right? 00:15:31.980 |
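A sketch of the prebuilt-tool route, assuming the "llm-math" tool name, which wraps the same LLMMathChain; tools that wrap a chain need the LLM passed in:

```python
from langchain.agents import load_tools

# "llm-math" is the prebuilt calculator tool; it needs the llm because
# it wraps an LLMMathChain under the hood
tools = load_tools(["llm-math"], llm=llm)

# tools[0].name / tools[0].description should match the hand-built Calculator tool
```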
If we look at the tool's name and the tool's description, 00:15:31.980 |
useful for when you need to answer questions about math. 00:15:48.380 |
it's going to show you a ton of other things as well, 00:15:54.380 |
which is very useful for when you're trying to show people 00:16:05.740 |
And we'll see that we actually get the same thing, 00:16:07.380 |
we have the same tool name, we have the same description. 00:16:17.500 |
So these two bits of code here do the exact same thing, 00:16:22.100 |
I'm just saying that you can initialize a tool by yourself 00:16:30.980 |
What we'll do is we'll pass it over to Francisco, 00:16:33.340 |
who's going to take us through these tools and agents 00:16:48.940 |
So it's really important that we get them right. 00:16:55.700 |
As always, we need to initialize our OpenAI LLM 00:17:04.660 |
So the official definition is that agents use LLMs 00:17:08.540 |
to determine which actions to take and in what order. 00:17:11.380 |
And an action can be using a tool or returning to the user. 00:17:23.580 |
to use several tools collectively and in unison 00:17:34.140 |
It's that the agents are reasoning about what tools 00:17:34.140 |
they need to use and how they need to use them 00:17:45.740 |
And this is a really, really powerful framework 00:17:56.220 |
exactly what we're doing to create this database. 00:18:12.220 |
Here we have a few stocks, two stocks actually, 00:18:15.220 |
with different prices at different dates and times. 00:18:15.220 |
And the important part where we will be creating 00:18:30.820 |
This is the engine we just created over here. 00:18:35.820 |
And we will create a database chain from that engine. 00:18:43.980 |
And here, just a small definition of what a tool is. 00:18:47.980 |
A tool is a function or a method that the agent can use 00:18:55.740 |
So how will the agent know if it's necessary? 00:19:00.460 |
So we're telling the agent when it should be used. 00:19:03.300 |
Here we're giving the agent the function it should run. 00:19:29.620 |
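A hedged sketch of that SQL tool, assuming a SQLAlchemy engine already pointing at the stock-price table; the tool name "Stock DB" and the description are illustrative, and SQLDatabaseChain has moved between packages across LangChain versions:

```python
from langchain.sql_database import SQLDatabase
from langchain.chains import SQLDatabaseChain
from langchain.agents import Tool

db = SQLDatabase(engine)  # wrap the SQLAlchemy engine created above
sql_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)

sql_tool = Tool(
    name="Stock DB",      # illustrative name
    func=sql_chain.run,   # the function the agent should run
    description=(
        "Useful for when you need to answer questions about stocks "
        "and their prices stored in the database."
    ),
)
```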
We will see different agent types in this deep dive. 00:19:51.860 |
And we will first need to initialize our agent. 00:19:57.540 |
And this agent will be able to basically reason 00:20:00.980 |
about our question, gather information from tools, 00:20:15.780 |
And we will append the SQL tool we saw before. 00:20:22.740 |
we will use this agent to perform zero-shot tasks. 00:20:24.900 |
So we will not have many interactions with this agent, 00:20:29.260 |
Or at least we'll have different interactions, 00:20:40.860 |
One note here, it's important to have in mind 00:20:44.180 |
that we should always set the max iterations. 00:20:49.820 |
basically means how many thoughts it can have 00:21:06.820 |
So just to avoid the agent getting into an infinite loop 00:21:16.100 |
And depending on the use case, that might change, 00:21:18.580 |
but it's something that is useful to take into account. 00:21:30.300 |
which is what is the multiplication of the ratio 00:21:32.220 |
between stock prices for ABC and XYZ on two different dates? 00:21:32.220 |
and we will try to understand what the agent is doing here. 00:21:48.620 |
It queries the SQL database for the 3rd and the 4th of January 00:21:57.020 |
Here, we can see that the query is generated, 00:22:02.340 |
And then the agent is getting the answer from this tool, 00:22:06.660 |
which returns the actual data that was requested, 00:22:14.740 |
is determining that these are the right prices, 00:22:23.300 |
So how will it make this calculation with the calculator? 00:22:32.780 |
and then when it has the two ratios with the calculator, 00:22:37.580 |
to calculate the multiplication between the two ratios. 00:22:49.220 |
and converging to the right answer by using these tools. 00:22:58.020 |
We can see the prompt, just as a quick pass here. 00:23:08.020 |
and then we're asking it to use this question, 00:23:11.180 |
thought, action, action input, observation framework, 00:23:11.180 |
or it reaches max iterations, which we referred to earlier. 00:23:32.740 |
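If you want to inspect that prompt yourself, the agent's template can usually be printed like this (the attribute path is from older LangChain versions and may differ):

```python
# shows the Question / Thought / Action / Action Input / Observation scaffold
print(zero_shot_agent.agent.llm_chain.prompt.template)
```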
is that it enables us to combine reasoning with tools, 00:23:44.580 |
So basically, the LLM now has the ability to reason 00:23:50.900 |
and basically, another thing that is important here 00:23:59.780 |
that the agent has already performed will be appended, 00:24:02.700 |
and thus, the agent will know at each point in time 00:24:12.260 |
So that's where the agent notes its previous thoughts 00:24:17.700 |
Okay, so now we're ready for our second type of agent. 00:24:22.020 |
This type of agent is really similar to the first one, 00:24:27.820 |
and this is basically the last agent, but with memory. 00:24:32.420 |
So we can interact with it in several interactions, 00:24:36.260 |
and ask it questions from things that we have already said. 00:24:43.980 |
Basically, it's the basis for chatbots in LangChain, 00:24:43.980 |
So we will ask it what the ratio of stock prices is 00:25:27.620 |
where we are telling the agent that it is an assistant, 00:25:31.380 |
that it can assist with a wide range of tasks, 00:25:36.540 |
and to answer questions or have conversations. 00:25:43.380 |
And also, we can see here we have a chat history variable, 00:25:47.660 |
where we will be including the memory for this agent. 00:25:57.500 |
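A minimal sketch of the conversational ReAct agent, reusing the same tools; the memory_key must match the chat_history variable in the prompt, and the import path for the memory class varies by LangChain version:

```python
from langchain.agents import initialize_agent
from langchain.memory import ConversationBufferMemory

# buffer memory that fills the {chat_history} variable in the agent's prompt
memory = ConversationBufferMemory(memory_key="chat_history")

conversational_agent = initialize_agent(
    agent="conversational-react-description",
    tools=tools,      # e.g. the SQL tool plus the calculator from before
    llm=llm,
    memory=memory,    # this is what gives the agent its conversation history
    verbose=True,
    max_iterations=3,
)
```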
as we asked the previous agent, let's see what happens. 00:26:23.660 |
So it made this decision of not using the calculator, 00:26:31.780 |
And these two agents sometimes don't behave exactly the same 00:26:47.860 |
But in essence, what they're doing is quite similar, 00:26:52.460 |
but this agent is including a conversational aspect 00:26:57.420 |
All right, now we will see our final two agents. 00:27:00.980 |
And this first one is called the React Docs Store agent, 00:27:03.660 |
and it is made to interact with a document store. 00:27:07.220 |
So let's say that we want to interact with Wikipedia, 00:27:13.340 |
one for searching articles, and the other one 00:27:16.100 |
is for looking up specific terms in the article it found. 00:27:22.140 |
what we do when we search a document store like Wikipedia. 00:27:24.860 |
We search for an article that might have the answer 00:27:27.500 |
to our question, and then we search within the article 00:27:30.500 |
for the specific paragraph or snippet where the answer is. 00:27:30.500 |
and we will run what were Archimedes' last words. 00:27:52.860 |
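A sketch of that docstore setup using LangChain's Wikipedia wrapper (it requires the `wikipedia` package; the tool names "Search" and "Lookup" are what the react-docstore agent expects, while the descriptions are illustrative):

```python
from langchain.agents import Tool, initialize_agent
from langchain.agents.react.base import DocstoreExplorer
from langchain.docstore.wikipedia import Wikipedia  # needs `pip install wikipedia`

docstore = DocstoreExplorer(Wikipedia())

docstore_tools = [
    Tool(name="Search", func=docstore.search,
         description="Search Wikipedia for an article"),
    Tool(name="Lookup", func=docstore.lookup,
         description="Look up a specific term in the article just found"),
]

docstore_agent = initialize_agent(
    agent="react-docstore", tools=docstore_tools, llm=llm,
    verbose=True, max_iterations=3,
)

docstore_agent("What were Archimedes' last words?")
```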
This observation is the first paragraph of the article. 00:28:09.900 |
We will not print it because it's really large, 00:28:13.340 |
And here is the paper for this agent as well. 00:28:20.100 |
for interacting with very large document sources. 00:28:20.100 |
And we can think here that this is basically the same thing 00:28:36.780 |
And basically, here the name is telling us a lot 00:28:42.180 |
It's asking questions, which are not the user's question, 00:28:42.180 |
that it's asking to understand all the pieces of data 00:28:57.700 |
So why will it ask all these follow-up questions 00:29:08.660 |
and then build to the last final answer to the user. 00:29:15.020 |
one tool, which is called the intermediate answer tool. 00:29:26.420 |
For that, you need an API key to use this API wrapper. 00:29:26.420 |
and we will see its prompt, which is basically enough 00:29:45.340 |
who lived longer, Muhammad Ali or Alan Turing? 00:29:50.260 |
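A sketch of the self-ask-with-search agent, assuming the SerpAPI wrapper mentioned here (you would need a SerpAPI key; the single tool must be named "Intermediate Answer", and the description is illustrative):

```python
from langchain.utilities import SerpAPIWrapper
from langchain.agents import Tool, initialize_agent

# needs `pip install google-search-results` and a SerpAPI key
search = SerpAPIWrapper(serpapi_api_key="...")

# this agent type expects exactly one tool, named "Intermediate Answer"
search_tools = [
    Tool(
        name="Intermediate Answer",
        func=search.run,
        description="Useful for searching the web for factual answers",
    ),
]

self_ask_agent = initialize_agent(
    agent="self-ask-with-search", tools=search_tools, llm=llm, verbose=True,
)

self_ask_agent("Who lived longer, Muhammad Ali or Alan Turing?")
```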
it determines that it needs follow-up questions, 00:30:06.620 |
So this is the kind of logic that the agent follows. 00:30:16.500 |
There might be others, but it needs to be able to search. 00:30:21.220 |
of this agent, which is that it gets intermediate answers 00:30:28.700 |
So you have the paper here also, if you want to dive deeper. 00:30:43.620 |
you can check out agent toolkits, for example. 00:31:15.660 |
how the agent is thinking, or what different calls 00:31:19.580 |
to different LLMs it made within its thought process. 00:31:24.380 |
you're using complex agents with several tools. 00:31:26.980 |
And it might be tricky to track the whole thought process 00:31:37.900 |
are using agents in a little bit more complex scenarios. 00:31:43.940 |
So this is it for agents here in the LangChain series. 00:31:51.740 |
I think, again, this is probably the most important topic