
LangChain Agents Deep Dive with GPT 3.5 — LangChain #7


Chapters

0:00 Why LLMs need tools
2:35 What are agents?
3:33 LangChain agents in Python
4:25 Initializing a calculator tool
5:57 Initializing a LangChain agent
8:01 Asking our agent some questions
12:39 Adding more tools to agents
14:29 Custom and prebuilt tools
16:40 Francisco's definition of agents
17:52 Creating a SQL DB tool
19:49 Zero shot ReAct agents in LangChain
24:18 Conversational ReAct agent in LangChain
26:57 ReAct docstore agent in LangChain
28:31 Self-ask with search agent
30:33 Final thoughts on LangChain agents

Whisper Transcript

00:00:00.000 | Large language models are incredibly powerful,
00:00:02.720 | as we've seen,
00:00:03.880 | but they lack some of the abilities
00:00:07.480 | that even the dumbest computer programs
00:00:09.840 | can handle with ease.
00:00:12.280 | Logic, calculations, and search
00:00:15.160 | are just a few examples
00:00:17.800 | of where large language models fail
00:00:20.160 | and really dumb computer programs
00:00:23.320 | can actually perform very well.
00:00:25.140 | We've been using computers
00:00:26.200 | to solve incredibly complex calculations
00:00:28.720 | for a very long time.
00:00:30.480 | Yet, if we ask GPT-4 to tell us the answer
00:00:34.240 | to what is 4.1 multiplied by 7.9,
00:00:37.840 | it actually fails.
00:00:39.520 | Isn't it fascinating
00:00:40.360 | that a simple calculator program can do this,
00:00:42.560 | but what is probably the most sophisticated AI program
00:00:46.940 | in the world right now
00:00:48.000 | that is accessible to us cannot?
00:00:50.760 | And that's not all.
00:00:51.720 | If I ask GPT-4,
00:00:53.840 | my somewhat overused example by now,
00:00:57.660 | of how do I use the LLMChain in LangChain,
00:01:01.240 | it struggles again.
00:01:03.080 | It's true that LangChain was a blockchain project.
00:01:06.280 | Yet, there didn't seem to be any LLMChain component
00:01:10.320 | nor LANG tokens.
00:01:12.680 | These are both hallucinations.
00:01:14.720 | Granted, the reason that GPT-4
00:01:17.080 | is unable to tell us about these
00:01:19.360 | is because it hasn't heard of LangChain,
00:01:23.720 | or at least not the LangChain I'm referring to.
00:01:25.560 | That is because GPT-4 has no connection
00:01:28.260 | to the outside world.
00:01:29.900 | The only part of the outside world
00:01:32.180 | that GPT-4 has seen is what it saw during its training.
00:01:37.180 | And the training data cutoff of GPT-4
00:01:40.540 | appears to be around September 2021.
00:01:43.580 | With what seem to be major weaknesses
00:01:46.420 | in today's large language models,
00:01:48.620 | we need to find solutions.
00:01:50.600 | One suite of potential solutions to these problems
00:01:54.900 | comes in the form of agents.
00:01:57.500 | These agents don't just solve many of the problems
00:02:00.540 | we saw above, but actually many others as well.
00:02:04.580 | In fact, by using agents,
00:02:06.340 | we actually have an almost unlimited upside
00:02:10.380 | in the potential of what we can do
00:02:12.740 | with large language models.
00:02:14.660 | So we're gonna learn what agents are
00:02:17.220 | and how we can use them within the LangChain library
00:02:20.700 | to superpower our large language models.
00:02:23.540 | What we'll do is I'll quickly go through
00:02:26.260 | an introduction to agents in LangChain,
00:02:28.660 | and then I'll hand it over to Francisco
00:02:30.380 | for more of a deep dive into agents in LangChain.
00:02:33.660 | So let's jump straight into it.
00:02:35.260 | We can think of agents as enabling tools
00:02:39.500 | for large language models.
00:02:41.300 | Kind of like how a human like me
00:02:44.100 | would use a calculator for maths,
00:02:46.480 | or I might perform a Google search for information.
00:02:49.540 | Agents allow a large language model
00:02:52.540 | to do the same thing.
00:02:54.460 | Using agents, a large language model
00:02:56.580 | can write and execute Python code.
00:02:59.340 | It can perform Google search,
00:03:01.340 | and it can even perform SQL queries.
00:03:04.260 | Let's start with a very simple example of this.
00:03:08.620 | What we're going to do is build a calculator agent
00:03:12.880 | that can also handle some general knowledge queries.
00:03:17.180 | Now to use agents in LangChain,
00:03:19.340 | we need three key components.
00:03:22.420 | That is a large language model
00:03:24.540 | or multiple large language models,
00:03:26.380 | a tool that we will be interacting with,
00:03:29.420 | and an agent to control that interaction.
00:03:32.940 | Let's start by installing LangChain
00:03:36.380 | and initializing our large language model.
00:03:39.500 | So we're in Colab here.
00:03:41.180 | There will be a link to this notebook
00:03:43.020 | somewhere near the top of the video right now.
00:03:45.020 | And we do a pip install of LangChain and OpenAI,
00:03:49.180 | because we're going to be using
00:03:50.500 | OpenAI's large language models here.
00:03:53.140 | But you can replace this.
00:03:54.220 | You can use Cohere, you can use Hugging Face models.
00:03:58.900 | And I'm not sure if it's implemented yet,
00:04:01.700 | but I'm sure pretty soon
00:04:03.540 | there'll probably be Google PaLM models in here as well.
00:04:06.660 | Okay, cool.
00:04:07.780 | So we need to first start by initializing
00:04:11.660 | our large language model.
00:04:13.380 | We're using text-davinci-003 here.
00:04:16.060 | You can replace that obviously
00:04:17.460 | with more recent models as well.
00:04:19.260 | Okay, I've already run this with my API key in there.
00:04:22.340 | So move on.
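As a rough sketch, that setup might look something like the following, assuming the LangChain v0.0.x-era API used at the time of the video; the API key is a placeholder:

```python
# Install the two libraries used in the video (in a notebook cell):
# !pip install langchain openai

from langchain.llms import OpenAI

# text-davinci-003 is the model used here; a newer model can be swapped in.
llm = OpenAI(
    openai_api_key="YOUR_OPENAI_API_KEY",  # placeholder, not a real key
    model_name="text-davinci-003",
    temperature=0,
)
```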
00:04:24.380 | Next, we want to initialize
00:04:27.620 | what is going to be a calculator tool
00:04:29.300 | using the LLM math chain.
00:04:32.300 | So the LLM math chain,
00:04:33.460 | if you've watched previous videos in this series,
00:04:36.460 | you've probably seen it.
00:04:38.940 | It is basically a large language model
00:04:42.380 | that will call to Python
00:04:45.820 | with some code for performing a calculation.
00:04:49.700 | Okay, that's what this is here, right?
00:04:52.580 | And what we're doing here
00:04:54.140 | is we're formatting it into what we call a tool.
00:04:57.460 | Now a tool is simply the run method,
00:05:02.060 | so the functionality of this LLM math chain,
00:05:06.180 | plus a name for that chain, for that function,
00:05:10.700 | and also a description.
00:05:11.940 | Now this description is useful.
00:05:13.700 | This is essentially the prompt.
00:05:16.540 | This will be included in the prompt
00:05:18.580 | for the agent executor large language model.
00:05:23.340 | And we're basically gonna say,
00:05:25.820 | look, this is a tool that you can use, right?
00:05:28.580 | It's called calculator,
00:05:30.100 | and you should use it when you think this is relevant.
00:05:34.620 | Okay, so we're saying it's useful
00:05:36.340 | for when you need to answer questions about math, right?
00:05:39.500 | That's how the large language model
00:05:41.820 | will know when to use this tool.
00:05:44.260 | Now, when we're passing those tools
00:05:45.980 | to the large language model,
00:05:47.020 | we will actually pass a list of tools, okay?
00:05:50.780 | So for now, there's just one item in that list,
00:05:53.860 | but we can add multiple, as we'll see very soon.
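A minimal sketch of that tool definition, assuming the v0.0.x-era `LLMMathChain` and `Tool` interfaces:

```python
from langchain.chains import LLMMathChain
from langchain.agents import Tool

# The math chain asks the LLM to write the calculation as code, then executes it.
llm_math = LLMMathChain(llm=llm, verbose=True)

# Wrap the chain as a tool: its run function, a name, and a description.
# The description is what the agent reads to decide when this tool is relevant.
math_tool = Tool(
    name="Calculator",
    func=llm_math.run,
    description="Useful for when you need to answer questions about math.",
)

# Agents take a list of tools; for now there is just this one.
tools = [math_tool]
```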
00:05:56.940 | Now, when we're using these tools,
00:05:59.620 | we need to initialize an agent that will contain them.
00:06:03.900 | It's gonna contain like a base large language model,
00:06:07.100 | which is gonna be in control.
00:06:08.620 | It's kind of like the orchestrator,
00:06:10.700 | the top level large language model.
00:06:13.140 | And there are also these different types of agents, okay?
00:06:17.940 | So this one is a zero-shot react description agent.
00:06:22.220 | That means a few things.
00:06:24.340 | So zero-shot part of that means
00:06:27.220 | that the agent is currently looking
00:06:29.140 | at just the current prompt.
00:06:31.780 | It doesn't have any memory, right?
00:06:34.020 | So if you've got a chatbot,
00:06:35.860 | you probably wouldn't necessarily use this one,
00:06:38.460 | but we'll cover some alternatives later on.
00:06:41.500 | So for it being zero-shot, it doesn't have that memory
00:06:45.300 | and it's focusing on the current action only.
00:06:48.380 | React is like an agent framework.
00:06:53.500 | The idea behind that is that we,
00:06:57.100 | or the agent will reason about
00:07:00.060 | whatever prompt has been assigned to it.
00:07:02.500 | It will decide on an action to take,
00:07:04.580 | and then it will pass some information to that action item.
00:07:08.660 | The action item is the tools that we're using here.
00:07:12.020 | And then it will get a response from the action item,
00:07:15.260 | that tool, and then repeat this process
00:07:18.020 | of reasoning and action.
00:07:20.500 | We're not gonna go too into detail on that here.
00:07:23.860 | That kind of deserves its own video, I think.
00:07:26.820 | And as for which action to take,
00:07:29.460 | the agent is basing that on the description
00:07:33.380 | of each of the tools that we have.
00:07:35.220 | Okay, so let's initialize that.
00:07:39.420 | One other important thing here is these max iterations.
00:07:42.580 | So I kind of mentioned just now,
00:07:44.700 | it can go through a loop of reason and action.
00:07:48.620 | We add max iterations so that it doesn't just go
00:07:51.540 | into an infinite loop of reasoning and making actions
00:07:54.020 | and so on and so on, just forever, right?
00:07:56.460 | So we kind of limit it to,
00:07:58.260 | you can do three of these loops and that cuts off.
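In code, that initialization might look roughly like this; the query at the end is a hypothetical example:

```python
from langchain.agents import initialize_agent

zero_shot_agent = initialize_agent(
    agent="zero-shot-react-description",  # reason + act, no memory between queries
    tools=tools,
    llm=llm,
    verbose=True,       # show the thought / action / observation trace
    max_iterations=3,   # cap the reason-and-act loop
)

# Hypothetical example query:
zero_shot_agent("what is (4.5 * 2.1) ** 2.2?")
```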
00:08:01.820 | Okay, cool.
00:08:02.660 | So let's ask some questions.
00:08:05.820 | So the zero shot agent,
00:08:07.820 | we can see the thought process of the agent here, right?
00:08:12.260 | So saying, huh, okay, I need to calculate this expression.
00:08:16.420 | So what is the action I need to take?
00:08:18.460 | I need to use a calculator.
00:08:20.460 | What's the input I'm going to pass to this tool?
00:08:24.540 | It is this here, right?
00:08:26.020 | This is our, what we've asked it to calculate here.
00:08:29.460 | Okay, and then the response from that agent
00:08:33.540 | from the LLM math chain is this here.
00:08:36.260 | Okay, this number.
00:08:37.860 | Okay, and it says the next thought it has is,
00:08:40.580 | I now know the final answer.
00:08:42.580 | It is this, right?
00:08:45.100 | And then we finished the chain
00:08:46.820 | or the agent execution chain.
00:08:49.420 | So the input was this and the output we get is this, right?
00:08:55.060 | Now the end user doesn't need to see all of this.
00:08:57.820 | The reason that we're seeing all of this
00:08:59.700 | is because we set verbose equal to true.
00:09:02.340 | Set this to false, we'll just get the output, right?
00:09:06.700 | But obviously we're developing this.
00:09:09.260 | We want to see what is actually happening.
00:09:11.660 | Now, what you can see, I've already run this.
00:09:15.020 | The output, the actual value here
00:09:19.540 | is in fact this number here.
00:09:21.220 | Okay, so that's much better
00:09:23.020 | than when we're using large language models typically,
00:09:26.900 | and they just give us like a completely wrong answer.
00:09:30.220 | All right, so that looks good.
00:09:31.660 | Let's try something else.
00:09:33.540 | Now, here we're kind of using natural language
00:09:37.140 | to kind of define the calculation that needs to happen.
00:09:42.140 | We say Mary has four apples.
00:09:44.260 | Giorgio brings two and a half apple boxes,
00:09:47.380 | where an apple box contains eight apples.
00:09:49.620 | How many apples do we have, right?
00:09:52.060 | It's just kind of simple logic and calculation here.
00:09:57.220 | But naturally it needs to use both the calculator chain,
00:10:01.900 | the LLM math chain, and also its own sort of LLM ability
00:10:06.540 | to reason about what is happening here.
00:10:09.060 | So it starts by I need to figure out
00:10:11.700 | how many apples are in the boxes, right?
00:10:14.060 | So it's doing this step by step,
00:10:15.500 | which I think is really cool.
00:10:17.300 | So it says there's two and a half boxes,
00:10:19.780 | and each box contains eight apples.
00:10:21.900 | So there's 20 apples in the boxes.
00:10:24.700 | And it says I need to add the apples Mary has
00:10:27.460 | to the apples in the boxes.
00:10:29.380 | Okay, we do calculate again, it's four plus 20, 24.
00:10:33.980 | And now it's like, I now have the final answer.
00:10:37.180 | We have 24 apples.
00:10:39.180 | Cool, so we see the input and we see the output.
00:10:42.580 | We have 24 apples, great.
00:10:44.700 | All of this so far that we've done here,
00:10:47.100 | a LLM math chain could probably do,
00:10:51.940 | it could definitely do this.
00:10:53.740 | And there's probably a good chance it could do this.
00:10:57.140 | It wouldn't go through the multiple reasoning steps,
00:10:59.980 | but it might be able to do this in a single step.
00:11:03.060 | So with that in mind,
00:11:05.620 | why don't we just use the LLM math chain?
00:11:08.420 | Well, what if we ask this, what is the capital of Norway?
00:11:12.180 | An LLM math chain by itself
00:11:17.220 | is not going to be able to answer this
00:11:18.580 | because it's going to go to the calculator, right?
00:11:22.620 | So it says, I need to look up the answer.
00:11:25.540 | Okay, so right now it's, what is it doing?
00:11:27.580 | This is something new.
00:11:29.300 | That's because it's seeing what is the capital of Norway
00:11:32.860 | in the base or the agent executor's prompt.
00:11:36.140 | It's saying to answer this prompt,
00:11:38.460 | you need to use one of the following tools,
00:11:41.420 | but the only tool that we're given is a calculator.
00:11:43.580 | The LLM knows that a calculator
00:11:45.940 | isn't going to give it the answer
00:11:47.620 | to what is the capital of Norway.
00:11:49.020 | So it's actually hallucinating
00:11:51.300 | and trying to find another tool that it believes would help.
00:11:55.420 | And a lookup tool would help here,
00:11:58.300 | but it doesn't actually have access to a lookup tool.
00:12:00.780 | It's just kind of imagining that it does,
00:12:02.900 | it's hallucinating that it does.
00:12:04.460 | Okay, so it's like, okay,
00:12:06.540 | the only tool I actually have here is a calculator.
00:12:09.860 | And obviously you pass this to the calculator tool
00:12:11.900 | and the answer is it's not going to work, right?
00:12:16.220 | So it actually just says action input NA.
00:12:19.740 | Like I can't, I don't even know
00:12:21.180 | what to give to this calculator.
00:12:23.460 | And then we'll get a value error.
00:12:24.980 | Okay, right, that's fine.
00:12:26.620 | It's kind of expected,
00:12:27.820 | we only have a calculator for this agent to use.
00:12:31.140 | It's not going to be able to answer this question.
00:12:32.900 | But what if we do want it to be able to answer
00:12:34.900 | general knowledge questions
00:12:36.340 | as well as perform calculations?
00:12:38.780 | Well, okay, in that case,
00:12:41.380 | what we need to do is add another tool
00:12:43.700 | to the agent's toolbox.
00:12:47.140 | What we're going to do is just initialize
00:12:49.940 | a simple LLM chain, right?
00:12:52.340 | So to answer what is the capital of Norway,
00:12:56.020 | a simple LLM can do that, right?
00:12:58.540 | So all we're going to do is create this LLM chain.
00:13:02.260 | We're not really doing anything here.
00:13:04.100 | We just got a prompt template,
00:13:06.060 | which is just going to pass the query
00:13:08.580 | straight to the LLM chain.
00:13:11.460 | Okay, and we're going to call this one the language model.
00:13:15.940 | And we're going to say, use this tool
00:13:17.620 | for general purpose queries and logic.
00:13:20.260 | Okay, cool.
00:13:21.780 | And we just add the new tool to our tools list, like this.
00:13:25.940 | Right, so now we've got two tools in that tool list.
00:13:29.980 | And we just reinitialize the agent with our two tools,
00:13:33.060 | the calculator and the language model.
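A sketch of that second, general-purpose tool and the re-initialized agent, again assuming the v0.0.x-era API and the `llm` and `tools` from the earlier sketches:

```python
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.agents import Tool, initialize_agent

# A pass-through prompt: the user query goes straight to the LLM.
prompt = PromptTemplate(input_variables=["query"], template="{query}")
llm_chain = LLMChain(llm=llm, prompt=prompt)

llm_tool = Tool(
    name="Language Model",
    func=llm_chain.run,
    description="Use this tool for general purpose queries and logic.",
)

tools.append(llm_tool)  # the tools list now holds Calculator and Language Model

zero_shot_agent = initialize_agent(
    agent="zero-shot-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
)

zero_shot_agent("What is the capital of Norway?")
```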
00:13:35.900 | Okay, now we say, what is the capital of Norway?
00:13:39.700 | And it should say, ah, okay,
00:13:41.980 | I can refer to the language model for this.
00:13:44.900 | And we say, what is the capital of Norway?
00:13:47.420 | The capital of Norway is Oslo.
00:13:48.940 | Okay, so we get the correct answer this time.
00:13:51.180 | And yeah, I now know the final answer,
00:13:53.740 | capital of Norway is Oslo.
00:13:55.180 | Now we can ask it a math question.
00:13:56.820 | Okay, what is this?
00:13:58.580 | Okay, so it can answer math questions as well now.
00:14:01.340 | So all of a sudden our agent is able to do
00:14:06.100 | two completely different things
00:14:08.180 | that we would need separate LLM chains or chains for,
00:14:12.940 | which I think is really cool.
00:14:14.340 | And these are two super simple examples.
00:14:17.340 | Francisco is in a moment going to go through,
00:14:20.500 | I think what far more interesting examples
00:14:22.940 | and you'll definitely see more
00:14:25.460 | of how we can use these agents in LangChain.
00:14:29.740 | But before I finish,
00:14:30.980 | there is just one thing I should point out
00:14:32.980 | because I kind of missed it for the sake of simplicity,
00:14:35.620 | but it wouldn't be fair for me to not mention it,
00:14:38.220 | is up here we defined our math tool here, right?
00:14:44.220 | In reality, there is already a math tool
00:14:48.940 | and a set of pre-built tools that come with LLM chain.
00:14:53.940 | And to use those, we would write something like
00:14:56.940 | from LLM chain, agents, import load tools.
00:15:01.940 | And from there, we would just do tools, load tools.
00:15:11.620 | And then here we actually pass a list of the tools
00:15:14.500 | that we would like to load.
00:15:15.740 | So if we just want the LLM math chain again,
00:15:18.420 | we'd write that.
00:15:19.260 | Now the LLM math chain does require an LLM.
00:15:22.740 | So we also need to pass the LLM in there as well, right?
00:15:27.740 | So let me, let's run this again.
00:15:31.980 | If we look at tools, name and tools description,
00:15:37.820 | (keyboard clicking)
00:15:40.740 | we see we have calculator,
00:15:43.780 | useful for when you need to answer questions about math.
00:15:46.620 | And if you print out a full tool list,
00:15:48.380 | it's going to show you a ton of other things as well,
00:15:50.540 | including your OpenAI API key,
00:15:54.380 | which is very useful for when you're trying to show people
00:15:58.580 | what is in there.
00:15:59.700 | So if we try the same again with this,
00:16:02.740 | we can, let's just copy.
00:16:05.740 | And we'll see that we actually get the same thing,
00:16:07.380 | we have the same tool name, we have the same description.
00:16:09.460 | And if you print out the full thing,
00:16:10.860 | you'll also see all of the parameters
00:16:13.300 | that define the tool and chain being used.
00:16:17.500 | So these two bits of code here do the exact same thing,
00:16:21.260 | right?
00:16:22.100 | I'm just saying that you can initialize a tool by yourself
00:16:24.780 | or you can use the pre-built tools as well.
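In code, the prebuilt route might look like this; the resulting tool is equivalent to the hand-built Calculator above:

```python
from langchain.agents import load_tools

# "llm-math" is the prebuilt calculator; it needs the LLM to generate the code.
tools = load_tools(["llm-math"], llm=llm)

print(tools[0].name)         # Calculator
print(tools[0].description)  # Useful for when you need to answer questions about math.
```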
00:16:27.500 | Now, I think I've talked for long enough.
00:16:30.980 | What we'll do is we'll pass it over to Francisco,
00:16:33.340 | who's going to take us through these tools and agents
00:16:36.820 | in a lot more detail.
00:16:38.060 | So over to Francisco.
00:16:40.620 | - Thanks James for that introduction.
00:16:42.460 | And now we will be diving into agents.
00:16:45.820 | Agents are arguably the most important
00:16:47.980 | building block in LangChain.
00:16:48.940 | So it's really important that we get them right.
00:16:51.780 | And we'll be seeing a few examples
00:16:53.900 | to really understand how they work.
00:16:55.700 | As always, we need to initialize our OpenAI LLM
00:17:01.820 | and we will get into the definition here.
00:17:04.660 | So the official definition is that agents use LLMs
00:17:08.540 | to determine which actions to take and in what order.
00:17:11.380 | And an action can be using a tool or returning to the user.
00:17:16.220 | But if we think about it more intuitively,
00:17:20.380 | what an agent does is it applies reasoning
00:17:23.580 | to use several tools collectively and in unison
00:17:28.580 | to build an answer for the user.
00:17:31.660 | So this is the key behind agents.
00:17:34.140 | It's that the agents are reasoning about what tools
00:17:38.060 | they need to use and how they need to use them
00:17:41.140 | and how they need to combine their outputs
00:17:43.500 | to actually give the right answer.
00:17:45.740 | And this is a really, really powerful framework
00:17:49.060 | as we will see in the next examples.
00:17:52.220 | So now we will be creating a database.
00:17:54.660 | It's not really important to understand
00:17:56.220 | exactly what we're doing to create this database.
00:17:59.500 | The important thing here is that we're going
00:18:01.260 | to build an SQL database with one table,
00:18:04.580 | which is a table with stocks.
00:18:07.420 | And we will add some observations
00:18:09.180 | because we will want our agent
00:18:10.460 | to interact with these observations.
00:18:12.220 | Here we have a few stocks, two stocks actually,
00:18:15.220 | with the different prices in different date times.
00:18:18.340 | And the important part where we will be creating
00:18:22.340 | the tool for the agent to use.
00:18:24.860 | So here we create a chain which uses
00:18:28.700 | the database we just created.
00:18:30.820 | This is the engine we just created over here.
00:18:35.820 | And we will create a database chain from that engine.
00:18:40.900 | Now we will build a tool.
00:18:43.980 | And here, just a small definition of what a tool is.
00:18:47.980 | A tool is a function or a method that the agent can use
00:18:52.980 | when it thinks it is necessary.
00:18:55.740 | So how will the agent know if it's necessary?
00:18:58.860 | Well, we're giving it a description here.
00:19:00.460 | So we're telling the agent when it should be used.
00:19:03.300 | Here we're giving the agent the function it should run.
00:19:07.380 | And here's the name.
00:19:09.420 | So the agent will ask our chain
00:19:13.460 | for a question about stocks and prices.
00:19:16.620 | And our chain will answer that question
00:19:20.820 | using the data from the database.
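A rough sketch of that database tool, assuming the v0.0.x-era `SQLDatabaseChain` and a SQLAlchemy `engine` for the stocks table created just before:

```python
from langchain import SQLDatabase, SQLDatabaseChain
from langchain.agents import Tool

# Wrap the existing SQLAlchemy engine as a LangChain database and chain.
db = SQLDatabase(engine)
sql_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)

# Expose the chain as a tool; the description tells the agent when to use it.
sql_tool = Tool(
    name="Stock DB",
    func=sql_chain.run,
    description="Useful for when you need to answer questions about stocks and their prices.",
)
```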
00:19:23.700 | So just a few clarifications
00:19:27.460 | before we dive into our first agent type.
00:19:29.620 | We will see different agent types in this deep dive.
00:19:32.980 | But each one has three variables.
00:19:35.500 | So we have to define the tools.
00:19:37.260 | We want to give the agent the LLM
00:19:40.180 | because all agents are LLM based.
00:19:43.260 | And the agent type.
00:19:44.460 | So what type of agent we want to use.
00:19:47.900 | We will start with this agent
00:19:49.740 | which is a zero-shot React agent.
00:19:51.860 | And we will first need to initialize our agent.
00:19:57.540 | And this agent will be able to basically reason
00:20:00.980 | about our question, gather information from tools,
00:20:04.740 | and answer.
00:20:05.620 | That's what the zero-shot agent does.
00:20:09.060 | So here we will load one tool,
00:20:11.540 | which is the LLM math tool,
00:20:13.140 | which we already saw with James.
00:20:15.780 | And we will append the SQL tool we saw before.
00:20:21.460 | And as the name suggests here,
00:20:22.740 | we will use this agent to perform zero-shot tasks.
00:20:24.900 | So we will not have many interactions with this agent,
00:20:28.060 | but only one.
00:20:29.260 | Or at least we'll have different interactions,
00:20:31.700 | but they will be isolated from each other.
00:20:34.220 | And we will be asking questions
00:20:36.740 | that can be answered with the tools,
00:20:38.460 | and the agent will help us answer them.
00:20:40.860 | One note here, it's important to have in mind
00:20:44.180 | that we should always set the max iterations.
00:20:46.620 | So the max iterations with React agents
00:20:49.820 | basically means how many thoughts it can have
00:20:53.460 | about our question and about using tools.
00:20:57.860 | So basically what this agent does
00:21:00.100 | is it thinks about what it needs to do,
00:21:02.060 | and then it does it.
00:21:03.220 | And one of the actions it can do
00:21:04.580 | is refer to one of the tools.
00:21:06.820 | So just to avoid the agent getting into an infinite loop
00:21:10.780 | and using tools indefinitely,
00:21:12.900 | we should always set a max iterations
00:21:14.620 | that we're comfortable with.
00:21:16.100 | And depending on the use case, that might change,
00:21:18.580 | but it's something that is useful to take into account.
00:21:21.500 | Here, we will set max iterations to three.
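Putting those pieces together, the setup described here might look roughly like this, reusing the `sql_tool` from the sketch above:

```python
from langchain.agents import load_tools, initialize_agent

tools = load_tools(["llm-math"], llm=llm)  # the calculator tool
tools.append(sql_tool)                     # plus the Stock DB tool from above

zero_shot_agent = initialize_agent(
    agent="zero-shot-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,  # limit the reason-and-act loop
)
```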
00:21:23.820 | All right, so here we will create our agent,
00:21:28.140 | and we will ask it a very complex question,
00:21:30.300 | which is what is the multiplication of the ratio
00:21:32.220 | between stock prices for ABC and XYZ in two different dates?
00:21:36.300 | So it involves quite a lot of steps,
00:21:39.020 | and we will try to understand what the agent is doing here.
00:21:41.700 | So first it needs to compare the prices,
00:21:45.300 | or it knows it needs to compare these prices
00:21:47.660 | on two different days.
00:21:48.620 | It queries the SQL database for the 3rd and the 4th of January
00:21:54.100 | with the prices for ABC and XYZ.
00:21:57.020 | Here, we can see that the query is generated,
00:22:00.580 | and we get the results.
00:22:02.340 | And then the agent is getting the answer from this tool,
00:22:06.660 | which returns the actual data that was requested,
00:22:09.740 | which is this piece here.
00:22:12.460 | Now, the agent, with this information,
00:22:14.740 | is determining that these are the right prices,
00:22:19.700 | and now it needs to make a calculation.
00:22:23.300 | So how will it make this calculation with the calculator?
00:22:26.340 | Let's see what it's doing.
00:22:28.220 | It's calculating the ratio first
00:22:30.220 | of the two prices on each day,
00:22:32.780 | and then when it has the two ratios with the calculator,
00:22:35.620 | it's using the calculator another time
00:22:37.580 | to calculate the multiplication between the two ratios.
00:22:40.140 | So this framework is really powerful,
00:22:43.140 | because as I said before,
00:22:44.460 | it's combining reasoning with the tools,
00:22:46.860 | and in several steps, it's finding
00:22:49.220 | and converging to the right answer by using these tools.
00:22:51.900 | So this is the key takeaway here
00:22:54.460 | for this Zero-Shot React description agent.
00:22:58.020 | We can see the prompt, just as a quick pass here.
00:23:01.620 | What we're teaching it to do
00:23:03.500 | is we're sending it, obviously, the tools
00:23:06.460 | that it needs to use,
00:23:08.020 | and then we're asking it to use this question,
00:23:11.180 | thought, action, action, input, observation framework,
00:23:14.860 | and this framework can be repeated n times
00:23:18.900 | until it knows the final answer,
00:23:20.660 | or it reaches max iterations, which we referred to earlier.
00:23:25.140 | This is a very powerful framework.
00:23:26.980 | You can check it out in the MRKL paper,
00:23:29.700 | which we'll link below,
00:23:30.700 | but the important thing here to know
00:23:32.740 | is that it enables us to combine reasoning with tools,
00:23:36.980 | and this is the type of things
00:23:38.700 | that you can do with agents.
00:23:41.260 | The level of abstraction is much higher
00:23:42.940 | than using a tool in isolation.
00:23:44.580 | So basically, the LLM now has the ability to reason
00:23:48.940 | how to best use this tool,
00:23:50.900 | and basically, another thing that is important here
00:23:54.100 | is the agent's scratchpad within the prompt,
00:23:57.180 | and that is where every thought and action
00:23:59.780 | that the agent has already performed will be appended,
00:24:02.700 | and thus, the agent will know at each point in time
00:24:06.260 | what it has found out until that moment,
00:24:08.860 | and will be able to continue
00:24:10.580 | that thought process from there.
00:24:12.260 | So that's where the agent notes its previous thoughts
00:24:16.340 | during the thought process.
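If you want to look at that prompt and scratchpad placeholder yourself, something like the following worked on v0.0.x-era agents (the attribute path is version-dependent):

```python
# Inspect the agent's underlying prompt template.
print(zero_shot_agent.agent.llm_chain.prompt.template)
# The template lists the available tools, then the
# Question / Thought / Action / Action Input / Observation loop,
# and ends with an {agent_scratchpad} placeholder where previous
# thoughts and observations are appended on every iteration.
```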
00:24:17.700 | Okay, so now we're ready for our second type of agent.
00:24:22.020 | This type of agent is really similar to the first one,
00:24:24.180 | so let's take a look.
00:24:26.020 | This is called conversational react,
00:24:27.820 | and this is basically the last agent, but with memory.
00:24:32.420 | So we can interact with it in several interactions,
00:24:36.260 | and ask it questions from things that we have already said.
00:24:39.740 | It is really useful if you want to have a chatbot.
00:24:43.980 | Basically, it's the basis for chatbots in LangChain,
00:24:48.380 | so it's really useful.
00:24:51.140 | Again, we load the same tools here.
00:24:53.020 | We will add memory, as we said,
00:24:56.260 | so we will ask it a similar question,
00:24:58.260 | a bit different, a bit simpler,
00:25:00.060 | than we asked the previous agent.
00:25:01.180 | So we will ask it what the ratio of stock prices is
00:25:03.900 | on January the 1st for these two stocks.
00:25:06.180 | Let's see what it answers.
00:25:07.980 | So it's getting the stock data,
00:25:11.420 | and it calculated without the calculator,
00:25:14.060 | since it's a simple calculation,
00:25:16.740 | the actual ratio we need.
00:25:18.620 | So let's see the prompt here.
00:25:19.780 | So we will see the prompt is quite similar,
00:25:23.620 | but it includes this prefix, we could say,
00:25:27.620 | where we are telling the agent that it is an assistant,
00:25:31.380 | that it can assist with a wide range of tasks,
00:25:33.980 | and basically that it's here to assist
00:25:36.540 | and to answer questions or have conversations.
00:25:40.100 | So this is the main thing behind this agent.
00:25:43.380 | And also, we can see here we have a chat history variable,
00:25:47.660 | where we will be including the memory for this agent.
00:25:50.980 | So these are the main differences.
00:25:55.060 | If we ask it exactly the same question
00:25:57.500 | as we asked the previous agent, let's see what happens.
00:26:01.300 | It's using the chain, it's getting the data,
00:26:05.740 | but we are getting the right answer,
00:26:10.340 | but the action input for the chain
00:26:14.020 | was already to get the ratio.
00:26:16.940 | And then it didn't use the tool to multiply,
00:26:19.380 | it just multiplied on its own, right,
00:26:22.420 | without using a tool.
00:26:23.660 | So it made this decision of not using the calculator,
00:26:27.540 | whereas our previous agent had decided
00:26:29.940 | to use the calculator here.
00:26:31.780 | And these two agents sometimes don't behave exactly the same
00:26:35.700 | because the prompts are different.
00:26:37.660 | Perhaps for this agent, we're telling it
00:26:40.100 | that it can solve a wide range of tasks,
00:26:43.260 | so maybe it's getting the confidence
00:26:46.180 | to try out some math there.
00:26:47.860 | But in essence, what they're doing is quite similar,
00:26:52.460 | but this agent is including a conversational aspect
00:26:55.740 | and a memory aspect.
00:26:57.420 | All right, now we will see our final two agents.
00:27:00.980 | And this first one is called the ReAct docstore agent,
00:27:03.660 | and it is made to interact with a document store.
00:27:07.220 | So let's say that we want to interact with Wikipedia,
00:27:10.900 | and we will need to send it two tools,
00:27:13.340 | one for searching articles, and the other one
00:27:16.100 | is for looking up specific terms in the article it found.
00:27:20.460 | And this is, if we think about it,
00:27:22.140 | what we do when we search a document store like Wikipedia.
00:27:24.860 | We search for an article that might have the answer
00:27:27.500 | to our question, and then we search within the article
00:27:30.500 | for the specific paragraph or snippet where the question is.
00:27:34.740 | So we will do exactly that.
00:27:37.220 | We will initialize this agent here,
00:27:38.820 | and we will run what were Archimedes' last words.
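Roughly, the two Wikipedia tools and the docstore agent might be set up like this, using the `DocstoreExplorer` wrapper that existed in v0.0.x LangChain:

```python
from langchain import Wikipedia
from langchain.agents import Tool, initialize_agent
from langchain.agents.react.base import DocstoreExplorer

docstore = DocstoreExplorer(Wikipedia())

# The react-docstore agent expects exactly these two tools: Search and Lookup.
tools = [
    Tool(name="Search", func=docstore.search,
         description="search wikipedia for an article"),
    Tool(name="Lookup", func=docstore.lookup,
         description="look up a term within the article already found"),
]

docstore_agent = initialize_agent(
    agent="react-docstore",
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
)

docstore_agent("What were Archimedes' last words?")
```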
00:27:42.420 | Let's see.
00:27:43.820 | So it's entering the chain.
00:27:45.580 | It searches for Archimedes.
00:27:47.420 | That's interesting.
00:27:48.820 | And let's see.
00:27:50.340 | So it has its first observation.
00:27:52.860 | This observation is the first paragraph of the article.
00:27:56.060 | It didn't find the answer there, and then it
00:27:58.980 | looks up last words, and it finds
00:28:00.780 | the answer within that document.
00:28:03.380 | So that's basically how it works.
00:28:05.820 | We can see more about this in the prompt.
00:28:08.860 | There's a few examples.
00:28:09.900 | We will not print it because it's really large,
00:28:12.380 | but you can do so.
00:28:13.340 | And here is the paper for this agent as well.
00:28:16.820 | So yeah, this agent is basically useful
00:28:20.100 | for interacting with a very large document store.
00:28:23.020 | And we can think here that this is basically the same thing
00:28:26.980 | that we would do.
00:28:28.220 | So the agent does it for us, basically.
00:28:31.300 | And finally, we have this last agent,
00:28:33.820 | which is called the self-ask with search.
00:28:36.780 | And basically, here the name is telling us a lot
00:28:40.260 | about how this agent works.
00:28:42.180 | It's asking questions, which are not the user question,
00:28:47.620 | obviously.
00:28:48.460 | But they are intermediate questions
00:28:50.060 | that it's asking to understand all the pieces of data
00:28:52.980 | it needs to build the last answer it
00:28:55.820 | needs to give to the user.
00:28:57.700 | So why will it ask all these follow-up questions
00:29:01.420 | on the user's original query?
00:29:03.660 | Because it will use a search engine
00:29:06.260 | to find the intermediate answers,
00:29:08.660 | and then build to the last final answer to the user.
00:29:12.660 | So as we can see here, we need to send it
00:29:15.020 | one tool, which is called the intermediate answer tool.
00:29:18.140 | And it will search.
00:29:19.980 | It must be some kind of search.
00:29:22.340 | Here, we are using Google Search.
00:29:24.100 | We will not showcase this functionality.
00:29:26.420 | For that, you need an API key to use this API wrapper.
00:29:30.820 | But we will initialize this agent,
00:29:33.660 | and we will see its prompt, which is basically enough
00:29:36.780 | to understand how it works.
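A sketch of that setup, assuming a SerpAPI key and the v0.0.x `SerpAPIWrapper` as the Google Search backend:

```python
from langchain import SerpAPIWrapper
from langchain.agents import Tool, initialize_agent

# Google Search via SerpAPI; running this requires a real API key.
search = SerpAPIWrapper(serpapi_api_key="YOUR_SERPAPI_KEY")  # placeholder key

# The self-ask agent expects a single tool named exactly "Intermediate Answer".
tools = [
    Tool(name="Intermediate Answer", func=search.run,
         description="google search"),
]

self_ask_agent = initialize_agent(
    agent="self-ask-with-search",
    tools=tools,
    llm=llm,
    verbose=True,
)

# Its few-shot prompt shows the follow-up question / intermediate answer pattern:
print(self_ask_agent.agent.llm_chain.prompt.template)
```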
00:29:39.620 | So here, we can see how it works.
00:29:42.420 | So the original question would be,
00:29:45.340 | who lived longer, Muhammad Ali or Alan Turing?
00:29:47.940 | And the follow-up questions needed here,
00:29:50.260 | it determines that it needs follow-up questions,
00:29:52.260 | and it starts asking them.
00:29:53.340 | So how old was Muhammad Ali when he died?
00:29:55.700 | Intermediate answer found by searching.
00:29:58.300 | Muhammad Ali was 74 years old.
00:30:00.900 | How old was Alan Turing?
00:30:02.740 | Alan Turing was 41 years old.
00:30:04.620 | So the final answer is Muhammad Ali.
00:30:06.620 | So this is the kind of logic that the agent follows.
00:30:09.620 | And to get these intermediate answers,
00:30:11.620 | it needs some kind of search tool.
00:30:14.140 | This is one option that LangChain provides.
00:30:16.500 | There might be others, but it needs to be able to search.
00:30:19.340 | And that's the main characteristic
00:30:21.220 | of this agent, which is that it gets intermediate answers
00:30:25.140 | to follow-up questions by searching.
00:30:28.700 | So you have the paper here also, if you want to dive deeper.
00:30:32.260 | And this is it.
00:30:34.540 | We're wrapping up for generic agents.
00:30:36.860 | These are the main agents in LangChain.
00:30:39.220 | There are others as well.
00:30:40.900 | And as I mentioned briefly earlier,
00:30:43.620 | you can check out agent toolkits, for example.
00:30:46.020 | There will be others in the future too.
00:30:48.100 | It's really worthwhile to see the docs
00:30:51.180 | and follow closely the developments
00:30:54.180 | that there are in LangChain.
00:30:56.900 | And there are things you can do.
00:30:58.100 | You can create your own agent.
00:30:59.660 | You can use agents with several other tools.
00:31:03.660 | And another thing worth mentioning
00:31:05.540 | is that you can use a tracing UI tool that
00:31:09.300 | is within LangChain, which will allow
00:31:12.140 | you to understand within a beautiful UI
00:31:15.660 | how the agent is thinking, or what different calls
00:31:19.580 | to different LLMs it did within its thought process.
00:31:22.540 | So that is really convenient when
00:31:24.380 | you're using complex agents with several tools.
00:31:26.980 | And it might be tricky to track the whole thought process
00:31:30.980 | and what intermediate answers it got.
00:31:34.940 | So that is really recommended if you
00:31:37.900 | are using agents in a little bit more complex scenarios.
00:31:43.940 | So this is it for agents here in LangChain series.
00:31:48.660 | And I hope you really enjoyed this topic.
00:31:51.740 | I think, again, this is probably the most important topic
00:31:56.100 | in LangChain and the most interesting one.
00:31:58.420 | So yeah, see you in the next one.
00:32:00.900 | [MUSIC PLAYING]