LangChain Expression Language (LCEL) Explained!
Chapters
0:0 LangChain Expression Language LCEL
1:6 Getting Started with LCEL
6:11 How LCEL Pipe Operator Works
12:21 Using LangChain Expression Language
14:16 LCEL Runnables
19:36 LCEL Runnable Lambdas
23:36 Pros and Cons of LCEL
00:00:00.000 |
Today we're going to be talking about the LangChain Expression Language, 00:00:03.360 |
which is a pretty interesting idea that essentially allows us to write very 00:00:08.440 |
minimalist code to build chains within LangChain. 00:00:13.960 |
And for sure, I think we'll see from this video, 00:00:17.360 |
we can use a lot of LangChain's more advanced features like parallel execution, 00:00:21.960 |
async, and streaming very easily using the expression language rather than just 00:00:27.840 |
the more typical approach to building LangChain chains. And in my opinion, 00:00:34.400 |
I think we'll see that just using this, you can build stuff very quickly. 00:00:38.800 |
That's not to say it doesn't have its cons, but we'll dive into those later. 00:00:42.680 |
So let's just begin with what this expression language actually is. 00:00:47.960 |
So there's a page here in the LangChain docs talking about this expression language. 00:00:58.760 |
And yeah, they just explain a few things, you know, 00:01:03.360 |
streaming, async, parallel execution, and so on. Right. 00:01:06.480 |
But let's just jump into this notebook and we'll see more of how this actually works. 00:01:11.080 |
So there'll be a link to this notebook, as there usually is, at the top of the video. 00:01:16.680 |
And I've written all this in Colab, so you can do the same. 00:01:21.800 |
It's pretty straightforward. We have a few prerequisites. 00:01:25.920 |
We're going to be using LangChain, of course. 00:01:27.960 |
We're going to be using Anthropic, the new Claude 2.1 model for our LLM. 00:01:34.480 |
We're going to be using Cohere for the embeddings, 00:01:39.440 |
just so I can give you an example of parallel retrieval later on, 00:01:45.560 |
the main things that I think we want to use the expression language for 00:01:56.400 |
We have those advanced features: streaming, async, and parallel execution 00:02:00.640 |
just work out of the box, and they're super fast and easy to set up. 00:02:04.760 |
And there's also easy integration with the other LangChain products, 00:02:09.240 |
so LangSmith and LangServe, if you are using those. 00:02:13.200 |
Now let's take a look at what it actually looks like. 00:02:18.000 |
So to get started with this, we're going to need an Anthropic API key, 00:02:21.480 |
and you can get that by going to console.anthropic.com. 00:02:27.680 |
Hopefully you have an account already and you can click get API keys. 00:02:31.840 |
And you're just going to get your API keys from that. 00:02:35.880 |
I think there's still like a very minor wait list. 00:02:40.560 |
I just recommend you sign up and you'll get access pretty soon, 00:02:45.240 |
but so that you're not waiting, you can also just use OpenAI. 00:02:48.760 |
So you would just swap ChatAnthropic here with 00:02:53.960 |
ChatOpenAI, and swap the Anthropic API key for the OpenAI API key. 00:02:59.080 |
And if you do that, you will also want to 00:03:03.000 |
just drop these two arguments to make things easier. 00:03:05.480 |
So looking at this, let's see, we'll put our API key in here. 00:03:10.920 |
And once we have that, we now have these three components. 00:03:14.880 |
We have a prompt, a model, which is a chat model, and an output parser. 00:03:23.080 |
we would chain these together using the LLMChain. Okay. 00:03:26.600 |
So you can see LLMChain: our prompt, the LLM, and the output parser. 00:03:37.240 |
We're asking it to give us a small report about a particular topic. Okay. 00:03:42.400 |
So the input to that is going to be topic. And you can see that here. 00:03:49.080 |
and it's obviously just going to output a small report on that. Okay. 00:03:54.680 |
So let's run that and see what we get. So it's running, 00:04:03.480 |
and we get this small report on AI. 00:04:12.200 |
Now, how would we do that with the expression language? Well, we use this, 00:04:17.680 |
and I'm going to go into detail as to how this actually functions, 00:04:21.800 |
because I think that understanding how this pipe operator functions allows us 00:04:26.800 |
to just understand what is actually happening here. Okay. 00:04:29.600 |
So that we can actually understand this abstraction rather than just blindly 00:04:39.360 |
right? So we have a prompt followed by the model followed by output parser, 00:04:42.880 |
rather than putting them into an LLM chain or some other chain, 00:04:45.720 |
we just string them together with this pipe operator. 00:04:49.920 |
So, for sure, if I look at this, 00:04:54.000 |
it's simpler than this, right? If you compare those two, 00:04:58.880 |
it's, I would say, also more flexible because we can just string things together. 00:05:08.600 |
It's not as Pythonic as what we're used to; 00:05:11.560 |
whether that is a good or bad thing, I'm undecided. 00:05:17.480 |
I really like the minimalist approach here. It looks great, 00:05:22.800 |
but it's maybe hard to understand. 00:05:26.600 |
If you don't know the syntax, even if you understand Python very well, 00:05:30.880 |
this can be pretty confusing. Anyway, let's run that. 00:05:35.080 |
So we create our chain using this new expression language syntax. 00:05:43.320 |
we run invoke and we pass a dictionary of input variables into that. 00:05:47.880 |
So we run this and yeah, it's going to do the exact same thing. 00:05:54.080 |
We get a very similar output to what we saw before. Okay. 00:05:58.080 |
So it gives us a little report again. Okay. Looks cool. 00:06:06.640 |
doing the exact same thing, just different syntax. 00:06:10.200 |
Now, when I saw that syntax with the pipe operator for the first time, 00:06:17.000 |
I was quite confused, and I think most people would be confused. 00:06:22.920 |
At least the idea behind how it works can be explained very easily. 00:06:27.600 |
Whatever is on the left of each pipe operator, 00:06:32.160 |
the output from that gets passed to what is on the right of the pipe operator. 00:06:37.080 |
Okay. And then the output from this is passed into this. So 00:06:42.320 |
it's literally piping things from the left of the pipe operators all the way 00:06:47.400 |
through to the right of the pipe operators. 00:07:02.840 |
It's probably a little bit hacky in my opinion, but it's kind of interesting. 00:07:06.760 |
So this pipe operator, when we apply it to an object in Python, 00:07:11.680 |
what it actually looks for within the object is this __or__ method here. 00:07:20.280 |
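
As a minimal sketch of that lookup, here is a toy class whose only job is to show that Python turns `left | right` into `left.__or__(right)`. The `Step` class and its names are hypothetical and are not taken from the notebook or from LangChain.

```python
class Step:
    """Toy class (hypothetical) that records how `|` was dispatched."""

    def __init__(self, name):
        self.name = name

    def __or__(self, other):
        # Python evaluates `left | right` by calling `left.__or__(right)`.
        return Step(f"({self.name} | {other.name})")


a, b, c = Step("a"), Step("b"), Step("c")

piped = a | b | c                  # operator syntax, evaluated left to right
explicit = a.__or__(b).__or__(c)   # the same calls written out by hand

print(piped.name)     # ((a | b) | c)
print(explicit.name)  # ((a | b) | c)
```

The nesting in the printed name shows the left-to-right grouping: `a | b` is built first, and the result is then piped into `c`.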
we have this kind of confusing class called Runnable, 00:07:24.360 |
but let's break it down a little bit. Okay. So I'm going to define a class, 00:07:27.640 |
and I'm still going to call it Runnable. 00:07:34.640 |
We'll see the __init__ method here. And within that, 00:07:40.360 |
Because the way that we're going to implement this is we're going to give a 00:07:46.960 |
And we're going to use this class to transform this function into something that 00:07:58.080 |
our Runnable class or object. And then the next thing you see, 00:08:03.040 |
this is the part that makes the pipe operator work. Okay. 00:08:08.000 |
So when a pipe operator is applied to an object, 00:08:11.360 |
it's going to look for the object's __or__ method. Now, 00:08:20.440 |
function that we call other here. Now, the way that you can think of 00:08:24.480 |
the func and the other arguments here is that func is 00:08:31.480 |
And other is what is on the right of the pipe. Okay. 00:08:35.480 |
So what we do is we create this chained function here, 00:08:40.760 |
which is going to consume a set of arguments and keyword arguments. 00:08:44.320 |
So you can call it chained func as we do there, 00:08:52.480 |
Now the reason that we set it up with args and keyword arguments like this is 00:08:55.960 |
because we don't know the names of the parameters that are going to be input 00:09:00.400 |
into our function. All right. So by doing this, we can, you know, 00:09:07.080 |
We can have more or less, and this chained function will be able to handle 00:09:19.920 |
that consumes the output from our function. Okay. 00:09:24.600 |
And again, that function is going to take the, 00:09:32.800 |
So from that, we would then return the Runnable here. 00:09:37.520 |
So this is going to be our Runnable version of that chained 00:09:52.960 |
within each one of the functions that we pass through this actual chain. Okay. 00:09:57.560 |
So we can do multiple of these. So we could have, you know, 00:10:06.840 |
the final thing that we need to have here is a method that 00:10:21.360 |
I think they use invoke. So rather than __call__, 00:10:25.400 |
they would have invoke here and that starts the chain, 00:10:29.240 |
but I'm just going to use __call__ because I think it's simpler. 00:10:32.520 |
So that is our Runnable class. We can run that. 00:10:36.440 |
And I also have it here. Maybe I'll just run this one. 00:10:40.000 |
And what we want to do is use this Runnable to wrap around different 00:10:44.760 |
functions that we would like to run with this pipe operator approach. To do that, 00:10:49.200 |
we're going to define two very simple functions here. 00:10:53.360 |
One is add five, one is multiply by two. Okay. 00:10:56.560 |
So let's run those and I'm going to wrap those with this Runnable 00:11:00.960 |
object that we've created and then using this 00:11:10.440 |
the chain we're going to do add five and then rather than using the pipe 00:11:15.200 |
operator, I'm going to use the __or__ method directly. 00:11:17.880 |
And then within that __or__ method, I'm going to pass our, 00:11:24.960 |
So we have those and then we can just call our chain. 00:11:28.560 |
So we pass three to it and we get a value of 16, which is great. 00:11:37.160 |
Three plus five gives us eight, and we multiply that by two. 00:11:41.840 |
Okay. So it's correct. It's running in the correct order. 00:11:45.000 |
Now we can use this syntax or now that we use this 00:11:51.160 |
we can also use the syntax that we see here with the pipe operator. 00:11:55.680 |
So let's try that. Okay. And you can see that we now have this. 00:12:00.560 |
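
Putting the walkthrough together, the class and the two test functions might look roughly like this. It is a reconstruction from the spoken description, not a copy of the notebook and not LangChain's actual implementation; the snake_case names (`chained_func`, `add_five`, `multiply_by_two`, and the `*_runnable` variables) are assumed spellings.

```python
class Runnable:
    """Toy re-creation of the class described above (not LangChain's code)."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `self` is the left side of the pipe, `other` is the right side.
        def chained_func(*args, **kwargs):
            # Run the left-hand function, then feed its output to the right.
            return other(self.func(*args, **kwargs))

        return Runnable(chained_func)

    def __call__(self, *args, **kwargs):
        # Calling the object runs the wrapped (possibly chained) function.
        return self.func(*args, **kwargs)


def add_five(x):
    return x + 5


def multiply_by_two(x):
    return x * 2


add_five_runnable = Runnable(add_five)
multiply_by_two_runnable = Runnable(multiply_by_two)

# Calling __or__ directly and using the pipe operator build the same chain.
chain_explicit = add_five_runnable.__or__(multiply_by_two_runnable)
chain_piped = add_five_runnable | multiply_by_two_runnable

print(chain_explicit(3))  # 16
print(chain_piped(3))     # 16
```

Because `__or__` returns another `Runnable`, the result can be piped again, which is why a chain can have any number of steps.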
So that's pretty interesting. So we can, you know, 00:12:04.720 |
we can build our own pipe operator functions 00:12:08.880 |
using this, and this is what LangChain is doing. Okay. 00:12:12.480 |
So when we see this LangChain Expression Language, 00:12:17.400 |
which is an interesting way of putting things together. 00:12:22.600 |
Let's have a look at how we actually use the expression language itself. 00:12:33.400 |
Now let's put it together in an actual use case. 00:12:36.440 |
So I'm going to be using the Cohere embedding model. 00:12:40.560 |
You can also use OpenAI's embedding model; it's up to you, 00:12:45.920 |
but to get that API key, I don't think there's a wait list for Cohere. 00:12:49.680 |
So you should be able to jump straight into it. 00:12:51.720 |
You can go to dashboard.cohere.com. You'd go to API keys. 00:12:58.560 |
you can create either a trial key or production key and you just use that. 00:13:02.640 |
So I'm going to add mine in here and I'm going to be using the Cohere embedding 00:13:09.560 |
which is a very high-performance embedding model. 00:13:12.640 |
I'm going to be using that to create two small document stores that we 00:13:22.720 |
they're very small. It's just for an example. 00:13:24.960 |
We have one where we have half the information in vector store A 00:13:29.520 |
and half the information in vector store B. You'll see why soon, 00:13:34.360 |
but for now, what we're going to do is use the first one. Okay. 00:13:41.320 |
So it contains information about when my birthday is. 00:13:44.920 |
The other one contains the year of my birthday. 00:13:53.320 |
putting information into the vector store or retrieving 00:14:00.120 |
and then feeding that alongside the original query into 00:14:04.720 |
a chain using the expression language. Now, when we do this, 00:14:09.520 |
there's one important thing that we need to be aware of, 00:14:21.360 |
we have one input and one output to each of these 00:14:26.200 |
items, right? Each of these components. So how, you know, 00:14:32.200 |
we have a context that we need to use here and also a question that we need to 00:14:37.680 |
And the way that we do that is by using this RunnableParallel 00:14:45.120 |
We have RunnableParallel and RunnablePassthrough. The RunnableParallel, 00:14:56.720 |
components in parallel and also extract multiple 00:15:00.760 |
values from them, right? So here we're going to run retriever A, 00:15:12.480 |
What RunnablePassthrough does is whatever was input into the 00:15:25.080 |
So it's literally a pass through for values that you pass into here. 00:15:33.360 |
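
To make the data flow concrete, here is a plain-Python sketch of the idea: one input value is fanned out to several named branches, and the result is a dictionary that a prompt with two variables can consume. Everything here (`fan_out`, `passthrough`, `fake_retriever_a`, the returned text) is a hypothetical stand-in; it runs the branches one after another and does not use or reproduce LangChain's RunnableParallel or RunnablePassthrough.

```python
def passthrough(value):
    # Stand-in for the pass-through idea: return the input unchanged.
    return value


def fake_retriever_a(query):
    # Hypothetical retriever: a real one would search a vector store.
    return ["James' birthday is the 7th of December"]


def fan_out(**branches):
    """Build a function that sends one input to every named branch."""

    def run(value):
        # Each branch receives the same input; outputs are keyed by name.
        return {key: branch(value) for key, branch in branches.items()}

    return run


retrieval = fan_out(context=fake_retriever_a, question=passthrough)
prompt_inputs = retrieval("When was James born?")

print(prompt_inputs["context"])   # the retrieved records
print(prompt_inputs["question"])  # the original query, passed straight through
```

The point of the sketch is the shape of the output: a single query goes in, and a dictionary with both `context` and `question` comes out, which solves the one-input, one-output limitation described above.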
So we have our retriever A here that we're using. 00:15:37.160 |
We have our prompt template, so on and so on, right? 00:15:40.760 |
We have our retrieval that happens first. So we have a query. 00:15:44.280 |
When was I born? We're going to invoke that. 00:15:48.640 |
And this value is being passed into our retriever. 00:15:55.400 |
It's also being passed to here and going straight through to our prompt. Okay. 00:15:59.400 |
So then our prompt gets formatted with the question we have, 00:16:03.400 |
when was James born, and with the context. 00:16:07.280 |
We will have the records from here. Okay. So vector store A, 00:16:18.680 |
unfortunately I do not have enough context to definitively state when James is 00:16:26.000 |
It found this little bit of information. So it knows when my birthday is, 00:16:29.400 |
but it does not specify the year that I was born. Okay. 00:16:32.840 |
So it can't actually fully answer the question, 00:16:39.920 |
It's going through the retrieval component, then our prompt, model, output parser, 00:16:45.920 |
Now the cool thing with RunnableParallel, as you might have 00:16:50.520 |
guessed from what we have here, is that it can run many things in 00:16:55.320 |
parallel, not just a retriever and, you know, passing through a question. 00:16:59.560 |
We can actually run multiple retrievers in parallel, 00:17:02.560 |
or we can run multiple different components in parallel at the same time. 00:17:06.680 |
And this is one of the things that is very cool about the expression language: 00:17:12.680 |
we set these things up in RunnableParallel here, and 00:17:17.080 |
it's just going to run them in parallel, right? 00:17:21.680 |
We don't have to deal with building or writing any of that code ourselves, 00:17:26.080 |
which is, I think, pretty cool. So let's come down to here. 00:17:30.800 |
What I'm going to do is now that we're going to be retrieving information from 00:17:35.200 |
two places, I'm going to create a context A and a context B. 00:17:39.440 |
We're going to run that, and we're going to initialize the prompt, 00:17:42.880 |
then our RunnableParallel. Now we need to modify it a little bit. 00:17:49.680 |
we're now mapping that to context A and we have retriever B, 00:17:53.720 |
which we're going to map over to context B. And then as before, 00:17:57.200 |
we have our question, which is the RunnablePassthrough. 00:18:05.040 |
same. We still just have one retrieval component there; now 00:18:10.720 |
both our retrievers are being run in parallel within that abstraction. 00:18:14.240 |
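
The kind of code we get to skip writing could look something like the sketch below, which runs two retrievers concurrently with a thread pool and formats the combined result into a prompt. This is one plausible way to do it by hand, under stated assumptions: the retriever functions, their sleep delays, the `run_parallel` helper, and the prompt wording are all hypothetical, and nothing here is LangChain's implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def retriever_a(query):
    time.sleep(0.1)  # pretend this is a slow vector store lookup
    return "James' birthday is the 7th of December"


def retriever_b(query):
    time.sleep(0.1)  # pretend this is a second slow lookup
    return "James was born in 1994"


def passthrough(value):
    return value


def run_parallel(branches, value):
    """Run every branch on the same input at once and collect named results."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        # Submit all branches first so they overlap, then gather the results.
        futures = {key: pool.submit(fn, value) for key, fn in branches.items()}
        return {key: future.result() for key, future in futures.items()}


# Hypothetical prompt template with two context slots and one question slot.
PROMPT = (
    "Answer the question using only the context below.\n\n"
    "Context A: {context_a}\n"
    "Context B: {context_b}\n\n"
    "Question: {question}"
)

start = time.perf_counter()
inputs = run_parallel(
    {"context_a": retriever_a, "context_b": retriever_b, "question": passthrough},
    "When was James born?",
)
elapsed = time.perf_counter() - start

prompt_text = PROMPT.format(**inputs)
print(prompt_text)
print(f"retrieval took {elapsed:.2f}s")
```

Since both lookups are submitted before either result is awaited, the waits overlap instead of adding up, which is the benefit the expression language gives without any of this boilerplate.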
So we're going to run that. And now I'm going to ask the same question. 00:18:18.640 |
When was I born? Okay. So now it knows, based on the context provided, 00:18:27.560 |
It's the second document, the page content: James born in 1994. 00:18:32.880 |
And maybe if I want to kind of say, okay, give me the date as well. 00:18:43.680 |
And we actually get this, which is odd, because it says 00:18:50.080 |
the given context does not provide definitive information to answer the question. 00:19:03.120 |
I don't know; there's a little bit of a lack of reasoning ability with 00:19:11.000 |
So my birthday is 7th of December and I was born in 1994. 00:19:16.640 |
I don't know why; it's kind of surprising to me that it didn't get that, 00:19:22.400 |
but at least we can see that our chain is working correctly. 00:19:25.760 |
We can see that it's pulling in information from both our retrievers there, 00:19:30.760 |
And we're almost done with what I think are the essentials of the expression 00:19:36.640 |
There's just one more thing that I think is super important, and it's basically 00:19:41.520 |
LangChain's abstraction for doing what I showed you earlier, 00:19:45.000 |
where we created our own sort of Runnable class and fed functions into it to 00:19:49.760 |
create these things that we can run with the pipe operator. 00:19:53.320 |
So to do that in LangChain, they have RunnableLambda. Okay. 00:20:00.280 |
I called that class a Runnable because here they call them RunnableLambdas. 00:20:04.320 |
So we have our add five and our multiply by two. 00:20:09.800 |
I'm going to just come up here and show you what we had earlier. So yeah, 00:20:14.040 |
we have these two functions. Let's take those. Okay. 00:20:18.800 |
So what we were doing before so that we could use this, let's do it again here. 00:20:25.560 |
All right. So we have our add five and our multiply by two. Let's run this. 00:20:29.720 |
This time we're doing runnables, but we're just doing them through LangChain. 00:20:33.480 |
So our chain is going to be add five, then multiply by two, as we did before. 00:20:43.720 |
we have to use invoke rather than just calling the object directly. 00:20:48.400 |
So we run that and yes, as before we get 16. So yeah, 00:20:54.080 |
we can wrap our own functions using LangChain's 00:20:58.080 |
RunnableLambda here. Now, when would we use that? I mean, 00:21:02.400 |
there are definitely different scenarios where I might want to use that, 00:21:06.560 |
but let me just show you something here, which, you know, 00:21:12.640 |
And it's a good example of where we might want to use this. Either use this, 00:21:16.800 |
or we probably want to adjust the output parser as well. So we have, 00:21:26.920 |
some leading whitespace here that we could do with removing, 00:21:34.120 |
Here's a short fact about artificial intelligence. 00:21:36.800 |
And then we have two newline characters. 00:21:39.000 |
Maybe I don't want that. And I just want it to get straight to the fact. 00:21:43.120 |
So what I can do is use this RunnableLambda abstraction 00:21:48.680 |
to do that, right? So I'm going to define a function, 00:21:52.320 |
which is going to look within the string for a double newline within the 00:22:02.640 |
And we're going to take everything that occurs after the double newline. 00:22:07.360 |
Now, in the case that maybe there are multiple double newlines, 00:22:14.640 |
from index one to the end of the list that we would get from this. 00:22:18.160 |
And then we're joining everything back here. Okay. 00:22:20.880 |
So we're basically just dropping that first one, the first part here. 00:22:24.520 |
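
A function along those lines could be sketched as follows. The name `get_fact`, the choice to rejoin the remaining parts with a double newline, and the sample model output are assumptions made for illustration; the notebook's exact code may differ.

```python
def get_fact(text):
    """Drop the preamble before the first blank line, if there is one."""
    if "\n\n" in text:
        # Split on double newlines, skip the first part (the preamble),
        # and join the rest back together. The join separator is an assumption.
        return "\n\n".join(text.split("\n\n")[1:])
    # No double newline found: return the text unchanged.
    return text


# Hypothetical model output with leading whitespace and a preamble line.
raw_output = (
    " Here's a short fact about artificial intelligence:\n\n"
    "AI systems learn patterns from data.\n\n"
    "They do not need every rule written by hand."
)

print(get_fact(raw_output))
```

Because it is an ordinary function that takes a string and returns a string, it is exactly the kind of thing that can be wrapped in a RunnableLambda and piped after the output parser.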
So let's run that. I'm going to wrap that within a RunnableLambda, 00:22:28.360 |
and then I'm going to pull all those things together. 00:22:31.640 |
And I'm going to add the get fact runnable to the end of my chain. 00:22:36.600 |
Now let's invoke again and see what we get. Okay. 00:22:47.320 |
And you know, we see with both of those, it works. 00:22:51.000 |
So our little RunnableLambda here worked well. 00:22:55.240 |
Okay. So that is really everything I wanted to cover with the 00:23:00.720 |
expression language. You know, I think 00:23:03.600 |
there's other things that we can talk about and more to cover, but this is, 00:23:07.920 |
I think, pretty much everything you need to really get started with it and just 00:23:13.240 |
understand what this abstraction is actually doing, which, like I said, 00:23:18.080 |
is important to understand because then at least we know what we're 00:23:22.440 |
doing rather than just kind of putting in these pipe operators 00:23:26.920 |
and kind of thinking they should work when maybe we're doing something that 00:23:32.160 |
So I hope this has been useful for understanding the expression language. 00:23:35.920 |
You know, there's pros and there's cons to using this. Now on the pros, 00:23:40.520 |
obviously there's the minimalist style of the code, which is kind of nice. 00:23:44.280 |
It's very clean, and there's the out-of-the-box support for different features like 00:23:49.120 |
streaming and the parallel execution that we saw, 00:23:54.920 |
there's plenty of people that are less fond of the expression language. You 00:23:59.560 |
know, it's a big change, so it's to be expected. 00:24:02.200 |
Now the things that people point to when they're like, 00:24:05.960 |
this doesn't make sense, is that it makes things more abstract. 00:24:10.440 |
LangChain is, you know, abstractions already. 00:24:13.600 |
So we're kind of adding another abstraction to the abstractions 00:24:20.480 |
it's definitely not common syntax for Python and that kind of goes against the 00:24:27.240 |
which is that we kind of shouldn't make special cases for things. 00:24:31.200 |
And of course it's a new syntax, especially when you first look at it. 00:24:35.160 |
I think once you've explored it a little bit, it makes sense. 00:24:37.840 |
But when you first get started with it, it's definitely confusing. 00:24:44.560 |
I think both of those viewpoints are entirely valid. 00:24:48.080 |
There's pros and cons, for sure. But I like it. I think 00:24:53.880 |
it's definitely worth learning and experimenting with, 00:24:56.640 |
and it can definitely speed things up, particularly when you're 00:25:00.360 |
prototyping. And maybe in production code, you know, 00:25:05.120 |
it's going to depend on what you're wanting to do. 00:25:11.960 |
I hope all this has been useful in understanding the expression language. 00:25:16.440 |
So thank you very much for watching and I will see you again in the next one.