
LangChain Expression Language (LCEL) Explained!


Chapters

0:00 LangChain Expression Language (LCEL)
1:06 Getting Started with LCEL
6:11 How LCEL Pipe Operator Works
12:21 Using LangChain Expression Language
14:16 LCEL Runnables
19:36 LCEL Runnable Lambdas
23:36 Pros and Cons of LCEL

Transcript

Today we're going to be talking about the LangChain Expression Language, which is a pretty interesting idea that essentially allows us to write very minimalist code to build chains within LangChain. I think we'll see in this video that we can use a lot of LangChain's more advanced features, like parallel execution, async, and streaming, very easily using the expression language, rather than the more typical approach to building LangChain chains.

And in my opinion, it's worth trying just for that. I think we'll see that just using this, you can build stuff very quickly. That's not to say it doesn't have its cons, but we'll dive into those later. So let's begin with what this expression language actually is. There's a page in the LangChain docs talking about it.

So it's LCEL for short. They explain a few things there: streaming, async, parallel execution, and so on. But let's jump into the notebook and see more of how it actually works. There'll be a link to the notebook, as there usually is, at the top of the video right now.

I've written all of this in Colab, so you can do the same; it's pretty straightforward. We have a few prerequisites: we're going to be using LangChain, of course; Anthropic, with the new Claude 2.1 model, for our LLM; Cohere for the embeddings; and docarray, just so I can give you an example of parallel retrieval later on, which is super interesting.

Now, the main things I think we want to use the expression language for are these three items here. First, super fast development of chains. Second, the advanced features, streaming, async, and parallel execution, just work out of the box and are super fast and easy to set up.

Third, there's easy integration with the other LangChain products, LangSmith and LangServe, if you're using those. Now let's take a look at what it actually looks like. To get started, we're going to need an Anthropic API key, which you can get by going to console.anthropic.com.

You'd come into here, hopefully you already have an account, and click Get API Keys to grab your key. If you don't have an Anthropic account, I think there's still a very minor waitlist, so I'd recommend you sign up now and you'll get access pretty soon. But so that you're not waiting, you can also just use OpenAI.

In that case you'd just swap ChatAnthropic here for ChatOpenAI and swap the Anthropic API key for an OpenAI API key. If you do that, you'll also want to drop these two arguments to make things easier. So, looking at this, we'll put our API key in here.

Once we have that, we have these three components: a prompt, a model (a chat model), and an output parser. In typical LangChain, we would chain these together using the LLMChain, so you can see the LLMChain takes your prompt, the LLM, and the output parser.

What I'm going to do is take this prompt, which asks for a small report about a particular topic. So the input to it is going to be "topic", and you can see that here: we have the topic "artificial intelligence", and it's obviously just going to output a small report on that.
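As a rough sketch of that classic setup (the exact prompt wording and model arguments here are my assumptions, not copied from the notebook):

```python
from langchain.chat_models import ChatAnthropic
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.chains import LLMChain

# the three components: a prompt template, a chat model, and an output parser
prompt = ChatPromptTemplate.from_template(
    "Give me a small report about {topic}"
)
model = ChatAnthropic(model="claude-2.1")  # assumes ANTHROPIC_API_KEY is set
output_parser = StrOutputParser()

# the "classic" approach: wrap everything in an LLMChain
chain = LLMChain(prompt=prompt, llm=model, output_parser=output_parser)
print(chain.run(topic="artificial intelligence"))
```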

So let's run that and see what we get. It's running; we create our chain, call chain.run, and print the output, and we get this small report on AI. It looks pretty good. Now, how would we do that with the expression language?

Well, we use this pipe operator, and I'm going to go into detail on how it actually functions, because understanding how the pipe operator works lets us understand what is actually happening here, so we actually understand the abstraction rather than just blindly using it.

So we string things together: a prompt, followed by the model, followed by the output parser. Rather than putting them into an LLMChain or some other chain, we just string them together with this pipe operator. And for sure, if I look at this, it's simpler than the LLMChain version, right?

If you compare the two, I'd say it's also more flexible, because we can just string things together, but it's not as Pythonic as what we're used to. Whether that's a good or bad thing I'm undecided on; I really like the minimalist approach here.

It looks great, but it's maybe hard to understand: if you don't know this syntax, even if you know Python very well, it can be pretty confusing. Anyway, let's run that. We create our chain using the expression language syntax, and then, rather than calling run, we call invoke and pass a dictionary of input variables into it.
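Assuming the same prompt, model, and output_parser objects from the earlier sketch, the LCEL version is roughly:

```python
# the same three components, now strung together with the LCEL pipe operator
lcel_chain = prompt | model | output_parser

# LCEL chains are kicked off with .invoke(), passing a dict of input variables
out = lcel_chain.invoke({"topic": "artificial intelligence"})
print(out)
```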

We run this and it does the exact same thing; we get a very similar output to what we saw before, a little report again. Looks cool. So these two chains do the exact same thing, just with different syntax.

Now, I think when you see that syntax with the pipe operator for the first time, it's quite confusing; at least it was to me, and I think most people would be confused. But the way it works is pretty simple, or at least the idea behind how it works can be explained very easily.

Whatever is on the left of each pipe operator has its output passed to whatever is on the right of that pipe operator, and then the output from that gets passed into the next one. So it's literally piping things from the left of the pipe operators all the way through to the right.

That's all it's really doing. Now, how the pipe operator actually works is not necessarily complicated; it's probably a little bit hacky, in my opinion, but it's kind of interesting. When we apply the pipe operator to an object in Python, what it actually looks for within that object is its __or__ method.
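As a quick illustration of the underlying Python mechanism (plain Python, not LangChain code): writing a | b is evaluated as a.__or__(b).

```python
class Left:
    def __or__(self, other):
        # `Left() | x` is evaluated as `Left().__or__(x)`
        return f"piped into {other!r}"

print(Left() | "anything")  # -> piped into 'anything'
```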

Right, so I come down to here and we have this kind of confusing class called Runnable, but let's break it down a little. I'm going to define a class, and I'm still going to call it Runnable. When we initialize this class, we run the __init__ method you see here.

Within that, we pass a function, because the way we're going to implement this is by giving a function to the class and using the class to transform that function into something we can use the pipe operator on. So we want to save that function within our Runnable object.

The next thing you see is the part that makes the pipe operator work. When a pipe operator is applied to an object, Python looks for that object's __or__ method, and the __or__ method needs to accept another object, which we call other here.

The way you can think of func and other here is that func is what's on the left of our pipe and other is what's on the right of the pipe. So what we do is create this chained function, which is going to consume a set of arguments and keyword arguments.

You can see we call it chained_func, as we do there, with args and keyword arguments. The reason we set it up with *args and **kwargs is that we don't know the names of the parameters that are going to be passed into our function.

By doing this, those parameter names can vary, we can have more or fewer of them, and the chained function will still be able to handle them. Then we return other(...), so basically other is the function that consumes the output from our own function.

And again, that function takes those args and keyword arguments. From that, we return a Runnable, which is the runnable version of that chained function. Basically, by doing this, we're putting the ability to keep chaining into each of the functions that we pass through this chain.

So we can do multiple of these: we could have other two, other three, and so on. Now, the final thing we need is a method that allows us to call and kick off the chain, and I'm going to implement that with __call__.

We'll see that LangChain actually uses invoke, so rather than __call__ they have invoke there to start the chain, but I'm just going to use __call__ because I think it's simpler. So that is our Runnable class. We can run that, and I also have it here.
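Here's a minimal version of the Runnable class walked through above (LangChain's real runnables use invoke rather than __call__):

```python
class Runnable:
    def __init__(self, func):
        # store the function we want to make "pipeable"
        self.func = func

    def __or__(self, other):
        # called when we write `runnable | other`: build a new function that
        # runs self.func first, then feeds its output into `other`
        def chained_func(*args, **kwargs):
            # *args / **kwargs because we don't know the parameter names
            return other(self.func(*args, **kwargs))
        # wrap the combined function so it can be piped again
        return Runnable(chained_func)

    def __call__(self, *args, **kwargs):
        # start the chain (LangChain uses .invoke() for this instead)
        return self.func(*args, **kwargs)
```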

Maybe I'll just run this one. What we want to do is use this Runnable to wrap the functions we'd like to run with this pipe operator approach. To do that, we define two very simple functions: one adds five, and one multiplies by two.

Let's run those, and I'm going to wrap them with the Runnable object we've created. Then, using this approach, our chain takes add_five, and rather than using the pipe operator, I'm going to call the __or__ method directly.

Within that __or__ method, I'm going to pass our multiply_by_two runnable. So we have those, and then we can just call our chain: we pass three to it and get the value 16, which is great.

Three plus five gives us eight, and multiplying that by two gives sixteen, so it's correct; it's running in the right order. Now that we've seen the __or__ method in action, we can also use the syntax we saw before with the pipe operator. So let's try that.

And you can see that we now get the same result, which is pretty interesting. We can build our own pipe-operator functions this way, and this is what LangChain is doing. So when we see the LangChain Expression Language, this is what we're actually looking at, which is an interesting way of putting things together.
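Putting that together, the usage looks something like this:

```python
def add_five(x):
    return x + 5

def multiply_by_two(x):
    return x * 2

# wrap the plain functions so they support the pipe operator
add_five = Runnable(add_five)
multiply_by_two = Runnable(multiply_by_two)

# calling __or__ directly...
chain = add_five.__or__(multiply_by_two)
print(chain(3))  # (3 + 5) * 2 = 16

# ...is equivalent to using the pipe operator
chain = add_five | multiply_by_two
print(chain(3))  # 16
```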

Now that's how it works. Let's have a look at how we actually use the expression language itself. We've already seen that we can use the pipe operators; now let's put them together in an actual use case. I'm going to be using the Cohere embedding model.

You can also use OpenAI's embedding model if you prefer; it's up to you. To get the Cohere API key, I don't think there's a waitlist, so you should be able to jump straight in. You go to dashboard.cohere.com, then to API Keys.

From the API Keys page you can create either a trial key or a production key, and you just use that. So I'm going to add mine here, and I'm going to be using Cohere's newest embedding model, which is a very high-performance model.

I'm going to use that to create the two small document stores we have here. They're very small; this is just for an example. We have half the information in doc store A and half the information in doc store B.
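Roughly, the setup for those two doc stores looks like the sketch below. I'm assuming the docarray-backed DocArrayInMemorySearch vector store and Cohere's embed-english-v3.0 model here; the exact texts and model name in the notebook may differ.

```python
from langchain.embeddings import CohereEmbeddings
from langchain.vectorstores import DocArrayInMemorySearch

# assumes COHERE_API_KEY is set; the model name is my assumption
embedding = CohereEmbeddings(model="embed-english-v3.0")

# two tiny "doc stores", each holding half of the information
vecstore_a = DocArrayInMemorySearch.from_texts(
    ["James' birthday is the 7th of December"], embedding=embedding
)
vecstore_b = DocArrayInMemorySearch.from_texts(
    ["James was born in 1994"], embedding=embedding
)

retriever_a = vecstore_a.as_retriever()
retriever_b = vecstore_b.as_retriever()
```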

You'll see why soon, but for now we're going to use the first one, A. It contains information about me, specifically what day my birthday is; the other contains the year I was born. So let's try retrieving information from that doc store.

Then we'll feed that, alongside the original query, into a chain using the expression language. Now, when we do this, there's one important thing to be aware of: with just this pipe syntax and nothing else, we have one input and one output for each of these components.

So how does that work when we have a context that we need to use here and also a question that we need to feed into our prompt? The way we do that is by using this RunnableParallel object.

I've imported those here: RunnableParallel and RunnablePassthrough. RunnableParallel, which comes first, allows us to run multiple chains or components in parallel and collect multiple values from them. So here we're going to run retriever A, and for the question we're using this RunnablePassthrough item.

What RunnablePassthrough does is take whatever was input into the RunnableParallel object and simply return it; it's literally a pass-through for the values you pass in here. So let's run all of that. We have our retriever A here that we're using.
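A sketch of that chain, reusing the model and output_parser from earlier (the prompt wording here is my assumption):

```python
from langchain.schema.runnable import RunnableParallel, RunnablePassthrough

# a simple RAG-style prompt; the exact wording is an assumption
prompt = ChatPromptTemplate.from_template(
    """Answer the question below using the context:

Context: {context}

Question: {question}

Answer: """
)

# run the retriever and the pass-through in parallel to build the
# two variables the prompt needs
retrieval = RunnableParallel(
    {"context": retriever_a, "question": RunnablePassthrough()}
)

chain = retrieval | prompt | model | output_parser
print(chain.invoke("When was James born?"))
```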

We have our prompt template, and so on. The retrieval happens first: we have a query, "When was I born?", and we invoke the chain with that. The value gets passed into our retriever, which does a search and fetches the context, and it's also passed straight through to our prompt.

Then our prompt gets formatted with the question, "When was James born?", and with the context, the records retrieved from doc store A, which contain my birthday, the actual date. What we get back is: "Unfortunately, I do not have enough context to definitively state when James was born."

And it tells me what it found: this little bit of information, so it knows what day my birthday is, but the context doesn't specify the year I was born. So it can't fully answer the question, but we can see that the chain is working.

It's going through the retrieval component, then our prompt, model, and output parser; it's going through everything. Now, the cool thing about RunnableParallel, as you might have guessed from what we have here, is that it can run many things in parallel, not just a retriever and a passed-through question.

We can actually run multiple retrievers in parallel, or multiple different components at the same time. That's one of the very cool things about the expression language: we just set these things up in RunnableParallel and it handles running them in parallel.

It runs those in parallel; we don't have to build or write any of that code ourselves, which I think is pretty cool. So let's come down here. Now that we're going to be retrieving information from two places, I'm going to create a context A and a context B.

We're going to initialize the prompt, then our RunnableParallel, which we now need to modify a little. We have retriever A, which we're now mapping to context_a, and retriever B, which we're mapping to context_b.

Then, as before, we have our question, which is the RunnablePassthrough. The chain itself is exactly the same; we still just have one retrieval component there, because both retrievers are being run in parallel within that abstraction. So we're going to run that.
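The modified version with both retrievers might look like this (again, the prompt wording is my assumption):

```python
prompt = ChatPromptTemplate.from_template(
    """Answer the question below using the context:

Context:
{context_a}
{context_b}

Question: {question}

Answer: """
)

# both retrievers run in parallel inside the RunnableParallel
retrieval = RunnableParallel(
    {
        "context_a": retriever_a,
        "context_b": retriever_b,
        "question": RunnablePassthrough(),
    }
)

chain = retrieval | prompt | model | output_parser
print(chain.invoke("When was James born?"))
```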

And now I'm going to ask the same question: when was I born? Now it knows: based on the context provided, James was born in 1994. That comes from the second document, whose page content is "James was born in 1994". And maybe I want it to give me the exact date as well.

So I ask, "What date exactly was James born?", and we get something odd: it says, "Unfortunately, the given context does not provide definitive information to answer the question of what day exactly James was born," but then it actually gives us the answer anyway. So there's clearly a little bit of a lack of reasoning ability with Claude in this case.

So my birthday is the 7th of December and I was born in 1994. It's kind of surprising to me that it didn't put that together, but interesting. At least we can see that our chain is working correctly and pulling in information from both of our retrievers, which is cool.

And we're almost done with what I think are the essentials of the expression language. There's just one more thing that I think is super important: it's basically LangChain's abstraction for what I showed you earlier, where we created our own Runnable class and fed functions into it to create objects we can run with the pipe operator.

To do that in LangChain, they have runnable lambdas. This is why, earlier on, I called our class Runnable: because here they call them RunnableLambda. So we have our add_five and our multiply_by_two; I'm just going to come up here and show you what we had earlier.

So we have these two functions; let's take those. And we can see the Runnable wrapper we were using before so that we could chain them. Let's do it again here: we have our add_five and our multiply_by_two, and we run this. This time we're creating runnables, but we're doing it through LangChain.

So our chain is going to be add_five then multiply_by_two, as we did before. And as I mentioned, with LangChain we have to use invoke rather than calling the object directly. We run that and, as before, we get 16. So we can wrap our own functions using LangChain's RunnableLambda.
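In code, that's roughly:

```python
from langchain.schema.runnable import RunnableLambda

def add_five(x):
    return x + 5

def multiply_by_two(x):
    return x * 2

# LangChain's equivalent of the Runnable wrapper we built earlier
add_five = RunnableLambda(add_five)
multiply_by_two = RunnableLambda(multiply_by_two)

chain = add_five | multiply_by_two
print(chain.invoke(3))  # (3 + 5) * 2 = 16
```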

Now, when would we use that? There are definitely different scenarios where you might want to, but let me show you something here that bothers me a little bit. It's a good example of where we might want to use this, or where we'd probably want to adjust the output parser instead.

So let's run both of these. What we see when we run them is some leading whitespace that we could do with removing, but each answer also starts with "Here's a short fact about artificial intelligence:" followed by two newline characters.

Maybe I don't want that; I just want it to get straight to the fact. So what I can do is use this RunnableLambda abstraction to handle that. I'm going to define a function which looks for a double newline within the string.

If it's there, we split on double newlines and take everything that occurs after the first one. In case there happen to be multiple double newlines, we take everything from index one to the end of the list that we get back from the split.

Then we join everything back together, so we're basically just dropping that first part. Let's run that. I'm going to wrap the function in a RunnableLambda, pull all those things together, and add the get_fact runnable to the end of my chain.
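A sketch of that cleanup step, reusing the model and output_parser from earlier (the "short fact" prompt wording is my assumption):

```python
fact_prompt = ChatPromptTemplate.from_template(
    "Tell me a short fact about {topic}"  # prompt wording is an assumption
)

def extract_fact(x: str) -> str:
    # if the output contains a double newline, drop everything before the
    # first one (e.g. the "Here's a short fact about ...:" preamble)
    if "\n\n" in x:
        return "\n\n".join(x.split("\n\n")[1:])
    return x

get_fact = RunnableLambda(extract_fact)

# append the cleanup runnable to the end of the chain
chain = fact_prompt | model | output_parser | get_fact
print(chain.invoke({"topic": "artificial intelligence"}).strip())
```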

Now let's invoke again and see what we get. There's no weird starting text here, and we can see with both of those outputs that it works; our little runnable lambda worked well. So that's really everything I wanted to cover with the expression language.

There are other things we could talk about and more to cover, but this is pretty much everything you need to get started and to understand what the abstraction is actually doing. Like I said, that's important, because then we at least know what we're doing rather than just putting in pipe operators and assuming they'll work when maybe we're doing something that doesn't make sense.

So I hope this has been useful for understanding the expression language. There are pros and cons to using it. On the pros side, there's the minimalist style of the code, which is nice and very clean, and the out-of-the-box support for features like streaming and the parallel execution we saw. But there are also some cons, and there are plenty of people who are less fond of the expression language; it's a big change.

That's to be expected. The things people point to when they say it doesn't make sense are that it makes things more abstract: LangChain is already built on abstractions, so we're adding another abstraction on top of those. And the syntax is definitely not common Python syntax, which arguably goes against the Zen of Python, the idea that we shouldn't make special cases for things.

And of course it's a new syntax, so especially when you first look at it, it's confusing. Once you've explored it a little, I think it makes sense, but when you're first getting started it's definitely confusing. In my honest opinion, both of those viewpoints are entirely valid.

There are pros and cons for sure, but I like it. I think it's definitely worth learning and experimenting with, and it can definitely speed things up, particularly when you're prototyping. For production code, it's going to depend on what you're trying to do.

Anyway, that's it for this video. I hope it's been useful for understanding the expression language. Thank you very much for watching, and I will see you again in the next one. Bye.