
Chat with OpenAI in LangChain - #5


Chapters

0:00 LangChain's new Chat modules
2:09 New LangChain chat in Python
3:14 Using LangChain's ChatOpenAI object
4:36 Chat messages in LangChain
6:43 New chat prompt templates
9:05 LangChain human message prompt template
13:18 Using multiple chat prompt templates
17:42 F-strings vs. LangChain prompt templates
19:23 Where to use LangChain chat features?

Transcript

With the introduction of OpenAI's new ChatGPT endpoint, the LangChain library has, unsurprisingly, very quickly added a ton of new support for chat. The reason for this is that, unlike previous large language model endpoints, the new ChatGPT endpoint is slightly different: it takes multiple inputs. And therefore, in LangChain, this new approach to calling large language models is supported with its own set of objects and functions.

So the new ChatCompletion endpoint from OpenAI differs from typical large language model endpoints in that you can essentially pass in three types of inputs, distinguished by three different role types: system, user, and assistant. The system message acts as the initial prompt to the model, setting up its behavior for the rest of the interaction.

So for example, with ChatGPT, before we even write anything, OpenAI has already passed in a system message to ChatGPT, telling it how to behave. Then after that, we have the user messages. User messages are what we write, okay? So in ChatGPT, what we type is a user message.

And then the other one is the assistant message. Those are the responses that we get from ChatGPT, okay? So the assistant is what ChatGPT is producing. Now, when we use the endpoint, for every new interaction, we're feeding in a history of previous interactions as well. So we're always gonna have that system message at the top.
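Concretely, a raw call to the endpoint might look something like this. This is a sketch assuming the openai 0.x SDK that was current when this video was made, with illustrative message contents:

```python
# A sketch of a raw ChatCompletion call showing the three role types.
import openai

res = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hi AI, how are you?"},
        {"role": "assistant", "content": "I'm good, how can I help?"},
        {"role": "user", "content": "What is quantum physics?"},
    ],
)
print(res["choices"][0]["message"]["content"])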

We're going to have a user message followed by an assistant message, followed by another user message, and so on. So there is some difference with this new endpoint, and therefore how we interact with ChatGPT via LangChain is also different. So let's just jump straight into it. Okay, so we get started with a pip install.

Here we're installing LangChain and OpenAI; these are the only two libraries we use for this. This gets us the latest versions of OpenAI and LangChain, so you do need to update if you haven't done so very recently. Once those have been installed or updated, what we'll do is start by initializing the ChatOpenAI object.

For that, we do need an OpenAI API key. So you can click this link; there will be a link to this notebook at the top of the video somewhere right now, so you can follow along. This will take us across to this page here, which is platform.openai.com.

What we do is go to "View API keys", create a new secret key, and then copy that secret key. Then you run this cell, and you can see at the top here it asks for the OpenAI API key. If you're on Colab, the prompt will appear just below the cell.
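The key cell looks something like this; a sketch assuming getpass is used to hide the pasted key, as notebook walkthroughs typically do:

```python
# A sketch of the API key cell; getpass hides the key as you paste it in.
import os
from getpass import getpass

os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")
```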

You just paste your API key into there, and that stores the API key. And then we come down here, and what we're going to do is initialize the ChatOpenAI object. For this, we're going to be using the ChatGPT model, gpt-3.5-turbo. By using this name, we're essentially going to default to the latest version of ChatGPT.

So right now, the latest version is actually this specific snapshot here. So if you want to follow this video and get the exact same responses in the future, you need to pin that version. But I will leave it like this: as they release new versions of this model, it will just default to the latest one.
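As a minimal sketch of that initialization, assuming an early-2023 version of LangChain where the chat models lived under langchain.chat_models:

```python
# A sketch of the initialization. Pinning model_name to a dated snapshot
# (e.g. "gpt-3.5-turbo-0301", my assumption for the "March model" mentioned
# later) keeps responses reproducible over time.
from langchain.chat_models import ChatOpenAI

chat = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")
```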

Now, setting temperature to zero makes the completions fully deterministic, as far as I can tell, so running the same prompt twice, you'll get the same output. Now, we've seen this already: chats with ChatGPT are structured like this. We have system, user, assistant, user, assistant.

That final empty assistant prompt is essentially telling the model: now it's your turn to respond. So the model is just completing the end of this conversation. And the way that we format that is like this, okay? LangChain mirrors this format; it's very similar, but slightly different.

So we have a system message object, a human message object, and an AI message object. To recreate the conversation up here, we would write this: a messages variable, which is just a list of these objects, in the order that they were passed in the conversation.

Okay, so we're just swapping user for human message and assistant for AI message; the system message is still a system message. And let's run this. So this is going to generate a response from the ChatGPT model, right? And I get this: we have an AI message, and it's pretty long.
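Here's a sketch of that, assuming the langchain.schema message classes from early-2023 versions of the library; the exact wording of the messages is illustrative:

```python
# A sketch of the message list and the first call to the chat model.
from langchain.schema import SystemMessage, HumanMessage, AIMessage

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Hi AI, how are you? I'd like to learn about string theory."),
]
res = chat(messages)  # returns an AIMessage
print(res.content)
```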

So what we can do is just print it out, and we get this. It's still pretty long, but we can scroll along like so. All right, cool. Now, if we take a look up here at the initial response, before printing out the response content, we come to the start and can see that it's an AI message.

So it's the same type of object as this here. So that means that we can actually just append this AI message, our response, directly to messages here, and that will create the full conversation, including the latest response, all right? So that's what we're doing here. And then from there, we can just continue the conversation.

So we will create a new human message prompt, add that to our messages, and then send all of those to ChatGPT. Okay, so the next question I asked was: why do physicists believe it can produce a "unified theory"? This is referring to the string theory discussion up here.
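A sketch of that continuation, reusing the objects from above:

```python
# Append the AIMessage response, add a follow-up question, and call again.
messages.append(res)
messages.append(HumanMessage(
    content="Why do physicists believe it can produce a 'unified theory'?"
))
res = chat(messages)
print(res.content)
```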

And then it goes in and starts explaining why they believe that string theory has the potential to produce a unified theory, and so on. Okay, cool. Now, that is, I suppose, the core functionality of LangChain's new chat features, but there are a few other things that they've introduced alongside these.

So we have a few new prompt templates: an AI message, human message, and system message prompt template. These are essentially an extension of the original prompt templates in LangChain, but when you use them, you have a couple of functions that allow you to create your prompt template and output it as a system message, AI message, or human (user) message.

And you can also link them all together to create a list of messages that you then pass straight into your chat endpoint. Now, I'm not aware of a huge number of reasons to use these right now, but they are part of the new features in LangChain for chat.

So I figure it is important to share these, and if it seems like something that would actually help with whatever you're building, then that's great: you will know how to use them. So we'll come down to here. What I'm doing is making sure I'm using the March model here, i.e. the pinned snapshot from earlier.

So we're going to set up our first system message, and we're going to create a human message, our first input. Now, within this system message, I'm saying I want the responses to be no more than 100 characters long, including white space, and I want it to sign off every message with a random name like "Robot McRobot" or "Bot Rob", okay?
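A sketch of that setup; the sign-off names are as I heard them, so treat the exact wording as an approximation:

```python
# A sketch of the constrained system message and the first input.
messages = [
    SystemMessage(content=(
        "You are a helpful assistant. You keep responses to no more than "
        "100 characters long (including white space), and sign off every "
        "message with a random name like 'Robot McRobot' or 'Bot Rob'."
    )),
    HumanMessage(content="Hi AI, how are you? What is quantum physics?"),
]
res = chat(messages)
print(f"Length: {len(res.content)}\n{res.content}")
```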

We're just giving it tasks to see how well it follows these instructions. So run this, we make our first completion, and let's see how it does. Okay, so the length is way out: we asked for 100 characters at maximum, and it's 154.

And it also didn't give us a sign-off there. Now, this is just an issue with the current version of ChatGPT: it's not very good at following system message instructions, apparently. It's better to pass these instructions in via your human message.

But we might not want a user to have to specify these things. So maybe this is where we can use one of these prompt templates. So let's try. What we're gonna do is for every human message, we're gonna pass it into here, right? So we had that question before.

"Hi AI, how are you? What is quantum physics?" We'd pass that in as the input here. And what I'm going to do is, after the question, add: "Can you keep the response to no more than 100 characters, including white space, and sign off with a random name?" So we create our prompt like this.

So from LangChain's prompts.chat module we have HumanMessagePromptTemplate, and we also need to use this ChatPromptTemplate. I feel like this is a little bit convoluted at the moment, but this is just how it is, so we're going to go through it anyway. So we call HumanMessagePromptTemplate.from_template, and we're going to have this, okay?

This is just like a typical prompt template in LangChain. Then, once we have that human template, we need to pass it to ChatPromptTemplate.from_messages, and in there we pass a list of whatever messages we want. I will give you another example soon, but we can also pass multiple messages here, like system message, human message, AI message, and so on, and I did find some ways of using that.
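A sketch of that template setup, assuming the langchain.prompts.chat module from early-2023 versions:

```python
# A sketch of the human message template wrapped in a chat prompt template.
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
)

human_template = HumanMessagePromptTemplate.from_template(
    "{input} Can you keep the response to no more than 100 characters "
    "(including white space), and sign off with a random name like "
    "'Robot McRobot' or 'Bot Rob'."
)
chat_prompt = ChatPromptTemplate.from_messages([human_template])
```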

So, I mean, I think that's kind of interesting, at least. Now we format that with some input. We pass in the input "Hi AI, how are you? What is quantum physics?", and let's see what we get from that. So we get this ChatPromptValue object, and it has a list of messages in there.

The first message, which is also the only message, is "Hi AI, how are you? What is quantum physics?", right? So that's our input. And then we have "Can you keep the response to no more than 100 characters, including white space, and sign off...", and so on. So that is our template being applied on top of the input.
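That formatting step, as a sketch:

```python
# Formatting the chat prompt template with an input produces a ChatPromptValue.
chat_prompt_value = chat_prompt.format_prompt(
    input="Hi AI, how are you? What is quantum physics?"
)
print(chat_prompt_value)  # ChatPromptValue(messages=[HumanMessage(...)])
```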

All right, cool. Now, we come down to here: to use our human message prompt template as a typical human message, we actually need this step here. We take our chat prompt value, which we created above, and we can either convert it to messages, which gives us the format we need in order to pass it to ChatGPT, or we can just create a string out of it, okay?

So the string version would, I suppose, be pretty much the same as using an f-string. The only thing added on there is this "Human" prefix; otherwise, it's literally just taking the message and converting it into a string.
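The two conversions look like this, as a sketch:

```python
# The two ways to consume a ChatPromptValue.
chat_prompt_value.to_messages()  # [HumanMessage(...)]: the chat model format
chat_prompt_value.to_string()    # "Human: Hi AI, ...": one plain string
```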

Okay, so let's see if this approach works; here, I'm just throwing it all together. So we have the chat prompt and the input, "Hi AI, how are you? What is quantum physics?". That's going to create the chat prompt value; then I convert it to messages and take the first message, which is the only message in there, and that essentially gives us this human message with "can you keep the response to no more than 100 characters" appended. And then here I had put 60 characters, so maybe I just put 100 here, and we'll try a lower limit later as well. So let's run that.
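End to end, that looks something like this; a sketch keeping the earlier system message in place, since the idea is to test both together:

```python
# A sketch of the full round trip: system message plus templated human message.
messages = [
    SystemMessage(content=(
        "You are a helpful assistant. You keep responses to no more than "
        "100 characters long (including white space), and sign off every "
        "message with a random name like 'Robot McRobot' or 'Bot Rob'."
    )),
    chat_prompt.format_prompt(
        input="Hi AI, how are you? What is quantum physics?"
    ).to_messages()[0],
]
res = chat(messages)
print(f"Length: {len(res.content)}\n{res.content}")
```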

All right, so you can see now it's listening. When we said 100 characters in the system message alone, it didn't really work, but now that we've also added it into this user or human message here, it's sticking to that, right?

So the length is good, let's keep going, and it has also been signed off with one of the names, as instructed. So by adding those instructions into the user message, we're getting better results. Okay, cool. In my last attempt, I actually got slightly over the character limit, so we can run this again. And remember, we've set temperature to zero here, and because of that, we would expect the output to be the same every single time.

So it's deterministic: the response about quantum physics being the physics of the very small scale comes out the same every time. Okay, cool. Then let's continue with this. I want to show you that we can use this prompt templating method to build an initial set of messages that we can use as examples, like few-shot training for our chat model.

So here we've done 100 characters, right? Maybe we can go even lower, but in that case, we might need to give some examples to the system. So let's do that. We're going to have a character limit and a sign-off as input variables.

For the human message, we're just going to pass in the input there. For this first one, we're not going to pass in those instructions, because we're actually going to create this human message, and also the following AI message, as an example to the chatbot of how it should respond.

Okay, and we put all of these together. So we have the system template, the human template, and the AI template; note that we're using AIMessagePromptTemplate, HumanMessagePromptTemplate, and SystemMessagePromptTemplate for each of those. And what we do is create a list of messages.

So it goes, obviously, the system message first, the human message second, and the AI message third. And these are the templates, right? What we then do is take our chat prompt, which is a list of these, and format that prompt with our inputs. So we have the character limit, which we're going to set to 50, half of what we had before, making it harder.

I'm going to say the sign-off has to be "Robot McRobot", and the input is going to be the same as before. And then we're giving an example response: "Good! It's the physics of small things." That example response is going to automatically have the sign-off added to it.
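A sketch of that few-shot setup; the exact template strings are my reconstruction from the narration:

```python
# A sketch of the few-shot prompt templates: system, human, then AI example.
from langchain.prompts.chat import (
    AIMessagePromptTemplate,
    SystemMessagePromptTemplate,
)

system_template = SystemMessagePromptTemplate.from_template(
    "You are a helpful assistant. You keep responses to no more than "
    "{character_limit} characters long (including white space), and sign "
    "off every message with '- {sign_off}'."
)
human_template = HumanMessagePromptTemplate.from_template("{input}")
ai_template = AIMessagePromptTemplate.from_template("{response} - {sign_off}")

# System first, human second, AI example third.
chat_prompt = ChatPromptTemplate.from_messages(
    [system_template, human_template, ai_template]
)
chat_prompt_value = chat_prompt.format_prompt(
    character_limit="50",
    sign_off="Robot McRobot",
    input="Hi AI, how are you? What is quantum physics?",
    response="Good! It's the physics of small things.",
)
messages = chat_prompt_value.to_messages()
```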

All right, so let's run this and see what we get. System message: you are a helpful assistant, you keep responses to no more than 50 characters long, you sign off every message with "Robot McRobot". So we can see where those variables are being inserted. Human message: "Hi AI, what is quantum physics?"

That's because we're just passing the input straight in there. And then we have the AI message: "Good! It's the physics of small things. - Robot McRobot." A very short answer. And let's just see if that helps the system produce very short answers. So we run this, and we get "atoms, electrons, photons", and then it does the sign-off.

So I think that's a pretty good response. Let's try again. Right, so here we go slightly over: we're about four characters over the limit there. So maybe we can be more strict again. What we can do is add in that template we used before, where we say to answer in less than the character limit, including white space.

Okay, we're going to add that to our human message. So we're going to create the human message like this, with the ChatPromptTemplate and so on. Okay, cool. "Is it like particle physics?" That's what I asked before, yeah. So we're asking the same question, but adding that instruction onto the end.

So it becomes "Is it like particle physics? Answer in less than 50 characters, including white space." Then, within the messages right now, we have the query that we created before, and we need to replace that query with our new modified query. So I'm going to remove the most recent message in messages, and I'm going to send it with this new human prompt value, the new version with those instructions added to the end.
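A sketch of that swap, reusing the template classes from above:

```python
# A sketch of replacing the last query with a modified version of it.
human_template = HumanMessagePromptTemplate.from_template(
    "{input} Answer in less than {character_limit} characters "
    "(including white space)."
)
human_prompt = ChatPromptTemplate.from_messages([human_template])
human_prompt_value = human_prompt.format_prompt(
    input="Is it like particle physics?",
    character_limit="50",
)
messages.pop(-1)                                   # drop the unmodified query
messages.extend(human_prompt_value.to_messages())  # add the modified version
```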

So let's have a look and make sure we have the right format: system, human, AI, human, AI. That last AI message is the last correct response we got from the model, and now we have the new modified human message at the end: "Is it like particle physics? Answer in less than 50 characters." And now we pass that through our chat model again, and we get something way shorter.

So 28 characters, something like "yes, similar", because we're telling it again in the most recent query that it needs to answer in less than 50 characters. All right, so as I mentioned before, maybe this is all a little bit convoluted, and that's not to say that there aren't use cases for it.

It's just that it would be unfair of me to show you all of this, say "this is how you use it", and then miss something that could make things much easier in most use cases, or at least in simpler ones.

All right, so I would say it's arguable whether all of the above is any simpler than using an f-string. So we have this input, "Is it like particle physics?", which is our most recent question, and we can just use an f-string, right?

So we have the f-string here: we take the human message content and just append "answer in less than {character_limit} characters, including white space", with the character limit set here. And the result of that is basically the same. Look, we have this, right? It's the same as all of that template code up above.
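As a sketch, the f-string version that produces the same message:

```python
# The f-string equivalent; it builds the same HumanMessage as the template.
_input = "Is it like particle physics?"
character_limit = 50

human_message = HumanMessage(content=(
    f"{_input} Answer in less than {character_limit} characters "
    "(including white space)."
))
messages.pop(-1)                # remove the template-built message
messages.append(human_message)  # add the f-string-built one
```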

So the one f-string line replaces all of that code, plus this part here. It depends, I don't know, on your use case, what you're doing, and how you prefer to write this; just be aware that you can also do it this way and get the same result. So now we pop the last message to remove the one we created using the prompt template, add the one we created using the f-string approach, and we get this: it's the same thing.

There's no difference there; we can process it through ChatGPT again and we'll get the same response. Okay, so I just wanted to make you aware of that. But yeah, that's it for this video. We've covered, I think, the vast majority of the new chat features within LangChain, and naturally, as we saw at the end there, we don't need to use all of them.

The prompt templates you can of course use if you have a reason to, but they aren't needed if you have a simpler approach to doing these things. But yeah, it's cool to see this being implemented in LangChain, and although I haven't been through it yet, I'm hoping that there will be good integrations of these new chat features with their conversational memory, their retrieval augmentation, and everything else within LangChain, because that's where the value of this sort of thing will come in.

Right now, it's kind of a simple wrapper on top of OpenAI's ChatCompletion endpoint, but hopefully, with all of the agents, conversational memory, and retrieval augmentation components that LangChain offers, we'll get a tight integration between those, and that's where this will be useful. So, that's it for this video.

I hope all of this has been useful and interesting. But for now, thank you very much for watching, and I will see you again in the next one. Bye.