Prompt Templates for GPT 3.5 and other LLMs - LangChain #2
Chapters
0:00 Why prompts are important
2:42 Structure of prompts
4:10 LangChain code setup
5:56 LangChain's PromptTemplates
8:34 Few-shot learning with LLMs
13:04 Few-shot prompt templates in LangChain
16:09 Length-based example selectors
21:19 Other LangChain example selectors
22:12 Final notes on prompts + LangChain
Today we're taking another look at the LangChain library 00:00:02.840 |
and we're gonna be focusing on what are called 00:00:05.040 |
prompt templates, which are a very core component 00:00:07.960 |
of the library, and this mirrors the importance of prompts themselves. 00:00:14.560 |
Now, although prompts maybe don't seem that interesting, 00:00:22.280 |
they matter a great deal, particularly in today's world of large language models. 00:00:28.960 |
When we consider different tasks within language, 00:00:34.720 |
like named entity recognition or question answering, 00:01:02.280 |
things have shifted with the more recent adoption of large language models. 00:01:13.200 |
So now, rather than actually using different layers for each task, 00:01:22.600 |
as we may have done before with transformer models, 00:01:32.600 |
question answering, summarization, even translation 00:01:39.200 |
can all be handled by one model. The only thing that actually changes now is the prompt, 00:01:51.360 |
so the models themselves no longer need changing; 00:01:54.200 |
it's actually just the inputs to those models that change 00:01:57.960 |
in order for us to modify the task that we are performing. 00:02:04.360 |
it turns out that prompts are the most important thing to get right. 00:02:17.480 |
In the end, they are one of the key components for us to learn. 00:02:24.720 |
The LangChain library recognizes the importance of prompts, 00:02:33.120 |
and it has built tooling around them. In this video, that's what we're going to talk about, 00:02:39.440 |
along with features that are kind of parallel to prompt templates. 00:02:41.960 |
So we're going to get started with just taking a look 00:02:44.800 |
at this simple prompt here and we're going to break it down. 00:02:47.800 |
So the top here, we have the instructions of the prompt. 00:02:51.000 |
Here, we have context or external information. 00:02:58.560 |
Now, each one of these components serves a purpose. 00:03:03.480 |
What we would usually put into a large language model, 00:03:06.560 |
or what a user would put into a large language model, 00:03:11.160 |
is actually just this little bit, the question "which libraries and model providers..." 00:03:14.720 |
That's all we're actually expecting our users to input. 00:03:19.120 |
And what we're actually doing here is considering 00:03:22.000 |
that that's the only thing our user is going to be inputting. 00:03:24.840 |
We're actually providing all this other information 00:03:26.720 |
in order to kind of guide the large language model 00:03:33.040 |
towards the type of answer that we think our user would like to the question. 00:03:36.320 |
In this case, we're doing what I would call factual Q&A, 00:03:41.640 |
where the instructions say to answer the question based on the context below. 00:03:52.000 |
So we basically don't want to answer the question 00:03:54.480 |
if we can't find the information behind that answer. 00:04:01.120 |
That matters because large language models have this very strong tendency to make things up. 00:04:12.160 |
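Putting those pieces together, the prompt being described looks roughly like this. The wording is a sketch reconstructed from the description, and the question shown is a guess at the truncated one above, not the video's verbatim text:

```python
# A sketch of the prompt structure: instructions, context (external
# information), the user's question, and an output indicator.
# The context and question wording here are illustrative.
prompt = """Answer the question based on the context below. If the
question cannot be answered using the information provided, answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in
NLP, and their strong performance has made them very useful for
developers building NLP applications.

Question: Which libraries and model providers offer LLMs?

Answer: """
```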
There are a few things that we need to install here. 00:04:17.400 |
Of course, you can do this with other providers as well: 00:04:22.000 |
you can use Cohere, you can use Hugging Face. 00:04:31.880 |
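For the OpenAI route used here, the setup would look something like this. The package names are assumed from context, and the classic imports used throughout these sketches assume an early, pre-0.1 version of LangChain:

```python
# Install the two libraries used in this walkthrough (notebook syntax).
!pip install langchain openai
```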
This is exactly the same as what I showed you before: 00:04:38.360 |
instructions, context, question, output indicator. 00:04:44.280 |
The first thing we want to do is actually initialize a model. 00:05:34.240 |
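Initializing the model might look like this; the model name is an assumption, not confirmed by the transcript:

```python
from langchain.llms import OpenAI

# Initialize the OpenAI completion model. The model name is
# illustrative, and a valid key must be set in OPENAI_API_KEY.
openai = OpenAI(model_name="text-davinci-003")
```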
We wouldn't typically want to write all of this out by hand. 00:05:42.680 |
The user should be inputting whatever is here, the question. 00:05:46.480 |
And as well as that, we have these contexts here, 00:05:52.680 |
so we also wouldn't hard-code those into our code either. 00:05:56.240 |
So LangChain has something called prompt templates, 00:06:02.440 |
Now, for now, I'm just going to keep the context in there. 00:06:05.160 |
But what I am going to do is replace the query with a placeholder. 00:06:10.120 |
It just kind of looks like an f-string in Python; it isn't one, 00:06:21.320 |
but it will be interpreted like that by the prompt template. 00:06:28.840 |
So we mark where we would expect the query to go with {query}. 00:06:32.640 |
And we need to make sure that within the input variables we include that same name, query. 00:06:45.600 |
And then after that, we just have our template. 00:06:52.640 |
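A minimal sketch of that template, using the classic top-level import from early LangChain versions (the context text is illustrative):

```python
from langchain import PromptTemplate

# The template reads like an f-string, with {query} marking where
# the user's question will be slotted in. The context is still
# hard-coded for now, as in the video.
template = """Answer the question based on the context below. If the
question cannot be answered using the information provided, answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.

Question: {query}

Answer: """

prompt_template = PromptTemplate(
    input_variables=["query"],
    template=template,
)
```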
Now, if we would like, we can just insert a query here, 00:07:08.840 |
like "what is a large language model?" or something, 00:07:14.400 |
and we can see that we now have the same text with the query slotted in. 00:07:25.200 |
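For example, formatting the template produces the full prompt text:

```python
# Fill in the template and inspect the resulting prompt.
print(prompt_template.format(query="What is a large language model?"))
```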
we don't actually have an external knowledge base set up, 00:07:35.040 |
So what I'm going to do is take the first example 00:07:41.440 |
and we're going to feed it into this OpenAI object here. 00:07:44.520 |
This here is actually our large language model 00:07:54.680 |
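Passing the formatted prompt to that object generates a completion. The query here is the illustrative one guessed at above:

```python
# Generate a completion by calling the LLM on the formatted prompt.
print(openai(prompt_template.format(
    query="Which libraries and model providers offer LLMs?"
)))
```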
Okay, so we basically, within these few lines of code, 00:07:59.920 |
we kind of just replicated what we did up here, 00:08:05.720 |
So let's come back down, and what I'm going to do is explain why we'd use prompt templates at all, 00:08:12.960 |
because honestly, right now, it seems kind of pointless. 00:08:16.480 |
For example, we could just put this as an f-string and format it ourselves. 00:08:25.920 |
Well, one, it's just nice, it's easy, it's clean, and there's also more logic built on top. 00:08:32.520 |
So if we come down, we can also do something like this. 00:08:36.480 |
So this is called a few-shot prompt template. 00:09:01.160 |
The idea is to show the model a few examples, essentially training it on those few examples 00:09:09.800 |
within the prompt itself. Now, the approach to few-shot learning can vary. 00:09:14.200 |
Traditionally, it would be that you're feeding in a few items to the model and training it on them. 00:09:22.680 |
In this case, we're actually feeding these examples in through the prompt at inference time, 00:09:30.760 |
but it makes sense, because with large language models there are two types of knowledge. 00:09:45.320 |
The first, parametric knowledge, is everything that the large language model has learned during training. 00:09:52.920 |
So something like, who was the first man on the moon? 00:09:56.160 |
The model is going to be able to answer Neil Armstrong 00:09:59.040 |
because it's already learned that information 00:10:00.840 |
during training, and it's managed to store that information within its model weights. 00:10:05.280 |
The other type of knowledge, source knowledge, is different. 00:10:08.200 |
That is where you're actually feeding the knowledge 00:10:11.840 |
into the model at inference time via the model input, 00:10:20.720 |
the idea behind LangChain's few-shot prompt template object 00:10:25.240 |
is to provide few-shot learning via the source knowledge, 00:10:31.080 |
And to do this, we just add a few examples to the prompt. 00:10:39.040 |
So you remember earlier on, we had the instructions, context, question, and output indicator. 00:10:46.320 |
In this case, it would be like we have instructions, then some examples, then the user's query. 00:10:53.240 |
Now, let's take a look at where we might want to use this. 00:10:56.720 |
Now, in this prompt, I'm saying the following is a conversation with an AI assistant, 00:11:04.080 |
one producing creative and amusing responses to the questions. 00:11:09.320 |
Actually, we're not doing that yet, so let's remove that. 00:11:16.920 |
and then we have what would be the user's query. 00:11:16.920 |
I've also turned up the temperature, which just increases the randomness of the output, 00:11:23.120 |
And we get the meaning of life is whatever you make of it. 00:11:39.240 |
So it's not really doing what I want it to do. 00:11:41.800 |
So what we can do here is do few-shot learning. 00:11:49.600 |
I've just added "here are some examples" onto the end there, 00:11:52.960 |
and then I'm just adding a couple of examples. 00:11:56.200 |
So kind of like sarcastic responses to our user's questions. 00:12:05.480 |
And then I'm going to ask the same question again 00:12:07.320 |
at the end, and then we'll see what the model outputs. 00:12:23.160 |
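The hand-built version might look roughly like this; the prompt wording and the example Q&A pairs are illustrative reconstructions, not verbatim from the video:

```python
# A few-shot prompt built by hand: instructions, a few examples,
# then the real query at the end.
prompt = """The following is a conversation with an AI assistant.
The assistant is typically sarcastic and witty, producing creative
and amusing responses to the users' questions. Here are some examples:

User: How are you?
AI: I can't complain, but sometimes I still do.

User: What time is it?
AI: It's time to get a watch.

User: What is the meaning of life?
AI: """

print(openai(prompt))
```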
We can tighten this up: we can say the assistant is always sarcastic and witty. 00:12:28.160 |
And the "here are some examples" bit, we can cut that out. 00:12:32.280 |
And that might help us produce more precise answers. 00:12:42.040 |
Now we get answers along the lines of "you need to ask someone who's actually living it", 00:12:48.120 |
or, okay, "somewhere between 42 and a double cheeseburger". 00:12:58.200 |
I think we should have gone with this prompt from the start. 00:13:14.280 |
The examples: each one of them is going to have a query and an answer, 00:13:24.680 |
and then we create what is called an example template. 00:13:28.680 |
Same thing as before, it looks like an f-string, 00:13:37.200 |
and what we do is we actually create a new prompt template from it. 00:13:43.840 |
Okay, so we're creating like an example prompt. 00:13:47.960 |
So it's going to take in a query and an answer this time. 00:13:52.000 |
Then we need to break apart our previous prompt into a prefix and a suffix. 00:13:58.880 |
Okay, so the prefix carries the instructions and the "here are some examples" line. 00:14:03.640 |
We're going to use the same one as we used before. 00:14:13.160 |
We have the query itself that the user is going to put in, 00:14:22.480 |
We have our examples, which is this list up here. 00:14:33.760 |
We have our example prompt, which we have initialized here. 00:14:38.160 |
We have the prefix, suffix, input variables, right? 00:14:42.280 |
These input variables are not the same as the ones going into the example prompt here, 00:14:46.840 |
because this is actually just the query from the user. 00:14:53.880 |
The example separator is just what it's going to use to separate the prefix, examples, and suffix. 00:15:13.280 |
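Assembled, the pieces look something like this; the imports are the classic early-LangChain ones, and all example wording is illustrative:

```python
from langchain import FewShotPromptTemplate, PromptTemplate

# Each example has a query and an answer (wording is illustrative).
examples = [
    {"query": "How are you?",
     "answer": "I can't complain, but sometimes I still do."},
    {"query": "What time is it?",
     "answer": "It's time to get a watch."},
]

# The example template formats a single example.
example_template = """
User: {query}
AI: {answer}
"""

example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_template,
)

# The previous prompt broken apart: the instructions become the
# prefix, the user input and output indicator become the suffix.
prefix = """The following is a conversation with an AI assistant.
The assistant is typically sarcastic and witty, producing creative
and amusing responses to the users' questions. Here are some examples:
"""

suffix = """
User: {query}
AI: """

few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n",
)
```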
And we see that we've separated each one of these pieces with it. 00:15:22.080 |
And then to generate with this, we do the same again. 00:15:25.360 |
Okay, so we have our few-shot prompt template. 00:15:31.480 |
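Generating with it follows the same pattern as before:

```python
# Format the few-shot template with the user's query and generate.
query = "What is the meaning of life?"
print(openai(few_shot_prompt_template.format(query=query)))
```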
Doesn't like whatever I've done to the prompt. 00:15:39.320 |
I don't think that should make a big difference. 00:15:41.280 |
And I'll just change the separator a little bit as well. 00:15:45.280 |
Okay, and then we get our sort of joke answers, 42 again. 00:15:54.160 |
Again, it's not perfect, but it's just an example. 00:15:57.400 |
Now, what I actually want to show you is 00:16:00.640 |
why we would use this over just feeding things in manually with an f-string. 00:16:05.200 |
Well, there's also a little bit more logic that we can use. 00:16:14.720 |
With few-shot learning, it's generally better to feed in more examples, 00:16:24.760 |
whether that's for traditional training or for feeding the examples into our prompt. 00:16:28.320 |
So what I've done here is created a lot of these examples. 00:16:44.800 |
Now, there are reasons to limit the number of examples we're actually feeding into it. 00:16:53.680 |
For one, the examples add a lot of text separating the instructions and the query itself. 00:16:58.680 |
Sometimes that can be distracting for the model. 00:17:02.160 |
And on the other hand, we can actually add in 00:17:05.480 |
so many examples that we exceed the maximum context window. 00:17:24.640 |
Every model, including the OpenAI model we're using here, has a maximum context window, 00:17:32.960 |
a token limit shared by the prompt and the completion, and we definitely don't want to go over it. 00:17:42.960 |
So we might want to limit the number of examples for that reason, 00:17:47.760 |
and we might also want to vary the number of examples with the length of the user's query. 00:17:54.560 |
So if the user just has like a small three-word query, 00:18:05.760 |
we have room for more examples; for a long query, we might want to limit the number of examples. 00:18:10.200 |
And that is where we would use something like LangChain's LengthBasedExampleSelector. 00:18:27.080 |
To it, we'd feed in our examples and the example prompt that we created earlier, 00:18:30.880 |
and then we'd also set the maximum length. 00:18:51.200 |
That length is counted in words: the text is split based on newlines and spaces, 00:18:56.640 |
and the number of words we get there is what's measured against the limit. 00:19:12.200 |
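A sketch of that selector, with the import path from early LangChain versions and an illustrative max_length:

```python
from langchain.prompts.example_selector import LengthBasedExampleSelector

# Selects as many examples as fit within max_length, which is
# measured in words (text split on newlines and spaces).
example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=50,
)
```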
Then we create what I'm gonna call the dynamic prompt template. 00:19:20.960 |
So in here before, we just put in the examples directly. 00:19:28.120 |
That's just saying feed in all the examples every time. 00:19:31.160 |
This time, we've already fed our list of examples into the selector, 00:19:43.680 |
and it will pull through a certain number of them based on whatever prompt 00:19:47.240 |
this few-shot prompt template will receive later on. 00:19:58.160 |
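In code, the only change from the earlier template is swapping the fixed examples list for the selector:

```python
# Same prefix/suffix as before, but examples are now chosen
# dynamically by the length-based selector instead of being fixed.
dynamic_prompt_template = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n",
)
```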
And what we're gonna do is pass in quite a small prompt here. 00:20:04.480 |
And we can see there are a few examples included. 00:20:17.880 |
And we get this kind of sarcastic, jokey answer. 00:20:24.800 |
So this is what I mean when I'm saying occasionally a user will input a much longer query. 00:20:31.200 |
So we have one here where they're kind of just rambling on, right? 00:20:42.520 |
And with that, we actually get just one example being pulled through. 00:21:02.120 |
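A quick way to see that behavior; both queries here are illustrative, not the video's exact ones:

```python
# A short query leaves room for several examples...
print(dynamic_prompt_template.format(query="How do birds fly?"))

# ...while a long, rambling query leaves room for only one.
long_query = (
    "If I am in America, and I want to call someone in another "
    "country, maybe somewhere in Europe, what is the best way to "
    "go about doing that?"
)
print(dynamic_prompt_template.format(query=long_query))
```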
So we can increase the limit: we take the prompt template, we recreate it, 00:21:06.040 |
and then run that again with the same long question. 00:21:14.400 |
Now more examples come through, because we've just doubled the number of example words allowed. 00:21:43.720 |
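The change is just a larger max_length (the doubled value is illustrative):

```python
# Recreate the selector with double the word budget, rebuild the
# dynamic template, and re-run the same long query.
example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=100,
)
dynamic_prompt_template = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n",
)
print(dynamic_prompt_template.format(query=long_query))
```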
There are other example selectors too, like the similarity selector. 00:21:47.360 |
With that one, we embed our examples as vector embeddings, and then we calculate similarity between them 00:21:56.800 |
and the incoming query, selecting the most relevant examples 00:21:59.000 |
rather than just kind of filling up with examples that are maybe not so relevant to the current query. 00:22:04.920 |
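A sketch of that similarity-based selector, assuming OpenAI embeddings and the FAISS vector store (which needs faiss-cpu installed); the k value is illustrative:

```python
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

similarity_selector = SemanticSimilarityExampleSelector.from_examples(
    examples,            # same example dicts as before
    OpenAIEmbeddings(),  # embeds each example as a vector
    FAISS,               # vector store used for the similarity search
    k=2,                 # number of most-similar examples to select
)
```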
This one is very new: the n-gram overlap example selector. 00:22:16.920 |
You know, as you've seen, we've just gone through 00:22:19.720 |
the basics of prompt templates and few-shot 00:22:22.520 |
prompt templates with a very simple example selector. 00:22:29.120 |
So with that in mind, I'm going to leave it there 00:22:38.920 |
and I will see you again in the next one, bye.