
Prompt Templates for GPT 3.5 and other LLMs - LangChain #2


0:00 Why prompts are important
2:42 Structure of prompts
4:10 LangChain code setup
5:56 LangChain's PromptTemplates
8:34 Few-shot learning with LLMs
13:04 Few-shot prompt templates in LangChain
16:09 Length-based example selectors
21:19 Other LangChain example selectors
22:12 Final notes on prompts + LangChain


00:00:00.000 | Today we're taking another look at the LangChain library
00:00:02.840 | and we're gonna be focusing on what are called
00:00:05.040 | prompt templates, which are a very core component
00:00:07.960 | of the library and this mirrors the usefulness of prompts
00:00:12.200 | for large language models in general.
00:00:14.560 | Now, although prompts maybe don't seem as interesting
00:00:17.840 | as the models themselves,
00:00:19.760 | they're actually a very critical component,
00:00:22.280 | particularly in today's world of large language models.
00:00:25.640 | The reason I say that is in the past,
00:00:28.960 | when we consider different tasks within language,
00:00:32.560 | we would use different models for those.
00:00:34.720 | So named entity recognition or question answering,
00:00:37.960 | there were different models trained
00:00:39.280 | for each one of those purposes.
00:00:40.680 | Now, the separation between these tasks
00:00:43.960 | has over time decreased
00:00:46.560 | with the introduction of transformer models.
00:00:48.880 | It became the case that you would pre-train
00:00:51.320 | a single big language model like BERT
00:00:54.160 | and then you would perform transfer learning
00:00:56.640 | in order to just change a couple of layers
00:00:58.320 | at the end of the network
00:00:59.840 | in order to adapt it to different tasks.
00:01:02.280 | And with the more recent adoption of large language models,
00:01:07.080 | the separation between different use cases
00:01:10.520 | has decreased even more.
00:01:13.200 | So now, rather than actually using different layers
00:01:17.000 | at the end of the same model,
00:01:19.240 | like we did with transformer models
00:01:20.840 | or just using completely different models
00:01:22.600 | as we may have done before using transformer models,
00:01:25.840 | we now actually use the same model
00:01:28.680 | for completely different tasks.
00:01:30.240 | Now, things like named entity recognition,
00:01:32.600 | question answering, summarization, even translation
00:01:36.440 | are all done by the same models.
00:01:39.200 | The only thing that actually changes now is the prompt,
00:01:42.200 | the input that we feed into the model.
00:01:44.240 | We literally just say, can you do this?
00:01:47.000 | Or can you do something else, right?
00:01:49.440 | That is all we're changing now.
00:01:51.360 | So the models themselves no longer need changing,
00:01:54.200 | it's actually just the inputs to those models that change
00:01:57.960 | in order for us to modify the task that we are performing.
00:02:02.680 | So with large language models,
00:02:04.360 | it turns out that prompts are the most important thing
00:02:08.160 | for us to learn.
00:02:09.200 | The large language models themselves
00:02:10.560 | have already been trained.
00:02:11.560 | Sure, we can fine tune them.
00:02:13.120 | Sure, we can add an external knowledge base.
00:02:15.840 | Sure, we can do all these other things.
00:02:17.480 | But in the end, one of the key components for us to learn
00:02:20.960 | when we're using large language models
00:02:22.760 | is how to do prompts correctly.
00:02:24.720 | The LangChain library recognizes the importance of prompts
00:02:28.480 | and they have built an entire object class
00:02:30.800 | or a few object classes for them.
00:02:33.120 | In this video, that's what we're going to talk about.
00:02:35.560 | We're going to talk about prompt templates
00:02:37.360 | and a few of the other things
00:02:39.440 | that are kind of parallel to prompt templates.
00:02:41.960 | So we're going to get started with just taking a look
00:02:44.800 | at this simple prompt here and we're going to break it down.
00:02:47.800 | So the top here, we have the instructions of the prompt.
00:02:51.000 | Here, we have context or external information.
00:02:53.880 | Here, a query.
00:02:55.280 | And here is what we call an output indicator.
00:02:58.560 | Now, each one of these components serves a purpose.
00:03:01.760 | But actually, when we look at maybe
00:03:03.480 | what we would usually put into a large language model
00:03:06.560 | or what a user would put into a large language model,
00:03:08.840 | it's only this little bit here, okay?
00:03:11.160 | This little bit, which libraries and model providers
00:03:13.160 | offer large language models?
00:03:14.720 | That's all we're actually expecting our users
00:03:16.320 | to put in there.
00:03:17.160 | So that would be our query.
00:03:19.120 | And what we're actually doing here is considering
00:03:22.000 | that that's the only thing our user is going to be inputting.
00:03:24.840 | We're actually providing all this other information
00:03:26.720 | in order to kind of guide the large language model
00:03:30.520 | to answer the question in the way
00:03:33.040 | that we think our user would like the question
00:03:35.480 | to be answered.
00:03:36.320 | In this case, we're doing what I would call factual Q&A,
00:03:39.720 | which is what you can see here,
00:03:41.640 | answer the question based on the context below.
00:03:44.320 | So based on this information here.
00:03:46.520 | If it can't be answered, I want you to say,
00:03:48.920 | I don't know, okay?
00:03:50.360 | That's what I would call factual Q&A.
00:03:52.000 | So we basically don't want to answer the question
00:03:54.480 | if we can't find the information behind that answer.
00:03:58.240 | And the reason we might want to do that
00:03:59.400 | is because large language models
00:04:01.120 | have this very strong tendency to make things up
00:04:04.360 | and make it seem super convincing.
00:04:06.000 | So it can be good to do this sort of thing
00:04:08.320 | in order to avoid that.
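The four components described above can be sketched as plain strings joined into one prompt. This is a minimal illustration, not the notebook's exact code: the instruction wording paraphrases what is read out in the video, and the context is a labeled placeholder because the on-screen context is not in the transcript.

```python
# Four-part prompt: instructions, context, query, output indicator.
# The instruction wording is a paraphrase of what is read out in the video.
instructions = (
    "Answer the question based on the context below. "
    "If the question cannot be answered using the information provided, "
    'answer with "I don\'t know".'
)

# Placeholder (assumption): in practice this comes from an external source.
context = "Context: <external information retrieved for this question>"

# The only part a user would actually type.
query = "Question: Which libraries and model providers offer large language models?"

# The output indicator marks where the model should begin its answer.
output_indicator = "Answer: "

prompt = "\n\n".join([instructions, context, query, output_indicator])
print(prompt)
```

Only the `query` line changes from user to user; everything else is scaffolding that steers the model toward the "factual Q&A" behaviour.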
00:04:09.920 | Now, let's go to our code.
00:04:12.160 | There are a few things that we need to install here.
00:04:14.440 | So pip install, LangChain and OpenAI.
00:04:17.400 | Of course, you can do this with other frameworks as well.
00:04:20.800 | It doesn't have to be OpenAI.
00:04:22.000 | You can use Cohere, you can use Hugging Face.
00:04:24.680 | It's completely up to you.
00:04:25.960 | But for what we're doing here,
00:04:27.520 | the OpenAI model is very good.
00:04:30.360 | So here's our prompt.
00:04:31.880 | This is exactly the same as what I showed you before.
00:04:33.880 | So I'm going to run this.
00:04:35.360 | And you can see, I just can't explain
00:04:37.200 | what I just explained there.
00:04:38.360 | Instructions, context, question, output indicator.
00:04:41.320 | So using LangChain,
00:04:44.280 | first thing we want to do is actually initialize a model.
00:04:47.480 | So I'm going to go OpenAI API key here.
00:04:51.000 | So I've already created this variable.
00:04:53.480 | This just contains my API key,
00:04:55.560 | which you can get from,
00:04:58.040 | if you click on this link here,
00:04:59.280 | there will be a link to this notebook,
00:05:03.080 | either at the top of the screen now
00:05:04.320 | or in the video description.
00:05:05.920 | Otherwise, what you can do is head over
00:05:08.120 | to this web address here.
00:05:09.560 | So
00:05:13.400 | That may change in the future,
00:05:15.160 | but for now, that is where you would go.
00:05:17.200 | So we initialize our model.
00:05:19.440 | I'm going to run this.
00:05:20.840 | And then what we can do is,
00:05:22.800 | we're going to just make a generation
00:05:24.400 | from the prompt that we just created.
00:05:25.720 | So the prompt up here, right?
00:05:28.400 | This is just going to extract a few things.
00:05:31.640 | Okay, looks good.
00:05:33.120 | The only problem is like,
00:05:34.240 | we wouldn't typically want to write all of this
00:05:36.160 | within our prompt in this format, right?
00:05:38.920 | So like facades here is the user's query.
00:05:42.680 | So the user should be inputting whatever is here.
00:05:46.480 | And as well as that, we have these contexts here.
00:05:49.640 | This would actually come in
00:05:51.000 | as an external source of information.
00:05:52.680 | So we also wouldn't hard-code that into our code either.
00:05:56.240 | So Lightning Chain has something called prompt templates,
00:05:59.680 | which will help us handle this.
00:06:02.440 | Now, for now, I'm just going to keep the context in there.
00:06:05.160 | But what I am going to do is replace the query,
00:06:08.160 | the user's query with this.
00:06:10.120 | It just kind of looks like an F string in Python,
00:06:14.000 | but it's not an F string.
00:06:15.240 | Otherwise, we would have this up here.
00:06:17.680 | In fact, it is actually just a plain string,
00:06:21.320 | but it will be interpreted like that by the prompt template.
00:06:25.640 | So what we need to do here is just replace
00:06:28.840 | where we would expect the query to go with query.
00:06:32.640 | And we need to make sure that within the input variables
00:06:35.280 | of this prompt template object,
00:06:37.200 | which we have imported here,
00:06:39.160 | we need to make sure that this here
00:06:41.520 | aligns with our F string type thing here.
00:06:45.600 | And then after that, we just have our template,
00:06:48.000 | which is obviously just this here.
00:06:50.080 | And that will create our prompt template.
00:06:52.640 | Now, if we would like, we can just insert a query here.
00:06:57.000 | Okay, so you can see what I'm doing.
00:06:58.720 | We have prompt template.format.
00:07:01.760 | Now we have this query,
00:07:02.840 | which is just going to be the question
00:07:04.640 | that we had before, right?
00:07:05.880 | And we can run this, print it,
00:07:08.840 | and we can see that we now have the same text,
00:07:10.920 | but now we have our query in there instead.
00:07:12.960 | And we can change this,
00:07:14.400 | like what is a large language model or something?
00:07:17.520 | What is a large language model?
00:07:19.480 | Right, we could put that in there
00:07:21.040 | and it would change our query here.
00:07:23.560 | Now, in this case,
00:07:25.200 | we don't actually have an external knowledge-based setup,
00:07:28.800 | so the context doesn't change.
00:07:30.080 | That's fine, this is just an example.
00:07:32.440 | We don't need to worry about that right now.
00:07:35.040 | So what I'm going to do is take the first example
00:07:37.920 | where we have prompt template
00:07:39.600 | and we have the actual question,
00:07:41.440 | and we're going to feed it into this OpenAI object here.
00:07:44.520 | This here is actually our large language model
00:07:47.440 | that we just initialized.
00:07:49.680 | And if we run that, we should get this here.
00:07:54.680 | Okay, so we basically, within these few lines of code,
00:07:59.920 | we kind of just replicated what we did up here,
00:08:03.240 | but a little more dynamically.
00:08:05.720 | So let's come back down and what I'm going to do
00:08:09.680 | is show you why we would actually use this
00:08:12.960 | because honestly, right now, it seems kind of pointless.
00:08:16.480 | For example, we could just put this as an F-string
00:08:19.800 | and write some little code around it.
00:08:22.360 | It wouldn't be that hard.
00:08:23.480 | So what is the point of actually using this?
00:08:25.920 | Well, one, it's just nice, it's easy, it's clean,
00:08:29.520 | but two, this isn't the only thing it does.
00:08:32.520 | So if we come down, we can also do something like this.
00:08:36.480 | So this is called a few-shot prompt templates.
00:08:39.320 | Now, this few-shot prompt template object
00:08:42.760 | is ideal for doing something
00:08:45.280 | that we would call few-shot learning
00:08:47.440 | for our large language models.
00:08:49.720 | And what few-shot learning refers to
00:08:53.360 | is the idea of feeding in a few examples
00:08:57.840 | into a already trained model
00:09:01.160 | and essentially training it on those few examples
00:09:04.240 | so that it can then actually perform well
00:09:07.280 | on a slightly different domain.
00:09:09.800 | Now, the approach to few-shot learning can vary.
00:09:12.520 | In the more traditional sense,
00:09:14.200 | it would be that you're feeding in a few items to the model
00:09:17.840 | and training it on those few items
00:09:19.640 | as you usually would train a ML model.
00:09:22.680 | In this case, we're actually feeding these examples
00:09:26.040 | into the model via the prompt.
00:09:28.640 | But this actually, it seems weird,
00:09:30.760 | but it makes sense because with large language models,
00:09:35.000 | there are two primary sources of knowledge.
00:09:38.400 | Those are the parametric knowledge
00:09:40.760 | and the source knowledge.
00:09:42.200 | The parametric knowledge is the knowledge
00:09:45.320 | that the large language model has learned during training
00:09:49.440 | and stored within the model's weights.
00:09:52.920 | So something like, who was the first man on the moon?
00:09:56.160 | The model is going to be able to answer Neil Armstrong
00:09:59.040 | because it's already learned that information
00:10:00.840 | during training and it's managed to store that information
00:10:03.640 | within the model weights.
00:10:05.280 | The other type of knowledge, source knowledge, is different.
00:10:08.200 | That is where you're actually feeding the knowledge
00:10:11.840 | into the model at inference time via the model input,
00:10:16.520 | i.e. via the prompt.
00:10:19.000 | So considering all of this,
00:10:20.720 | the idea behind Lionchain's few-shot prompt template object
00:10:25.240 | is to provide few-shot learning via the source knowledge,
00:10:29.960 | via the prompt.
00:10:31.080 | And to do this, we just add a few examples to the prompts
00:10:35.600 | that the model will then read
00:10:37.120 | as it's reading everything else.
00:10:39.040 | So you remember earlier on, we had the instructions,
00:10:42.520 | context, query, and output indicator.
00:10:46.320 | In this case, it would be like we have instructions,
00:10:49.480 | examples, query, and output indicator.
00:10:53.240 | Now, let's take a look at where we might want to use this.
00:10:56.720 | Now, in this prompt, I'm saying the following
00:10:59.320 | is a conversation with an AI system.
00:11:01.520 | It is typically sarcastic and witty,
00:11:04.080 | producing creative and amusing responses to the questions.
00:11:07.720 | Here are some examples.
00:11:09.320 | Actually, we're not doing that yet, so let's remove that.
00:11:12.640 | So this is all we have, right?
00:11:14.960 | So we have the instruction,
00:11:16.920 | and then we have what would be the user's query,
00:11:19.640 | and then we have the output indicator.
00:11:21.440 | We set the temperature here to one
00:11:23.120 | so that just increases the randomness of the output,
00:11:25.840 | i.e., it will make it more creative,
00:11:28.760 | and then we can run this, right?
00:11:30.920 | And we get the meaning of life is whatever you make of it.
00:11:33.640 | I mean, to me, it's not sarcastic.
00:11:36.560 | It's not witty or creative.
00:11:38.400 | It's not funny.
00:11:39.240 | So it's not really doing what I want it to do.
00:11:41.800 | So what we can do here is do few-shot learning.
00:11:46.800 | So this is the same.
00:11:49.600 | I've just added here are some examples onto the end there,
00:11:52.960 | and then I'm just adding a couple of examples.
00:11:56.200 | So kind of like sarcastic responses to our user's questions.
00:12:01.200 | How are you?
00:12:02.440 | I can't complain.
00:12:03.640 | What time is it?
00:12:04.480 | It's time to get a watch.
00:12:05.480 | And then I'm going to ask the same question again
00:12:07.320 | at the end, and then we'll see what the model outputs.
00:12:11.240 | And it's not perfect,
00:12:12.240 | but we are more likely to get
00:12:15.200 | kind of like a less serious answer
00:12:17.440 | by putting in these less serious responses.
00:12:21.120 | Now we can probably fine tune this.
00:12:23.160 | Like we can say the assistant is always sarcastic and witty.
00:12:28.160 | Here are some examples, like we can cut this bit out.
00:12:32.280 | And that might help us produce more precise answers.
00:12:37.000 | I need to edit this bit.
00:12:40.320 | And here we get quite a sarcastic answer
00:12:42.040 | of you need to ask someone who's actually living it,
00:12:44.960 | which I think is quite good.
00:12:46.840 | Try a few more.
00:12:48.120 | Okay, somewhere between 42 and a double cheeseburger.
00:12:51.600 | It's good.
00:12:52.520 | 42 again, 42 again, and so on.
00:12:56.440 | So we're getting pretty good answers.
00:12:58.200 | I think we should have gone with this prompt from the start.
00:13:01.920 | Now we come down here.
00:13:04.680 | What we can do is just show you
00:13:06.920 | how these few-shot prompt templates work.
00:13:09.680 | So we import few-shot prompt templates,
00:13:12.520 | and we create these examples.
00:13:14.280 | The examples, each one of them is going to have a query,
00:13:16.600 | and an answer.
00:13:17.600 | Okay, so you can see that here,
00:13:19.520 | this here would be our query,
00:13:21.560 | and this would be our answer.
00:13:23.480 | Okay, so we initialize that,
00:13:24.680 | and then we create what is called a example template.
00:13:28.680 | Same thing as before, it looks like an F-string,
00:13:31.720 | but it actually isn't, or at least not yet.
00:13:35.280 | So we use the example template,
00:13:37.200 | and what we do is we actually create a new prompt template
00:13:40.640 | based on this example template.
00:13:43.840 | Okay, so we're creating like a example prompt.
00:13:47.960 | So it's going to take in the query and an answer this time.
00:13:52.000 | Then we need to break apart our previous prompt
00:13:56.280 | into smaller components.
00:13:58.880 | Okay, so I'm going to, here are a few samples.
00:14:03.640 | We're going to use the same one as we used before.
00:14:06.840 | So this is just the instruction,
00:14:08.720 | and then the suffix here is essentially,
00:14:11.960 | well, actually we have two things.
00:14:13.160 | We have the query itself that the user is going to put in,
00:14:16.480 | and then we have the output indicator.
00:14:18.280 | Then we go ahead and actually initialize
00:14:20.720 | our few-shot prompt template.
00:14:22.480 | We have our examples, which is this list up here.
00:14:26.240 | Also, one thing we should note here
00:14:27.920 | is that these, for every single example,
00:14:30.560 | needs to line up to this, okay?
00:14:33.760 | We have our example prompt, which we have initialized here.
00:14:38.160 | We have the prefix, suffix, input variables, right?
00:14:42.280 | This is not the same as what we have coming into here,
00:14:46.840 | because this is actually just a query from the user.
00:14:49.600 | So it needs to satisfy this part here.
00:14:51.640 | And then we have this example separator.
00:14:53.880 | So example separator is just what it's going to use
00:14:56.600 | to separate each one of those examples
00:14:59.080 | within the prompt that we're building.
00:15:01.080 | So let's run this, and we're going to say,
00:15:04.000 | what is the meaning of life again?
00:15:06.120 | And we'll just print this out so we can see.
00:15:08.400 | So the following excerpts, so on and so on,
00:15:11.440 | this is the same as before.
00:15:13.280 | And we see that we've separated each one of these
00:15:14.800 | with two new lines.
00:15:16.440 | We say, you know, we have all those examples
00:15:19.040 | that we fed in through that list, okay?
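The assembly just described (prefix, formatted examples, suffix, joined by a separator) can be sketched in plain Python. This is an illustration of the idea rather than the library's code: `build_few_shot_prompt` is a hypothetical helper, the `User:`/`AI:` labels are assumptions, and the prefix paraphrases the instruction read out earlier. The two examples are the ones quoted in the video.

```python
# Each example carries the same keys that the example template expects.
examples = [
    {"query": "How are you?", "answer": "I can't complain."},
    {"query": "What time is it?", "answer": "It's time to get a watch."},
]

# Template applied to every example (labels are assumptions).
example_template = "User: {query}\nAI: {answer}"

# Prefix holds the instructions; suffix holds the user's query and output indicator.
prefix = (
    "The following is a conversation with an AI assistant. "
    "The assistant is typically sarcastic and witty, producing creative "
    "and amusing responses to the user's questions. Here are some examples:"
)
suffix = "User: {query}\nAI: "


def build_few_shot_prompt(query, examples, separator="\n\n"):
    """Join prefix, rendered examples, and the filled-in suffix with the separator."""
    rendered = [example_template.format(**ex) for ex in examples]
    return separator.join([prefix, *rendered, suffix.format(query=query)])


print(build_few_shot_prompt("What is the meaning of life?", examples))
```

Note that the suffix only needs `query`, while each example needs both `query` and `answer`, which matches the distinction drawn between the example prompt and the input variables.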
00:15:22.080 | And then to generate with this, we do the same again.
00:15:25.360 | Okay, so we have our few-shot prompt template.
00:15:27.160 | We use format query to run this.
00:15:29.960 | Okay, run it again.
00:15:31.480 | Doesn't like whatever I've done to the prompt.
00:15:33.400 | So let me come, here are some examples.
00:15:37.200 | So let's change this to some examples.
00:15:39.320 | I don't think that should make a big difference.
00:15:41.280 | And I'll just change the separator a little bit as well.
00:15:45.280 | Okay, and then we get our sort of joke answers, 42 again.
00:15:50.280 | Okay, so we get a few good responses.
00:15:54.160 | Again, it's not perfect, but it's just an example.
00:15:57.400 | Now, what I actually want to show you now is,
00:16:00.640 | why would we also use this over just feeding things in
00:16:03.880 | with an fString?
00:16:05.200 | Well, there's also a little bit more logic that we can use.
00:16:08.320 | So in a lot of cases, naturally,
00:16:11.960 | as with typical machine learning models,
00:16:14.720 | it's better to feed in more examples for training
00:16:18.240 | than less examples.
00:16:20.240 | And we should try and do that as well
00:16:22.760 | with what we are doing here,
00:16:24.760 | whether it's with feeding the examples into our prompt.
00:16:28.320 | So what I've done here is created a lot of these
00:16:32.600 | kind of examples, and we're just going to,
00:16:36.720 | yeah, we can just run these.
00:16:38.400 | Now, we're going to want to feed in
00:16:39.920 | as many of these samples as possible,
00:16:41.600 | but at the same time, we might want to limit
00:16:44.800 | the number of examples we're actually feeding into it.
00:16:47.320 | So there are a few reasons for this.
00:16:49.920 | One, we don't want to create excessive texts
00:16:53.680 | that are separating the instructions and the query itself.
00:16:58.680 | Sometimes that can be distracting for the model.
00:17:02.160 | And on the other hand, we can actually add in
00:17:05.480 | so many examples that we exceed the maximum context window
00:17:09.560 | that the model allows.
00:17:10.760 | So that's basically the number of tokens
00:17:14.880 | from your query or from your prompt
00:17:17.960 | and from your generation.
00:17:21.000 | You add those back together,
00:17:22.160 | and that creates your context window.
00:17:24.640 | Every model, including the OpenAI model we're using here,
00:17:28.680 | has a maximum context window,
00:17:30.760 | and we can't exceed that,
00:17:31.600 | otherwise we're going to throw an error.
00:17:32.960 | So we definitely don't want to go over that limit.
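The context window constraint is simple arithmetic: prompt tokens plus generated tokens must not exceed the model's limit. A minimal sketch, where the window size and token counts are illustrative numbers and not the limits of any particular model:

```python
def fits_context_window(prompt_tokens, max_generation_tokens, context_window):
    """True if the prompt plus the requested generation fits in the window."""
    return prompt_tokens + max_generation_tokens <= context_window


# Illustrative limit only; check the documentation of the model you use.
CONTEXT_WINDOW = 4000

print(fits_context_window(3500, 256, CONTEXT_WINDOW))  # 3756 <= 4000 -> True
print(fits_context_window(3900, 256, CONTEXT_WINDOW))  # 4156 >  4000 -> False
```

Every few-shot example added to the prompt increases `prompt_tokens`, so it eats into the room left for the answer.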
00:17:35.080 | And another thing we might want to consider
00:17:36.880 | is that we don't want to use too many tokens
00:17:40.560 | because it costs money to run this.
00:17:42.960 | So we might also want to limit the number of examples
00:17:45.600 | we're bringing through because of that.
00:17:47.760 | And we might want to limit the number of examples
00:17:51.000 | based on how long the user's query is.
00:17:54.560 | So if the user just has like a small three-word query,
00:17:57.920 | we can include more of our examples.
00:18:01.120 | If the user is like kind of writing us
00:18:04.120 | a little bit of a poem,
00:18:05.760 | then we might want to limit the number of examples
00:18:08.560 | we're bringing through.
00:18:10.200 | And that is where we would use something like this.
00:18:12.640 | So there are a few of these,
00:18:14.720 | what we'd call example selectors.
00:18:17.200 | The most basic of those
00:18:18.880 | is called the length-based example selector.
00:18:22.160 | With a length-based example selector,
00:18:24.800 | we would feed in our list of examples,
00:18:27.080 | we'd feed in our example prompt that we created earlier,
00:18:30.880 | and then we'd also select the maximum length.
00:18:33.920 | What we're doing here anyway,
00:18:34.960 | the default setting is super simple.
00:18:37.760 | All we're doing is splitting
00:18:39.360 | based on new line characters or white space.
00:18:41.960 | So for example, with this text here,
00:18:44.880 | in this first bit, we have eight words,
00:18:48.480 | and then here we have another six words.
00:18:51.200 | So we can split based on new lines and spaces
00:18:54.400 | and we will get this, okay?
00:18:56.640 | And here is the number of words that we have there.
00:18:59.600 | That is all that this is doing.
00:19:01.200 | So when we set max length,
00:19:02.840 | that's where setting the max length
00:19:04.800 | for the number of separate tokens
00:19:06.600 | based on white space and new lines.
00:19:09.800 | So from here, we're going to initialize
00:19:12.200 | the what I'm gonna call dynamic prompt template.
00:19:15.880 | Now, this is just a dynamic version
00:19:19.120 | of our few-shot prompt template.
00:19:20.960 | So in here before, we just put in the examples.
00:19:25.320 | Okay, so we had examples equals examples.
00:19:28.120 | That's just saying feed in all the examples every time.
00:19:31.160 | This time, we've already fed in our list of examples
00:19:34.280 | to the example selector up here.
00:19:37.400 | So we can actually use this example selector
00:19:40.600 | to select from those examples,
00:19:43.680 | a certain number of them based on whatever prompt
00:19:47.240 | this few-shot prompt template will receive later on.
00:19:50.440 | So let's run this.
00:19:52.480 | And actually I need to run up here as well.
00:19:56.760 | So run this.
00:19:58.160 | And what we're gonna do is just quite a small prompt here.
00:20:01.080 | So this would be four tokens, run this.
00:20:04.480 | And we can see there are a few examples here.
00:20:06.760 | So we have four examples in total
00:20:10.840 | before we get to our final part here, right?
00:20:13.800 | And then if we wanted to run that,
00:20:15.120 | we again just pass it through OpenAI, right?
00:20:17.880 | And we get this kind of sarcastic, jokey answer.
00:20:21.920 | Now, let's try and ask a longer question.
00:20:24.800 | So this is what I mean when I'm saying occasionally
00:20:26.960 | maybe someone is going to write you a poem
00:20:29.160 | when they're querying something.
00:20:31.200 | So we have, they're kind of just rambling on, right?
00:20:34.840 | It's much longer.
00:20:35.880 | So what happens if we query with this?
00:20:38.800 | Okay, we can see straight away
00:20:42.520 | that we actually get just one example being pulled through.
00:20:45.560 | So because this is a much longer question,
00:20:48.840 | we're not including as many examples.
00:20:51.080 | And of course we can modify this
00:20:53.840 | as to what makes sense for us.
00:20:56.200 | So we can increase the max length here
00:20:59.560 | and we'll just rerun everything there.
00:21:02.120 | So we have the prompt template, we recreate it
00:21:06.040 | and then run that again with the same long question.
00:21:10.480 | Okay, here.
00:21:11.680 | And we can see that we're actually now
00:21:12.960 | including five various samples
00:21:14.400 | because we've just doubled the number of example words
00:21:17.640 | that are allowed through.
00:21:19.760 | Now, this is just a small example
00:21:22.840 | of what we can do with prompt templates.
00:21:25.360 | For example, if we wanted to use
00:21:27.480 | different example selectors, we can do.
00:21:29.800 | So I showed you the very simple
00:21:32.120 | length-based example selector here.
00:21:34.920 | But we can do what I think is better things
00:21:37.400 | with this as well.
00:21:39.000 | So we can actually base the samples
00:21:40.720 | that we include on similarity.
00:21:43.720 | So we embed our examples as vector embeddings
00:21:47.360 | and then we calculate similarity between them
00:21:49.760 | in order to, when we're asking a question,
00:21:52.600 | always try to include relevant examples
00:21:56.800 | rather than just kind of filling up with examples
00:21:59.000 | that are maybe not so relevant to the current query.
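Similarity-based selection can be sketched with a toy stand-in for embeddings. In this illustration the "embedding" is just a bag of lowercase word counts compared by cosine similarity; a real setup would use a learned embedding model, and `embed`, `cosine`, and `select_by_similarity` are hypothetical helper names, not LangChain functions.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': counts of lowercase words (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_by_similarity(examples, query, k=1):
    """Return the k examples whose queries are most similar to the new query."""
    query_vector = embed(query)
    ranked = sorted(
        examples,
        key=lambda example: cosine(query_vector, embed(example["query"])),
        reverse=True,
    )
    return ranked[:k]


examples = [
    {"query": "How are you?", "answer": "I can't complain."},
    {"query": "What time is it?", "answer": "It's time to get a watch."},
]

print(select_by_similarity(examples, "What time does the shop open?", k=1))
```

Here the new question shares "what" and "time" with the second example and nothing with the first, so the second example is the one pulled into the prompt.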
00:22:02.600 | And then there are a few other ones as well.
00:22:04.920 | This one is very new, the Ngram overlap example selector.
00:22:09.360 | And we're going to cover all of these
00:22:10.920 | at some point in a future video.
00:22:12.760 | But for now, that's it for this video.
00:22:16.920 | You know, as you've seen, we've just gone through
00:22:19.720 | the basics of prompt templates and a few short
00:22:22.520 | prompt templates with a very simple example selector.
00:22:25.760 | And for a lot of use cases,
00:22:27.040 | that's probably all you're going to need.
00:22:29.120 | So with that in mind, I'm going to leave it there
00:22:32.520 | for this video.
00:22:33.720 | So thank you very much for watching.
00:22:36.880 | I hope this has been useful and interesting
00:22:38.920 | and I will see you again in the next one, bye.
00:22:41.320 | (soft music)
00:22:43.720 | (soft music)
00:22:46.120 | (soft music)
00:22:48.520 | (soft music)
00:22:50.920 | (soft music)
00:22:53.320 | (soft music)
00:22:55.720 | (soft music)