Prompt Templates for GPT 3.5 and other LLMs - LangChain #2
Chapters
0:00 Why prompts are important
2:42 Structure of prompts
4:10 LangChain code setup
5:56 LangChain's PromptTemplates
8:34 Few-shot learning with LLMs
13:04 Few-shot prompt templates in LangChain
16:09 Length-based example selectors
21:19 Other LangChain example selectors
22:12 Final notes on prompts + LangChain
Today we're taking another look at the LangChain library 00:00:02.840 |
and we're gonna be focusing on what are called 00:00:05.040 |
prompt templates, which are a very core component 00:00:07.960 |
of the library, and this mirrors the importance of prompts themselves. 00:00:14.560 |
Now, although prompts maybe don't seem that interesting, 00:00:22.280 |
they matter a great deal, particularly in today's world of large language models. 00:00:28.960 |
When we consider different tasks within language, 00:00:34.720 |
like named entity recognition or question answering, 00:01:02.280 |
things have shifted with the more recent adoption of large language models. 00:01:13.200 |
So now, rather than actually using different layers for each task, 00:01:22.600 |
as we may have done before with transformer models, 00:01:32.600 |
question answering, summarization, even translation 00:01:39.200 |
can all be handled by one model. The only thing that actually changes now is the prompt, 00:01:51.360 |
so the models themselves no longer need changing; 00:01:54.200 |
it's actually just the inputs to those models that change 00:01:57.960 |
in order for us to modify the task that we are performing. 00:02:04.360 |
it turns out that prompts are the most important thing to get right. 00:02:17.480 |
In the end, they are one of the key components for us to learn. 00:02:24.720 |
The LangChain library recognizes the importance of prompts, 00:02:33.120 |
and it has built tooling around them. In this video, that's what we're going to talk about, 00:02:39.440 |
along with features that are kind of parallel to prompt templates. 00:02:41.960 |
So we're going to get started with just taking a look 00:02:44.800 |
at this simple prompt here and we're going to break it down. 00:02:47.800 |
So the top here, we have the instructions of the prompt. 00:02:51.000 |
Here, we have context or external information. 00:02:58.560 |
Now, each one of these components serves a purpose. 00:03:03.480 |
What we would usually put into a large language model, 00:03:06.560 |
or what a user would put into a large language model, 00:03:11.160 |
is actually just this little bit, the question "which libraries and model providers..." 00:03:14.720 |
That's all we're actually expecting our users to input. 00:03:19.120 |
And what we're actually doing here is considering 00:03:22.000 |
that that's the only thing our user is going to be inputting. 00:03:24.840 |
We're actually providing all this other information 00:03:26.720 |
in order to kind of guide the large language model 00:03:33.040 |
towards the type of answer that we think our user would like to the question. 00:03:36.320 |
In this case, we're doing what I would call factual Q&A, 00:03:41.640 |
where the instructions say to answer the question based on the context below. 00:03:52.000 |
So we basically don't want to answer the question 00:03:54.480 |
if we can't find the information behind that answer. 00:04:01.120 |
That matters because large language models have this very strong tendency to make things up. 00:04:12.160 |
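Putting those pieces together, the prompt being described looks roughly like this. The wording is a sketch reconstructed from the description, and the question shown is a guess at the truncated one above, not the video's verbatim text:

```python
# A sketch of the prompt structure: instructions, context (external
# information), the user's question, and an output indicator.
# The context and question wording here are illustrative.
prompt = """Answer the question based on the context below. If the
question cannot be answered using the information provided, answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in
NLP, and their strong performance has made them very useful for
developers building NLP applications.

Question: Which libraries and model providers offer LLMs?

Answer: """
```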
There are a few things that we need to install here. 00:04:17.400 |
Of course, you can do this with other providers as well: 00:04:22.000 |
you can use Cohere, you can use Hugging Face. 00:04:31.880 |
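For the OpenAI route used here, the setup would look something like this. The package names are assumed from context, and the classic imports used throughout these sketches assume an early, pre-0.1 version of LangChain:

```python
# Install the two libraries used in this walkthrough (notebook syntax).
!pip install langchain openai
```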
This is exactly the same as what I showed you before: 00:04:38.360 |
instructions, context, question, output indicator. 00:04:44.280 |
The first thing we want to do is actually initialize a model. 00:05:34.240 |
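Initializing the model might look like this; the model name is an assumption, not confirmed by the transcript:

```python
from langchain.llms import OpenAI

# Initialize the OpenAI completion model. The model name is
# illustrative, and a valid key must be set in OPENAI_API_KEY.
openai = OpenAI(model_name="text-davinci-003")
```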
We wouldn't typically want to write all of this out by hand. 00:05:42.680 |
The user should be inputting whatever is here, the question. 00:05:46.480 |
And as well as that, we have these contexts here, 00:05:52.680 |
so we also wouldn't hard-code those into our code either. 00:05:56.240 |
So LangChain has something called prompt templates, 00:06:02.440 |
Now, for now, I'm just going to keep the context in there. 00:06:05.160 |
But what I am going to do is replace the query with a placeholder. 00:06:10.120 |
It just kind of looks like an f-string in Python; it isn't one, 00:06:21.320 |
but it will be interpreted like that by the prompt template. 00:06:28.840 |
So we mark where we would expect the query to go with {query}. 00:06:32.640 |
And we need to make sure that within the input variables we include that same name, query. 00:06:45.600 |
And then after that, we just have our template. 00:06:52.640 |
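A minimal sketch of that template, using the classic top-level import from early LangChain versions (the context text is illustrative):

```python
from langchain import PromptTemplate

# The template reads like an f-string, with {query} marking where
# the user's question will be slotted in. The context is still
# hard-coded for now, as in the video.
template = """Answer the question based on the context below. If the
question cannot be answered using the information provided, answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.

Question: {query}

Answer: """

prompt_template = PromptTemplate(
    input_variables=["query"],
    template=template,
)
```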
Now, if we would like, we can just insert a query here, 00:07:08.840 |
like "what is a large language model?" or something, 00:07:14.400 |
and we can see that we now have the same text with the query slotted in. 00:07:25.200 |
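For example, formatting the template produces the full prompt text:

```python
# Fill in the template and inspect the resulting prompt.
print(prompt_template.format(query="What is a large language model?"))
```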
we don't actually have an external knowledge base set up, 00:07:35.040 |
So what I'm going to do is take the first example 00:07:41.440 |
and we're going to feed it into this OpenAI object here. 00:07:44.520 |
This here is actually our large language model 00:07:54.680 |
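Passing the formatted prompt to that object generates a completion. The query here is the illustrative one guessed at above:

```python
# Generate a completion by calling the LLM on the formatted prompt.
print(openai(prompt_template.format(
    query="Which libraries and model providers offer LLMs?"
)))
```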
Okay, so we basically, within these few lines of code, 00:07:59.920 |
we kind of just replicated what we did up here, 00:08:05.720 |
So let's come back down, and what I'm going to do is explain why we'd use prompt templates at all, 00:08:12.960 |
because honestly, right now, it seems kind of pointless. 00:08:16.480 |
For example, we could just put this as an f-string and format it ourselves. 00:08:25.920 |
Well, one, it's just nice, it's easy, it's clean, and there's also more logic built on top. 00:08:32.520 |
So if we come down, we can also do something like this. 00:08:36.480 |
So this is called a few-shot prompt template. 00:09:01.160 |
The idea is to show the model a few examples, essentially training it on those few examples 00:09:09.800 |
within the prompt itself. Now, the approach to few-shot learning can vary. 00:09:14.200 |
Traditionally, it would be that you're feeding in a few items to the model and training it on them. 00:09:22.680 |
In this case, we're actually feeding these examples in through the prompt at inference time, 00:09:30.760 |
but it makes sense, because with large language models there are two types of knowledge. 00:09:45.320 |
The first, parametric knowledge, is everything that the large language model has learned during training. 00:09:52.920 |
So something like, who was the first man on the moon? 00:09:56.160 |
The model is going to be able to answer Neil Armstrong 00:09:59.040 |
because it's already learned that information 00:10:00.840 |
during training, and it's managed to store that information within its model weights. 00:10:05.280 |
The other type of knowledge, source knowledge, is different. 00:10:08.200 |
That is where you're actually feeding the knowledge 00:10:11.840 |
into the model at inference time via the model input, 00:10:20.720 |
the idea behind LangChain's few-shot prompt template object 00:10:25.240 |
is to provide few-shot learning via the source knowledge, 00:10:31.080 |
And to do this, we just add a few examples to the prompt. 00:10:39.040 |
So you remember earlier on, we had the instructions, context, question, and output indicator. 00:10:46.320 |
In this case, it would be like we have instructions, then some examples, then the user's query. 00:10:53.240 |
Now, let's take a look at where we might want to use this. 00:10:56.720 |
Now, in this prompt, I'm saying the following is a conversation with an AI assistant, 00:11:04.080 |
one producing creative and amusing responses to the questions. 00:11:09.320 |
Actually, we're not doing that yet, so let's remove that. 00:11:16.920 |
and then we have what would be the user's query. 00:11:16.920 |
I've also turned up the temperature, which just increases the randomness of the output, 00:11:23.120 |
And we get the meaning of life is whatever you make of it. 00:11:39.240 |
So it's not really doing what I want it to do. 00:11:41.800 |
So what we can do here is do few-shot learning. 00:11:49.600 |
I've just added "here are some examples" onto the end there, 00:11:52.960 |
and then I'm just adding a couple of examples. 00:11:56.200 |
So kind of like sarcastic responses to our user's questions. 00:12:05.480 |
And then I'm going to ask the same question again 00:12:07.320 |
at the end, and then we'll see what the model outputs. 00:12:23.160 |
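The hand-built version might look roughly like this; the prompt wording and the example Q&A pairs are illustrative reconstructions, not verbatim from the video:

```python
# A few-shot prompt built by hand: instructions, a few examples,
# then the real query at the end.
prompt = """The following is a conversation with an AI assistant.
The assistant is typically sarcastic and witty, producing creative
and amusing responses to the users' questions. Here are some examples:

User: How are you?
AI: I can't complain, but sometimes I still do.

User: What time is it?
AI: It's time to get a watch.

User: What is the meaning of life?
AI: """

print(openai(prompt))
```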
We can tighten this up: we can say the assistant is always sarcastic and witty. 00:12:28.160 |
And the "here are some examples" bit, we can cut that out. 00:12:32.280 |
And that might help us produce more precise answers. 00:12:42.040 |
Now we get answers along the lines of "you need to ask someone who's actually living it", 00:12:48.120 |
or, okay, "somewhere between 42 and a double cheeseburger". 00:12:58.200 |
I think we should have gone with this prompt from the start. 00:13:14.280 |
The examples: each one of them is going to have a query and an answer, 00:13:24.680 |
and then we create what is called an example template. 00:13:28.680 |
Same thing as before, it looks like an f-string, 00:13:37.200 |
and what we do is we actually create a new prompt template from it. 00:13:43.840 |
Okay, so we're creating like an example prompt. 00:13:47.960 |
So it's going to take in a query and an answer this time. 00:13:52.000 |
Then we need to break apart our previous prompt into a prefix and a suffix. 00:13:58.880 |
Okay, so the prefix carries the instructions and the "here are some examples" line. 00:14:03.640 |
We're going to use the same one as we used before. 00:14:13.160 |
We have the query itself that the user is going to put in, 00:14:22.480 |
We have our examples, which is this list up here. 00:14:33.760 |
We have our example prompt, which we have initialized here. 00:14:38.160 |
We have the prefix, suffix, input variables, right? 00:14:42.280 |
These input variables are not the same as the ones going into the example prompt here, 00:14:46.840 |
because this is actually just the query from the user. 00:14:53.880 |
The example separator is just what it's going to use to separate the prefix, examples, and suffix. 00:15:13.280 |
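Assembled, the pieces look something like this; the imports are the classic early-LangChain ones, and all example wording is illustrative:

```python
from langchain import FewShotPromptTemplate, PromptTemplate

# Each example has a query and an answer (wording is illustrative).
examples = [
    {"query": "How are you?",
     "answer": "I can't complain, but sometimes I still do."},
    {"query": "What time is it?",
     "answer": "It's time to get a watch."},
]

# The example template formats a single example.
example_template = """
User: {query}
AI: {answer}
"""

example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_template,
)

# The previous prompt broken apart: the instructions become the
# prefix, the user input and output indicator become the suffix.
prefix = """The following is a conversation with an AI assistant.
The assistant is typically sarcastic and witty, producing creative
and amusing responses to the users' questions. Here are some examples:
"""

suffix = """
User: {query}
AI: """

few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n",
)
```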
And we see that we've separated each one of these pieces with it. 00:15:22.080 |
And then to generate with this, we do the same again. 00:15:25.360 |
Okay, so we have our few-shot prompt template. 00:15:31.480 |
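Generating with it follows the same pattern as before:

```python
# Format the few-shot template with the user's query and generate.
query = "What is the meaning of life?"
print(openai(few_shot_prompt_template.format(query=query)))
```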
Doesn't like whatever I've done to the prompt. 00:15:39.320 |
I don't think that should make a big difference. 00:15:41.280 |
And I'll just change the separator a little bit as well. 00:15:45.280 |
Okay, and then we get our sort of joke answers, 42 again. 00:15:54.160 |
Again, it's not perfect, but it's just an example. 00:15:57.400 |
Now, what I actually want to show you is 00:16:00.640 |
why we would use this over just feeding things in manually with an f-string. 00:16:05.200 |
Well, there's also a little bit more logic that we can use. 00:16:14.720 |
With few-shot learning, it's generally better to feed in more examples, 00:16:24.760 |
whether that's for traditional training or for feeding the examples into our prompt. 00:16:28.320 |
So what I've done here is created a lot of these examples. 00:16:44.800 |
Now, there are reasons to limit the number of examples we're actually feeding into it. 00:16:53.680 |
For one, the examples add a lot of text separating the instructions and the query itself. 00:16:58.680 |
Sometimes that can be distracting for the model. 00:17:02.160 |
And on the other hand, we can actually add in 00:17:05.480 |
so many examples that we exceed the maximum context window. 00:17:24.640 |
Every model, including the OpenAI model we're using here, has a maximum context window, 00:17:32.960 |
a token limit shared by the prompt and the completion, and we definitely don't want to go over it. 00:17:42.960 |
So we might want to limit the number of examples for that reason, 00:17:47.760 |
and we might also want to vary the number of examples with the length of the user's query. 00:17:54.560 |
So if the user just has like a small three-word query, 00:18:05.760 |
we have room for more examples; for a long query, we might want to limit the number of examples. 00:18:10.200 |
And that is where we would use something like LangChain's LengthBasedExampleSelector. 00:18:27.080 |
To it, we'd feed in our examples and the example prompt that we created earlier, 00:18:30.880 |
and then we'd also set the maximum length. 00:18:51.200 |
That length is counted in words: the text is split based on newlines and spaces, 00:18:56.640 |
and the number of words we get there is what's measured against the limit. 00:19:12.200 |
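A sketch of that selector, with the import path from early LangChain versions and an illustrative max_length:

```python
from langchain.prompts.example_selector import LengthBasedExampleSelector

# Selects as many examples as fit within max_length, which is
# measured in words (text split on newlines and spaces).
example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=50,
)
```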
Then we create what I'm gonna call the dynamic prompt template. 00:19:20.960 |
So in here before, we just put in the examples directly. 00:19:28.120 |
That's just saying feed in all the examples every time. 00:19:31.160 |
This time, we've already fed our list of examples into the selector, 00:19:43.680 |
and it will pull through a certain number of them based on whatever prompt 00:19:47.240 |
this few-shot prompt template will receive later on. 00:19:58.160 |
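In code, the only change from the earlier template is swapping the fixed examples list for the selector:

```python
# Same prefix/suffix as before, but examples are now chosen
# dynamically by the length-based selector instead of being fixed.
dynamic_prompt_template = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n",
)
```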
And what we're gonna do is pass in quite a small prompt here. 00:20:04.480 |
And we can see there are a few examples included. 00:20:17.880 |
And we get this kind of sarcastic, jokey answer. 00:20:24.800 |
So this is what I mean when I'm saying occasionally a user will input a much longer query. 00:20:31.200 |
So we have one here where they're kind of just rambling on, right? 00:20:42.520 |
And with that, we actually get just one example being pulled through. 00:21:02.120 |
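A quick way to see that behavior; both queries here are illustrative, not the video's exact ones:

```python
# A short query leaves room for several examples...
print(dynamic_prompt_template.format(query="How do birds fly?"))

# ...while a long, rambling query leaves room for only one.
long_query = (
    "If I am in America, and I want to call someone in another "
    "country, maybe somewhere in Europe, what is the best way to "
    "go about doing that?"
)
print(dynamic_prompt_template.format(query=long_query))
```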
So we can increase the limit: we take the prompt template, we recreate it, 00:21:06.040 |
and then run that again with the same long question. 00:21:14.400 |
Now more examples come through, because we've just doubled the number of example words allowed. 00:21:43.720 |
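The change is just a larger max_length (the doubled value is illustrative):

```python
# Recreate the selector with double the word budget, rebuild the
# dynamic template, and re-run the same long query.
example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=100,
)
dynamic_prompt_template = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n",
)
print(dynamic_prompt_template.format(query=long_query))
```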
There are other example selectors too, like the similarity selector. 00:21:47.360 |
With that one, we embed our examples as vector embeddings, and then we calculate similarity between them 00:21:56.800 |
and the incoming query, selecting the most relevant examples 00:21:59.000 |
rather than just kind of filling up with examples that are maybe not so relevant to the current query. 00:22:04.920 |
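A sketch of that similarity-based selector, assuming OpenAI embeddings and the FAISS vector store (which needs faiss-cpu installed); the k value is illustrative:

```python
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

similarity_selector = SemanticSimilarityExampleSelector.from_examples(
    examples,            # same example dicts as before
    OpenAIEmbeddings(),  # embeds each example as a vector
    FAISS,               # vector store used for the similarity search
    k=2,                 # number of most-similar examples to select
)
```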
This one is very new: the n-gram overlap example selector. 00:22:16.920 |
You know, as you've seen, we've just gone through 00:22:19.720 |
the basics of prompt templates and few-shot 00:22:22.520 |
prompt templates with a very simple example selector. 00:22:29.120 |
So with that in mind, I'm going to leave it there 00:22:38.920 |
and I will see you again in the next one, bye.