NEW GPT-4 Function Calling Model!
Chapters
0:00 New GPT-4 and ChatGPT Functions
1:13 Function Calling in OpenAI Docs
1:43 Implementation in Python
3:25 Creating a Function for GPT 4
4:37 Function Calling Instructions
6:11 Calling ChatCompletion with Functions
7:41 Using GPT 4 to Run Functions
9:25 Creating Image Generation Pipeline
11:02 Adding Image Generation to Function
13:43 Creating Product Pages with Images
15:16 Final Thoughts on OpenAI Function Calling
00:00:00.000 |
OpenAI have just released new functionality for two new models. 00:00:05.840 |
Those models are the updated GPT-4 and GPT-3.5 models. 00:00:11.920 |
This new feature is called function calling and it essentially allows us to pass to GPT-4 00:00:19.600 |
and GPT-3.5 a description of a Python function. Based on that description, we can 00:00:27.600 |
ask the model a question, and if the model sees that this question needs to use one of the 00:00:34.080 |
function descriptions that you've passed to it, it will go ahead and return the parameters 00:00:40.080 |
that you need in order to call that function in the way that you have described to GPT-4 or GPT-3.5. 00:00:48.480 |
So it is essentially the tool usage that you would get from LangChain but directly within 00:00:57.280 |
the OpenAI API, and just from some initial testing it seems to work pretty well. 00:01:04.400 |
So let's have a quick look at what this is and also dive into the code and how we can 00:01:11.280 |
actually implement this ourselves. So here we can see that the docs describe this function 00:01:17.200 |
calling as essentially what I just said, right? We can describe functions to GPT-3.5 and GPT-4 00:01:22.880 |
and the model will choose whether it should output a JSON object containing arguments to 00:01:29.360 |
call those functions. And this is an important point here: the chat completions API does not 00:01:34.800 |
call the function, it just provides you with those parameters in JSON format. 00:01:41.760 |
So let's go ahead and actually see how this works. So I'm going to be using Colab, we don't need all 00:01:51.440 |
of these prerequisite libraries just to understand how this works but I wanted to go with a kind of 00:01:56.400 |
interesting example. So later on we are going to be using an image generation model from Hugging 00:02:01.760 |
Face diffusers. So for that component of this notebook, again you don't need to get to that 00:02:09.360 |
point in order to understand this, but for that component you do need basically all of these 00:02:16.560 |
prerequisites or libraries installed and you should also make sure that you are using a CUDA-enabled 00:02:22.240 |
GPU because otherwise it's going to take a very long time. So you can switch to a GPU runtime 00:02:31.280 |
within Colab by going to Runtime at the top, going to Change runtime type and making sure you 00:02:39.920 |
have GPU selected in here, and you can use the T4 GPU on standard, which I believe is available on the 00:02:47.040 |
free version of Colab as well. So let's go ahead and run this. So one thing that we will need here 00:02:53.840 |
and we will need to pay for, unless you are within the free usage of OpenAI, is an OpenAI API key. So 00:03:02.240 |
you can get one of those over at platform.openai.com. There'll be a link to that at the top of the video 00:03:08.400 |
right now. Yep, once you have that, all you need to do is just enter your API key in here, run this, 00:03:15.600 |
and you should see this, which tells us that you are authenticated to use the OpenAI API. 00:03:24.080 |
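As a minimal sketch, that setup cell looks something like the following, assuming the pre-1.0 `openai` Python client that was current at the time; the environment variable name is just an illustration:

```python
# !pip install openai   (plus diffusers, transformers, accelerate, torch for the image part later)
import os
import openai

# Authenticate with the OpenAI API.
openai.api_key = os.getenv("OPENAI_API_KEY")

# Quick sanity check: if the key is valid, listing models succeeds.
models = openai.Model.list()
print(f"Authenticated, {len(models['data'])} models available")
```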
Okay, it looks good. So what we're going to do is first create a function that we will be using 00:03:29.040 |
GPT-4 to call. Also, if you still don't have access to GPT-4, that's fine. 00:03:35.680 |
I'll show you where you can switch this out for GPT-3.5 as well, and you should get pretty similar 00:03:42.240 |
results. Okay, so this function that we're going to create is a page builder for a product. So 00:03:48.960 |
imagine you want to generate a ton of product pages. You're going to use this function here 00:03:57.600 |
and what you're going to want to do is ask GPT-4 or 3.5 to give you a title and some copy text. So 00:04:08.240 |
like the marketing text for that product and we're just going to take that into our HTML code here. 00:04:14.560 |
We're going to write that to file and then just for us in the notebook, we're going to view that 00:04:20.160 |
file as well. So we run that, run it again, and you should see in a moment we get this. Okay, 00:04:27.200 |
so I've just manually put in these two parameters here: the title of the product and its copy text. 00:04:34.080 |
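As a rough sketch, the page builder might look something like this; the exact HTML and styling in the notebook differ, and `IPython.display` is just one way to preview the file:

```python
from IPython.display import HTML, display

def page_builder(title: str, copy_text: str) -> str:
    """Build a simple product page from a title and some marketing copy."""
    html = f"""
    <html>
      <body>
        <h1>{title}</h1>
        <p>{copy_text}</p>
      </body>
    </html>
    """
    # Write the page to file so we can also open it outside the notebook.
    with open("index.html", "w") as f:
        f.write(html)
    # Preview it inline in the notebook.
    display(HTML(html))
    return html

# Called manually with placeholder values, just to check it renders:
page_builder("Example Product", "Short placeholder marketing copy.")
```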
Okay, now what we want to do is give GPT-4 or GPT-3.5 instructions on how to 00:04:45.120 |
use this function. Okay, so to do that we need these three items at high level. So we have the 00:04:54.160 |
name of this function that GPT will use. It's going to be called page builder. It doesn't necessarily 00:05:01.920 |
need to align to the function name here. You can put anything you want in there, 00:05:06.720 |
but basically you'll be using this to identify which function GPT wants you to use. 00:05:13.920 |
We need a short description, so creates product web pages. You can make that longer or shorter 00:05:21.680 |
depending on what you need and then parameters. So this is probably the most important part. 00:05:26.720 |
So these describe the inputs into your function, right? So those actual inputs are within the 00:05:38.080 |
properties here. Okay, and we have two. We have title and copy text, right? So both of these are 00:05:45.520 |
strings and I'm just describing what they actually are. Okay, so the name of the product and then 00:05:50.560 |
here marketing copy that describes and sells the product. That's all it is. And then after that, 00:05:57.920 |
so within parameters, we say what is required and that is all of the parameters. So title and 00:06:04.320 |
copy text. Okay, so this is basically the description on how GPT can use our function. 00:06:10.800 |
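Written out, that description is a plain dictionary following OpenAI's function calling schema, roughly like this:

```python
page_builder_function = {
    "name": "page_builder",
    "description": "Creates product web pages",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {
                "type": "string",
                "description": "Name of the product",
            },
            "copy_text": {
                "type": "string",
                "description": "Marketing copy that describes and sells the product",
            },
        },
        "required": ["title", "copy_text"],
    },
}
```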
So with that, what we're going to do is go to the chat completions endpoint. We're going to create 00:06:17.920 |
our prompt: "Create a web page for a new cutting edge mango cutting machine." That's what I want 00:06:23.120 |
GPT to create for us. And then as usual, we use ChatCompletion.create. So these lines I've 00:06:32.640 |
highlighted here, we would always run those. Okay. The only difference now is, one, we're changing the 00:06:39.600 |
model. So we're using gpt-4-0613. If, again, you don't have access to GPT-4 yet, you should use 00:06:47.200 |
this model here, gpt-3.5-turbo-0613. And then the other difference, the difference that allows 00:06:56.560 |
us to use this function calling feature is we add functions and then we pass in a list of functions 00:07:03.120 |
that we would like GPT to be able to use. And when I say functions, I mean, these kind of function 00:07:10.560 |
descriptor items. Right. So let's run that. Okay. 00:07:19.200 |
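In full, that call looks something like this (pre-1.0 `openai` client, reusing the `page_builder_function` description from above; swap in gpt-3.5-turbo-0613 if you don't have GPT-4 access):

```python
import openai

prompt = "Create a web page for a new cutting edge mango cutting machine."

response = openai.ChatCompletion.create(
    model="gpt-4-0613",                    # or "gpt-3.5-turbo-0613"
    messages=[{"role": "user", "content": prompt}],
    functions=[page_builder_function],     # list of function descriptions
)
```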
And in the response from gpt-4-0613 we can see we get all of this. So what do we have here? Basically we can see in here 00:07:27.040 |
we have the message, we have function call. Okay. So this is the function call that GPT is creating 00:07:35.760 |
for us. And we can also see that we have this new finish reason. So programmatically, how are we 00:07:42.400 |
going to identify that GPT 4 wants us to call our function? Well, we're going to come down to here. 00:07:49.360 |
We're going to say, okay, if the finish reason is equal to function call, that means we need to call 00:07:54.240 |
a function. Okay. So we run that and okay. Yes, we should call a function. And then based on that, 00:08:00.880 |
if it says we should call a function, we need to extract the name of that function. So if we have 00:08:06.800 |
multiple functions, this will be useful. So we know which function to use. And we also need to 00:08:11.760 |
extract the arguments that we should pass to that function. Okay. So we do that. Also note that this 00:08:19.600 |
is actually a string. So we convert it into a dictionary. Okay. And we get this. Right. 00:08:27.360 |
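A minimal sketch of that check-and-extract logic, using the response fields from the chat completions schema:

```python
import json

choice = response["choices"][0]

# The model signals that it wants us to call a function via the finish reason.
if choice["finish_reason"] == "function_call":
    function_call = choice["message"]["function_call"]
    name = function_call["name"]                         # which function to call
    arguments = json.loads(function_call["arguments"])   # JSON string -> dict

    # Pass the extracted arguments straight in with keyword-argument syntax.
    if name == "page_builder":
        page_builder(**arguments)
```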
So with that, we can use the keyword arguments syntax here, and we can just pass all those 00:08:34.640 |
straight into our page builder function. Okay. We run that and we get this. So "The Mango Master", 00:08:41.760 |
and we get some copy that is definitely better than anything I could write myself. So that is 00:08:48.640 |
pretty cool. But I think we can probably do better, especially with all these other models that are 00:08:55.920 |
available to us. Right. So this is, yeah, it's a nice text, but I don't know of any product pages 00:09:02.720 |
where it's just text. There's always images as well. So can we also get this GPT-4 or GPT-3.5 00:09:13.040 |
function to generate an image for us? And yes, we can, just not with GPT-4 or GPT-3.5. We 00:09:20.960 |
need to use a different model. So we're going to go ahead and do that. So we're going to use 00:09:26.480 |
Stability AI's Stable Diffusion model. So this is Stable Diffusion 2.1. And this is just how 00:09:35.840 |
we initialize it. Okay. So we're using Hugging Face Diffusers here. Again, this is actually 00:09:41.040 |
where we need to use that CUDA-enabled GPU. You can use CPU, it's just going to take a long 00:09:45.680 |
time for it to generate the image for you. And all of this up to, all of this I've highlighted 00:09:54.480 |
is what we need in order to initialize the model or the diffusion pipeline. 00:10:00.320 |
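For reference, initializing Stable Diffusion 2.1 with Diffusers looks roughly like this; the half-precision dtype and scheduler choice are typical settings rather than a guarantee of exactly what the notebook uses:

```python
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

model_id = "stabilityai/stable-diffusion-2-1"

# Load the pipeline in half precision and move it onto the GPU.
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")
```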
Then here, I'm just running a test. Okay. So my test prompt is a photo of an astronaut riding a 00:10:08.320 |
horse on Mars. And then we're going to generate that image with this. And I'm going to display 00:10:15.360 |
that image to make sure it is actually working. So that will take, the first time we run this, 00:10:22.800 |
it's going to take a long time just to download the model, so you'll just have to wait a moment for that. 00:10:27.760 |
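The test itself is just a single generation and an inline display, along these lines:

```python
prompt = "a photo of an astronaut riding a horse on mars"

# Generate one image from the test prompt and show it in the notebook.
image = pipe(prompt).images[0]
image
```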
Okay. So I've just switched across to this notebook, which I've already pre-run. 00:10:32.960 |
But continuing from where we just were, after loading everything, you can just run the same 00:10:41.120 |
thing and it will be very quick. So this is 20 seconds on the T4 GPU. So if you have something 00:10:50.400 |
bigger, like an A100, it can also go faster. So this is what you get. So it's pretty decent. A 00:10:57.760 |
photo of an astronaut riding a horse on Mars works relatively well. Now, the thing that we need to do 00:11:06.000 |
now is integrate this image generation logic into the function that we were giving to GPT-4 before. 00:11:15.200 |
So our page builder function. So that's actually very simple. All we're going to do is we're going 00:11:20.640 |
to provide an image description. So this is going to be the prompt to generate the image. 00:11:24.800 |
And we're going to take that. We're going to pass it into the pipe that will generate the image, 00:11:30.640 |
we extract it, and then we're going to save it to file. And then we're going to use this image on 00:11:36.400 |
file as the image in our product web page. Okay. So that means we're going to have some slight 00:11:45.520 |
differences in the HTML. So the only real change to the HTML here is that we just add an image tag with a source. 00:11:53.680 |
We don't need to make this dynamic because we're just going to save the image file. So it's going 00:11:58.560 |
to be product.jpg all the time. And yeah, I mean, that's basically it. We just have some CSS to make 00:12:06.320 |
sure things are styled kind of correctly. And that's all we need. So from now we would just, 00:12:15.600 |
as before, we're going to save our index.html to file. And that is our function modified; we didn't really change much, we just added that image generation step. 00:12:25.440 |
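A sketch of the modified function, reusing the `pipe` we initialized above; as before, the HTML, styling, and file names are illustrative:

```python
from IPython.display import HTML, display

def page_builder(title: str, copy_text: str, image_description: str) -> str:
    """Build a product page with a generated hero image."""
    # Generate the product image from the description and save it to file.
    image = pipe(image_description).images[0]
    image.save("product.jpg")

    html = f"""
    <html>
      <body>
        <img src="product.jpg" width="400">
        <h1>{title}</h1>
        <p>{copy_text}</p>
      </body>
    </html>
    """
    with open("index.html", "w") as f:
        f.write(html)
    display(HTML(html))
    return html
```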
Now, as well as modifying 00:12:31.440 |
our actual Python function, we also need to modify the function description that we're passing 00:12:37.600 |
to GPT-4 or 3.5. So in this, we just need to add an extra parameter into our properties. And that 00:12:47.200 |
is going to be the image description parameter. Again, it's a string. And what we do is just give 00:12:52.720 |
a description of what we want. So I want a concise description of the product image using descriptive 00:12:58.160 |
language, but also no more than two sentences long. Okay. So I actually added this in later 00:13:05.040 |
because if we don't add that in, it's going to give us a very long piece of text to describe 00:13:09.840 |
the image, which is fine, but there is a token limit in the CLIP model that is used by Stable 00:13:15.440 |
Diffusion of 77 tokens, which is relatively short. So it meant that things were being truncated and 00:13:22.880 |
we were not getting all of the detail that we actually wanted within the image prompt. 00:13:30.080 |
So I added that little no more than two sentences long into the end there and it works well. 00:13:37.280 |
And then we just add the image description to our list of required parameters. And yeah, that's it. 00:13:44.880 |
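The updated description just gains one extra property and one extra required entry, roughly like so:

```python
page_builder_function = {
    "name": "page_builder",
    "description": "Creates product web pages",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {
                "type": "string",
                "description": "Name of the product",
            },
            "copy_text": {
                "type": "string",
                "description": "Marketing copy that describes and sells the product",
            },
            "image_description": {
                "type": "string",
                "description": (
                    "Concise description of the product image using descriptive "
                    "language, no more than two sentences long"
                ),
            },
        },
        "required": ["title", "copy_text", "image_description"],
    },
}
```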
So with this, we just run this whole logic again. The only difference here 00:13:52.640 |
is I've wrapped it all into a single function. So we make our request to OpenAI, we pass in our 00:13:58.160 |
page builder function description. And what it's going to do is we're going to check for that 00:14:04.240 |
finish reason. If it's function call, then we extract the name and the arguments. If the name 00:14:10.080 |
is equal to page builder, then we're going to call the page builder function with those arguments. 00:14:16.640 |
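Wrapped into a single function, that request-and-dispatch flow might look like this, reusing the pieces defined above; the name `build_page` is mine, not necessarily the notebook's:

```python
import json
import openai

def build_page(prompt):
    """Ask the model for page content and, if it requests our function, build the page."""
    response = openai.ChatCompletion.create(
        model="gpt-4-0613",
        messages=[{"role": "user", "content": prompt}],
        functions=[page_builder_function],
    )
    choice = response["choices"][0]
    if choice["finish_reason"] == "function_call":
        call = choice["message"]["function_call"]
        if call["name"] == "page_builder":
            arguments = json.loads(call["arguments"])
            return page_builder(**arguments)
    return None

html = build_page("Create a web page for a new cutting edge mango cutting machine.")
```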
And then from there, we return the HTML of that page. And we get this. So this is actually just 00:14:25.760 |
a loading bar. It just doesn't display properly. But below that, we have our HTML here. We have 00:14:32.560 |
the title, the copy text, and then we have the image here. I couldn't find a good way of showing 00:14:39.360 |
the image from Colab. So what you actually just need to do is you go to your files over here and 00:14:46.240 |
download the index.html and product.jpg, and then just open it locally, and I will show you. 00:14:54.640 |
So if I open the HTML file locally, we'll get this. So we get this, I don't know, mango 00:15:01.280 |
cutting machine-like thing in the background here. We get our title and our copy there as well. 00:15:09.440 |
Which I think is kind of cool. It was super quick to put that together. And it's decent. 00:15:16.080 |
So yeah, that's it for this video. I just wanted to have a quick look at this new feature from 00:15:22.560 |
OpenAI, the function calling. Seems pretty interesting. And I suppose, realistically, 00:15:30.560 |
we're already able to do this with LangChain. But I think having it within the 00:15:37.920 |
chat completions endpoint is pretty useful for when we just want to kind of keep things 00:15:44.400 |
simple and lighter in our code. So I think that will be pretty useful. And from what I've seen, 00:15:53.200 |
it seems to work really well too. So I think it's pretty interesting. 00:15:59.920 |
Now, that's it for this video. I hope all this has been useful and interesting. So thank you 00:16:06.800 |
very much for watching and I will see you again in the next one. Bye.