
NEW GPT-4 Function Calling Model!


Chapters

0:00 New GPT-4 and ChatGPT Functions
1:13 Function Calling in OpenAI Docs
1:43 Implementation in Python
3:25 Creating a Function for GPT 4
4:37 Function Calling Instructions
6:11 Calling ChatCompletion with Functions
7:41 Using GPT 4 to Run Functions
9:25 Creating Image Generation Pipeline
11:02 Adding Image Generation to Function
13:43 Creating Product Pages with Images
15:16 Final Thoughts on OpenAI Function Calling

Whisper Transcript

00:00:00.000 | OpenAI have just released new functionality for two new models.
00:00:05.840 | Those models are the updated GPT-4 and GPT-3.5 models.
00:00:11.920 | This new feature is called function calling and it essentially allows us to pass to GPT-4
00:00:19.600 | and GPT-3.5 a description of a Python function and based on that description,
00:00:27.600 | ask the model a question and if the model sees that this question needs to use one of the
00:00:34.080 | function descriptions that you've passed to it, it will go ahead and return you the parameters
00:00:40.080 | that you need in order to use that function in the way that you have described to GPT-4, GPT-3.5.
00:00:48.480 | So it is essentially the tool usage that you would get from LangChain but directly within
00:00:57.280 | the OpenAI API and it seems just from some initial testing to work pretty well.
00:01:04.400 | So let's have a quick look at what this is and also dive into the code and how we can
00:01:11.280 | actually implement this ourselves. So here we can see that describing what this function
00:01:17.200 | calling is essentially what I just said, right? So we can describe functions to GPT-3.5, GPT-4
00:01:22.880 | and the model will choose whether it should output a JSON object containing arguments to
00:01:29.360 | call those functions and then this is an important thing here, so the chat completions API does not
00:01:34.800 | call the function, it just provides you with those parameters in JSON format.
00:01:41.760 | So let's go ahead and actually see how this works. So I'm going to be using Colab, we don't need all
00:01:51.440 | of these prerequisite libraries just to understand how this works but I wanted to go with a kind of
00:01:56.400 | interesting example. So later on we are going to be using an image generation model from Hugging
00:02:01.760 | Face diffusers. So for that component of this notebook, again you don't need to get to that
00:02:09.360 | point in order to understand this, but for that component you do need basically all of these
00:02:16.560 | prerequisites or libraries installed and you should also make sure that you are using a CUDA-enabled
00:02:22.240 | GPU because otherwise it's going to take a very long time. So you can switch to a GPU runtime
00:02:31.280 | within Colab by going to runtime at the top, going to your change runtime type and making sure you
00:02:39.920 | have GPU in here and you can use the T4 GPU on standard which I believe is available on the
00:02:47.040 | free version of Colab as well. So let's go ahead and run this. So one thing that we will need here
00:03:02.240 | and will need to pay for, unless you are within the free usage of OpenAI, is an OpenAI API key. So
00:03:02.240 | you can get one of those over at platform.openai.com. There'll be a link to that at the top of the video
00:03:08.400 | right now. Yep, once you have that all you need to do is just enter your API key in here, run this
00:03:15.600 | and you should see this and this will tell us that you are authenticated to use the OpenAI API.
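As a minimal sketch of that authentication cell (assuming the pre-v1 `openai` Python client used throughout this notebook; the key value is a placeholder):

```python
import openai

openai.api_key = "sk-..."  # placeholder: paste your key from platform.openai.com

# One cheap way to confirm the key works: list the models available to you.
models = openai.Model.list()
print(len(models["data"]), "models available")
```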
00:03:24.080 | Okay, it looks good. So what we're going to do is first create a function that we will be using
00:03:29.040 | GPT-4 to call. Also, if you still don't have access to GPT-4, that's fine;
00:03:35.680 | I'll show you where you can switch this out for GPT-3.5 as well, and you should get pretty similar
00:03:42.240 | results. Okay, so this function that we're going to create is a page builder for a product. So
00:03:48.960 | imagine you want to generate a ton of product pages. You're going to use this function here
00:03:57.600 | and what you're going to want to do is ask GPT-4 or 3.5 to give you a title and some copy text. So
00:04:08.240 | like the marketing text for that product and we're just going to take that into our HTML code here.
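A minimal sketch of that function, including the write-to-file and notebook preview steps mentioned next, might look like this (the exact HTML template is an approximation, not the video's verbatim code):

```python
from IPython.display import HTML, display

def page_builder(title: str, copy_text: str) -> str:
    # Very small HTML template: a heading for the product and its marketing copy.
    html = f"""
    <html>
      <body>
        <h1>{title}</h1>
        <p>{copy_text}</p>
      </body>
    </html>
    """
    # Write the page to file and preview it directly in the notebook.
    with open("index.html", "w") as f:
        f.write(html)
    display(HTML(html))
    return html
```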
00:04:14.560 | We're going to write that to file and then just for us in the notebook, we're going to view that
00:04:20.160 | file as well. So we run that, run it again and you should see in a moment we get this. Okay,
00:04:27.200 | so I've just manually put in these two parameters here. So the title of the product and the copy
00:04:34.080 | text for that product. Okay, now what we want to do is give GPT-4, GPT-3.5 instructions on how to
00:04:45.120 | use this function. Okay, so to do that we need these three items at a high level. So we have the
00:04:54.160 | name of this function that GPT will use. It's going to be called page builder. It doesn't necessarily
00:05:01.920 | need to align with the function name here. You can put anything you want in there,
00:05:06.720 | but basically you'll be using this to identify which function GPT wants you to use.
00:05:13.920 | We need a short description, so creates product web pages. You can make that longer or shorter
00:05:21.680 | depending on what you need and then parameters. So this is probably the most important part.
00:05:26.720 | So these describe the inputs into your function, right? So those actual inputs are within the
00:05:38.080 | properties here. Okay, and we have two. We have title and copy text, right? So both of these are
00:05:45.520 | strings and I'm just describing what they actually are. Okay, so the name of the product and then
00:05:50.560 | here marketing copy that describes and sells the product. That's all it is. And then after that,
00:05:57.920 | so within parameters, we say what is required and that is all of the parameters. So title and
00:06:04.320 | copy text. Okay, so this is basically the description on how GPT can use our function.
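Putting those three items together, the description passed to the API looks roughly like this (parameter names mirror the ones just described; treat the exact wording of the descriptions as an approximation):

```python
functions = [
    {
        "name": "page_builder",
        "description": "Creates product web pages",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {
                    "type": "string",
                    "description": "Name of the product",
                },
                "copy_text": {
                    "type": "string",
                    "description": "Marketing copy that describes and sells the product",
                },
            },
            "required": ["title", "copy_text"],
        },
    }
]
```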
00:06:10.800 | So with that, what we're going to do is go to the chat completion endpoint. We're going to create
00:06:17.920 | our prompt. So create web page for a new cutting edge mango cutting machine. That's what I want
00:06:23.120 | GPT to create for us. And then as usual, we use ChatCompletion create. So these lines I've
00:06:32.640 | highlighted here, we would always run those. Okay. The only difference now is, one, we're changing the
00:06:39.600 | model. So we're using this gpt-4-0613; if, again, you don't have access to GPT-4 yet, you should use
00:06:47.200 | this model here, gpt-3.5-turbo-0613. And then the other difference, the difference that allows
00:06:56.560 | us to use this function calling feature is we add functions and then we pass in a list of functions
00:07:03.120 | that we would like GPT to be able to use. And when I say functions, I mean, these kind of function
00:07:10.560 | descriptor items. Right. So we, let's run that and this. Okay. And in the response from GPT 4.0.6.13,
00:07:19.200 | we can see we get all of this. So we have, what do we have here? So basically we can see in here,
00:07:27.040 | we have the message, we have function call. Okay. So this is the function call that GPT is creating
00:07:35.760 | for us. And we can also see that we have this new finish reason. So programmatically, how are we
00:07:42.400 | going to identify that GPT-4 wants us to call our function? Well, we're going to come down to here.
00:07:49.360 | We're going to say, okay, if the finish reason is equal to function call, that means we need to call
00:07:54.240 | a function. Okay. So we run that and okay. Yes, we should call a function. And then based on that,
00:08:00.880 | if it says we should call a function, we need to extract the name of that function. So if we have
00:08:06.800 | multiple functions, this will be useful. So we know which function to use. And we also need to
00:08:11.760 | extract the arguments that we should pass to that function. Okay. So we do that. Also note that this
00:08:19.600 | is actually a string. So we convert it into a dictionary. Okay. And we get this. Right.
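Put together, the request and the parsing steps described here look roughly like this (a sketch assuming the pre-v1 `openai` client and the `functions` list defined above):

```python
import json
import openai

prompt = "Create a web page for a new cutting edge mango cutting machine"

res = openai.ChatCompletion.create(
    model="gpt-4-0613",  # or "gpt-3.5-turbo-0613" if you don't have GPT-4 access
    messages=[{"role": "user", "content": prompt}],
    functions=functions,  # the function description list from above
)

choice = res["choices"][0]

# The model signals that we should call a function via the finish reason.
if choice["finish_reason"] == "function_call":
    fn_name = choice["message"]["function_call"]["name"]
    # The arguments come back as a JSON string, so parse them into a dict.
    arguments = json.loads(choice["message"]["function_call"]["arguments"])
    print(fn_name, arguments)
```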
00:08:27.360 | So with that, we can use the keyword arguments syntax here, and we can just pass all those
00:08:34.640 | straight into our page builder function. Okay. We run that and we get this. So the Mango Master,
00:08:41.760 | and we get some copy that is definitely better than anything I could write myself. So that is
00:08:48.640 | pretty cool. But I think we can probably do better, especially with all these other models that are
00:08:55.920 | available to us. Right. So this is, yeah, it's a nice text, but I don't know of any product pages
00:09:02.720 | where it's just text. There's always images as well. So can we also get this GPT-4 or GPT-3.5
00:09:13.040 | function to also generate an image for us? And yes, we can, just not with GPT-4 or GPT-3.5. We
00:09:20.960 | need to use a different model. So we're going to go ahead and do that. So we're going to use
00:09:26.480 | Stability AI's Stable Diffusion model. So this is Stable Diffusion 2.1. And this is just how
00:09:35.840 | we initialize it. Okay. So we're using Hugging Face Diffusers here. Again, this is actually
00:09:41.040 | where we need to use that CUDA-enabled GPU. You can use CPU; it's just going to take a long
00:09:45.680 | time for it to generate the image for you. And all of this that I've highlighted
00:09:54.480 | is what we need in order to initialize the model or the diffusion pipeline.
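For reference, initializing Stable Diffusion 2.1 with diffusers and running the quick test described just after looks roughly like this (a sketch of standard diffusers usage, not necessarily the exact cell in the notebook):

```python
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

model_id = "stabilityai/stable-diffusion-2-1"

# Load the pipeline in half precision and move it onto the CUDA GPU.
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

# Quick test: generate and display one image.
prompt = "a photo of an astronaut riding a horse on Mars"
image = pipe(prompt).images[0]
image  # in Colab, leaving the image as the last expression displays it
```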
00:10:00.320 | Then here, I'm just running a test. Okay. So my test prompt is a photo of an astronaut riding a
00:10:08.320 | horse on Mars. And then we're going to generate that image with this. And I'm going to display
00:10:15.360 | that image to make sure it is actually working. So that will take, the first time we run this,
00:10:22.800 | it's going to take a long time to just download the model. So you'll just have to wait a moment
00:10:27.760 | for that. Okay. So I've just switched across to this notebook, which I've already pre-run.
00:10:32.960 | But continuing from where we just were, after loading everything, you can just run the same
00:10:41.120 | thing and it will be very quick. So this is 20 seconds on the T4 GPU. So if you have something
00:10:50.400 | bigger, like an A100, it can also go faster. So this is what you get. So it's pretty decent. A
00:10:57.760 | photo of an astronaut riding a horse on Mars works relatively well. Now, the thing that we need to do
00:11:06.000 | now is integrate this image generation logic into the function that we were giving to GPT-4 before.
00:11:15.200 | So our page builder function. So that's actually very simple. All we're going to do is we're going
00:11:20.640 | to provide an image description. So this is going to be the prompt to generate the image.
00:11:24.800 | And we're going to take that. We're going to pass it into the pipe that will generate the image,
00:11:30.640 | we extract it, and then we're going to save it to file. And then we're going to use this image on
00:11:36.400 | file as the image in our product web page. Okay. So that means we're going to have some slight
00:11:45.520 | differences in the HTML. So I think the actual HTML here is we just add an image and a source.
00:11:53.680 | We don't need to make this dynamic because we're just going to save the image file. So it's going
00:11:58.560 | to be product.jpg all the time. And yeah, I mean, that's basically it. We just have some CSS to make
00:12:06.320 | sure things are styled kind of correctly. And that's all we need.
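A sketch of the modified function, assuming the `pipe` object from the diffusers section above (again, the HTML and CSS here are an approximation rather than the video's exact code):

```python
def page_builder(title: str, copy_text: str, image_description: str) -> str:
    # Generate the product image from the description and save it under a fixed
    # filename, so the HTML can always point at product.jpg.
    image = pipe(image_description).images[0]
    image.save("product.jpg")

    html = f"""
    <html>
      <body style="font-family: sans-serif; max-width: 40em; margin: auto;">
        <img src="product.jpg" style="width: 100%;">
        <h1>{title}</h1>
        <p>{copy_text}</p>
      </body>
    </html>
    """
    with open("index.html", "w") as f:
        f.write(html)
    return html
```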
00:12:15.600 | So from here, as before, we're going to save our index.html to file. And that is our function modified. We didn't
00:12:25.440 | really change much; we just added that image generation step. Now, as well as modifying
00:12:31.440 | our actual Python function, we also need to modify the function description that we're passing
00:12:37.600 | to GPT-4 or GPT-3.5. So in this, we just need to add an extra parameter into our properties. And that
00:12:47.200 | is going to be the image description parameter. Again, it's a string. And what we do is just give
00:12:52.720 | a description of what we want. So I want a concise description of the product image using descriptive
00:12:58.160 | language, but also no more than two sentences long. Okay. So I actually added this in later
00:13:05.040 | because if we don't add that in, it's going to give us a very long piece of text to describe
00:13:09.840 | the image, which is fine, but there is a token limit in the CLIP model that is used by Stable
00:13:15.440 | Diffusion of 77 tokens, which is relatively short. So it means that things have been truncated and
00:13:22.880 | we were not getting all of the detail that we actually wanted within the image prompt.
00:13:30.080 | So I added that little no more than two sentences long into the end there and it works well.
00:13:37.280 | And then we just add the image description to our list of required parameters. And yeah, that's it.
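The updated function description then gains one more property and one more required field, roughly like so (wording of the descriptions approximated from the video):

```python
functions = [
    {
        "name": "page_builder",
        "description": "Creates product web pages",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string", "description": "Name of the product"},
                "copy_text": {
                    "type": "string",
                    "description": "Marketing copy that describes and sells the product",
                },
                "image_description": {
                    "type": "string",
                    "description": (
                        "Concise description of the product image using descriptive "
                        "language, no more than two sentences long"
                    ),
                },
            },
            "required": ["title", "copy_text", "image_description"],
        },
    }
]
```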
00:13:44.880 | So we're just going to, with this, we just run this whole logic again. The only difference here
00:13:52.640 | is I've wrapped it all into a single function. So we make our request to OpenAI, we pass in our
00:13:58.160 | page builder function description. And what it's going to do is we're going to check for that
00:14:04.240 | finish reason. If it's function call, then we extract the name and the arguments. If the name
00:14:10.080 | is equal to page builder, then we're going to call the page builder function with those arguments.
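Wrapped up, that whole flow might look something like this (a sketch; the wrapper's name and the fallback behaviour are assumptions, not taken from the notebook):

```python
import json
import openai

def build_page(prompt: str) -> str:
    """Ask GPT for page_builder arguments, run the function, and return the HTML."""
    res = openai.ChatCompletion.create(
        model="gpt-4-0613",
        messages=[{"role": "user", "content": prompt}],
        functions=functions,
    )
    choice = res["choices"][0]
    if choice["finish_reason"] == "function_call":
        name = choice["message"]["function_call"]["name"]
        arguments = json.loads(choice["message"]["function_call"]["arguments"])
        if name == "page_builder":
            return page_builder(**arguments)
    # If the model didn't ask for a function, just return its text response.
    return choice["message"]["content"]

html = build_page("Create a web page for a new cutting edge mango cutting machine")
```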
00:14:16.640 | And then from there, we return the HTML of that page. And we get this. So this is actually just
00:14:25.760 | a loading bar. It just doesn't display properly. But below that, we have our HTML here. We have
00:14:32.560 | the title, the copy text, and then we have the image here. I couldn't find a good way of showing
00:14:39.360 | the image from Colab. So what you actually just need to do is you go to your files over here and
00:14:46.240 | download the index.html and product.jpg, and then just open it locally, and I will show you.
00:14:54.640 | So if I open this, the HTML file locally, we'll get this. So we get this, I don't know, mango
00:15:01.280 | cutting machine-like thing in the background here. We get our title and our copy there as well.
00:15:09.440 | Which I think is kind of cool. It was super quick to put that together. And it's decent.
00:15:16.080 | So yeah, that's it for this video. I just wanted to have a quick look at this new feature from
00:15:22.560 | OpenAI, the function calling. Seems pretty interesting. And I suppose, realistically,
00:15:30.560 | we're already able to do this with LangChain. But I think having it within the
00:15:37.920 | chat completion endpoint is pretty useful for when we just want to kind of keep things
00:15:44.400 | simple and lighter in our code. So I think that will be pretty useful. And from what I've seen,
00:15:53.200 | it seems to work really well as well. So I think this is, I think it's pretty interesting.
00:15:59.920 | Now, that's it for this video. I hope all this has been useful and interesting. So thank you
00:16:06.800 | very much for watching and I will see you again in the next one. Bye.
00:16:10.560 | [MUSIC]