OpenAI have just released a new feature alongside two updated models, GPT-4 and GPT-3.5. The feature is called function calling, and it essentially allows us to pass GPT-4 or GPT-3.5 a description of a Python function. Then, when we ask the model a question, if the model decides that the question needs one of the functions we've described, it will return the parameters we need in order to call that function in the way we described to GPT-4 or GPT-3.5.
So it is essentially the tool usage that you would get from LangChain, but directly within the OpenAI API, and from some initial testing it seems to work pretty well. So let's have a quick look at what this is, and then dive into the code and see how we can actually implement it ourselves.
So here we can see OpenAI describing what this function calling is, and it's essentially what I just said: we can describe functions to GPT-3.5 or GPT-4, and the model will choose whether it should output a JSON object containing arguments to call those functions. And this is the important part: the chat completions API does not call the function itself, it just provides you with those parameters in JSON format.
So let's go ahead and actually see how this works. I'm going to be using Colab. We don't need all of these prerequisite libraries just to understand function calling, but I wanted to go with a more interesting example, so later on we're going to be using an image generation model from Hugging Face Diffusers.
For that component of the notebook, and again you don't need to get to that point to understand function calling, you do need all of these libraries installed, and you should also make sure that you are using a CUDA-enabled GPU, because otherwise it's going to take a very long time.
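As a rough sketch, the installs look something like this; the exact package list and versions are my assumption and may differ from the notebook, and the pre-v1 openai SDK is assumed to match the ChatCompletion syntax used later on:

```python
# Assumed install list; openai is pinned below v1 to match the
# openai.ChatCompletion syntax used throughout this walkthrough.
!pip install -qU "openai<1" diffusers transformers accelerate
```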
You can switch to a GPU runtime within Colab by going to Runtime at the top, clicking Change runtime type, and making sure you have GPU selected in here. You can use the standard T4 GPU, which I believe is available on the free version of Colab as well.
So let's go ahead and run this. One thing that we will need here, and will need to pay for unless you are within the free usage tier of OpenAI, is an OpenAI API key. You can get one of those over at platform.openai.com; there'll be a link to that at the top of the video right now.
Yep, once you have that, all you need to do is enter your API key in here and run this, and you should see a message telling you that you are authenticated to use the OpenAI API.
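A minimal sketch of that authentication step, assuming the pre-v1 openai SDK; the model-list call is just one cheap way to confirm the key works:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # from platform.openai.com

# Listing the available models is a cheap way to confirm the key works.
models = openai.Model.list()
print(f"Authenticated: {len(models['data'])} models available")
```

Okay, it looks good. So what we're going to do first is create a function that we will be using GPT-4 to call.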
Also, if you still don't have access to GPT-4, that's fine; I'll show you where you can switch this out for GPT-3.5, and you should get pretty similar results. Okay, so this function that we're going to create is a page builder for a product. Imagine you want to generate a ton of product pages.
You're going to use this function here, and what you're going to want to do is ask GPT-4 or 3.5 to give you a title and some copy text, i.e. the marketing text for that product. We take those into our HTML code here, write the result to file, and then, just for us in the notebook, view that file as well. A sketch of the function is below.
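Something like this; the function and parameter names (page_builder, title, copy_text) match the walkthrough, but the exact HTML template is a simplified stand-in:

```python
from IPython.display import HTML, display

def page_builder(title: str, copy_text: str) -> str:
    # Slot the generated title and marketing copy into a simple template.
    html = f"""
    <html>
    <body>
        <h1>{title}</h1>
        <p>{copy_text}</p>
    </body>
    </html>"""
    # Write the page to file...
    with open("index.html", "w") as f:
        f.write(html)
    # ...and render it inline so we can view it in the notebook.
    display(HTML(html))
    return html
```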
So we run that, run it again, and in a moment you should see we get this. Okay, so I've just manually put in these two parameters here: the title of the product and the copy text for that product. Now what we want to do is give GPT-4 or GPT-3.5 instructions on how to use this function.
Okay, so to do that we need these three items at a high level. First, the name of the function that GPT will use; it's going to be called page_builder. It doesn't necessarily need to match the Python function name here, you can put anything you want in there, but you'll be using it to identify which function GPT wants you to use.
We need a short description, "creates product web pages", which you can make longer or shorter depending on what you need. And then the parameters, which are probably the most important part. These describe the inputs to your function, and the actual inputs sit within the properties here.
Okay, and we have two: title and copy_text. Both of these are strings, and I'm just describing what they actually are; the name of the product, and marketing copy that describes and sells the product. That's all it is. Then after that, within parameters, we say what is required, and that is all of the parameters, title and copy_text.
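Assembled as the JSON-schema-style dictionary that the functions argument expects, the description looks roughly like this:

```python
functions = [
    {
        # The name GPT will use to identify the function; it doesn't
        # have to match the Python function name.
        "name": "page_builder",
        "description": "Creates product web pages",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {
                    "type": "string",
                    "description": "Name of the product",
                },
                "copy_text": {
                    "type": "string",
                    "description": "Marketing copy that describes and sells the product",
                },
            },
            # Both parameters are required.
            "required": ["title", "copy_text"],
        },
    }
]
```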
Okay, so this is basically the description of how GPT can use our function. With that, what we're going to do is go to the chat completion endpoint. We're going to create our prompt: create a web page for a new cutting-edge mango cutting machine.
That's what I want GPT to create for us. And then, as usual, we use ChatCompletion.create. The lines I've highlighted here we would always run. The only difference now is, one, we're changing the model: we're using gpt-4-0613. If you don't have access to GPT-4 yet, you should use this model here,
gpt-3.5-turbo-0613. And then the other difference, the one that allows us to use this function calling feature, is that we add functions and pass in a list of the functions we would like GPT to be able to use. And when I say functions, I mean those function descriptor items.
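As a sketch, assuming the functions list defined above and the pre-v1 SDK:

```python
prompt = "Create a web page for a new cutting-edge mango cutting machine"

res = openai.ChatCompletion.create(
    model="gpt-4-0613",  # or "gpt-3.5-turbo-0613" without GPT-4 access
    messages=[{"role": "user", "content": prompt}],
    functions=functions,  # the function descriptions defined above
)
```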
Right, so let's run that. Okay, and in the response from gpt-4-0613 we can see we get all of this. So what do we have here? Within the message we have function_call; this is the function call that GPT is creating for us.
And we can also see that we have this new finish reason. So programmatically, how are we going to identify that GPT-4 wants us to call our function? We're going to come down to here and say: if the finish reason is equal to function_call, that means we need to call a function. So we run that and, okay, yes, we should call a function. Based on that, we need to extract the name of the function; if we have multiple functions, this is how we know which one to use.
We also need to extract the arguments that we should pass to that function. So we do that, and note that the arguments are actually a string, so we convert them into a dictionary, and we get this. With that, we can use the keyword arguments syntax and pass all of those straight into our page_builder function.
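In code, that whole check looks something like this, assuming the res response object from above:

```python
import json

choice = res["choices"][0]

# The model signals that it wants a function called via the finish reason.
if choice["finish_reason"] == "function_call":
    # Which of our functions does the model want us to call?
    name = choice["message"]["function_call"]["name"]
    # The arguments arrive as a JSON string, so parse them into a dict.
    kwargs = json.loads(choice["message"]["function_call"]["arguments"])
    if name == "page_builder":
        # Unpack the generated arguments straight into the function.
        html = page_builder(**kwargs)
```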
Okay, we run that and we get this: the Mango Master, with some copy that is definitely better than anything I could write myself. So that is pretty cool. But I think we can probably do better, especially with all these other models that are available to us.
Right, so it's nice text, but I don't know of any product pages that are just text; there are always images as well. So can we get this GPT-4 or GPT-3.5 function to also generate an image for us? Yes, we can, just not with GPT-4 or GPT-3.5 themselves.
We need to use a different model for that. So we're going to use Stability AI's Stable Diffusion model, specifically Stable Diffusion 2.1, and this is how we initialize it. We're using Hugging Face Diffusers here, and this is the part where we actually need that CUDA-enabled GPU. You can use a CPU, it's just going to take a long time to generate the image. Everything I've highlighted is what we need in order to initialize the model, or rather the diffusion pipeline. Then here, I'm just running a test.
Okay, so my test prompt is a photo of an astronaut riding a horse on Mars. We generate that image with this, and I display it to make sure it's actually working. The first time we run this, it's going to take a long time just to download the model. Put together, the setup looks roughly like the sketch below.
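This is one reasonable configuration, assuming a CUDA GPU; the float16 dtype is my assumption rather than something confirmed in the video:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load Stable Diffusion 2.1 and move it onto the GPU.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",
    torch_dtype=torch.float16,  # assumed; full precision also works
)
pipe = pipe.to("cuda")

# Quick test that image generation works end to end.
image = pipe("a photo of an astronaut riding a horse on mars").images[0]
image  # displays inline in Colab
```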
So you'll just have to wait a moment for that. Okay, I've just switched across to this notebook, which I've already pre-run. Continuing from where we were, after loading everything you can run the same thing again and it will be much quicker: about 20 seconds on the T4 GPU. If you have something bigger, like an A100, it will go faster still. And this is what you get. It's pretty decent; a photo of an astronaut riding a horse on Mars works relatively well. Now, what we need to do is integrate this image generation logic into the function that we were giving to GPT-4 before,
our page_builder function. That's actually very simple. All we're going to do is provide an image description, which will be the prompt used to generate the image. We pass it into the pipeline that generates the image, extract the result, and save it to file. Then we use that saved image as the image in our product web page. That means some slight differences in the HTML: we just add an image with a source. We don't need to make this dynamic, because we're always going to save the image to the same file.
So it's going to be product.jpg all the time. And that's basically it; we just have some CSS to make sure things are styled correctly. From there, just as before, we save our index.html to file. The modified function looks something like the sketch below.
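Again the HTML template is a simplified stand-in; the new piece is the image generation step at the top:

```python
def page_builder(title: str, copy_text: str, image_description: str) -> str:
    # Generate the product image from the description and save it to the
    # fixed filename that the HTML below points at.
    image = pipe(image_description).images[0]
    image.save("product.jpg")

    html = f"""
    <html>
    <body>
        <h1>{title}</h1>
        <img src="product.jpg">
        <p>{copy_text}</p>
    </body>
    </html>"""
    with open("index.html", "w") as f:
        f.write(html)
    return html
```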
And that is our function modified; we didn't really change much, we just added that image generation step. Now, as well as modifying our actual Python function, we also need to modify the function description that we're passing to GPT-4 or 3.5. Here, we just need to add an extra parameter into our properties, and that is going to be the image_description parameter. Again, it's a string, and we just give a description of what we want: a concise description of the product image using descriptive language, no more than two sentences long. I actually added that length limit in later, because without it the model gives a very long piece of text to describe the image. That would be fine, except the CLIP model used by Stable Diffusion has a token limit of 77 tokens, which is relatively short.
So prompts were being truncated and we were not getting all of the detail that we actually wanted within the image prompt. I added that little "no more than two sentences long" at the end there, and it works well. Then we just add image_description to our list of required parameters.
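Expressed against the functions list from earlier, the change is just this:

```python
# Add the new parameter to the existing function description...
functions[0]["parameters"]["properties"]["image_description"] = {
    "type": "string",
    "description": (
        "Concise description of the product image using descriptive "
        "language, no more than two sentences long"
    ),
}
# ...and mark it as required alongside title and copy_text.
functions[0]["parameters"]["required"].append("image_description")
```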
And that's it. With this, we just run the whole logic again. The only difference here is that I've wrapped it all into a single function: we make our request to OpenAI, passing in our page_builder function description, and then we check for that finish reason.
If it's function_call, we extract the name and the arguments; if the name is equal to page_builder, we call the page_builder function with those arguments, and from there we return the HTML of that page.
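As a sketch, reusing the pieces from earlier (the wrapper name generate_page is mine, not from the video):

```python
import json

def generate_page(prompt: str) -> str:
    res = openai.ChatCompletion.create(
        model="gpt-4-0613",
        messages=[{"role": "user", "content": prompt}],
        functions=functions,
    )
    choice = res["choices"][0]
    # Only act if the model decided a function should be called.
    if choice["finish_reason"] == "function_call":
        call = choice["message"]["function_call"]
        kwargs = json.loads(call["arguments"])
        if call["name"] == "page_builder":
            return page_builder(**kwargs)
    # Otherwise fall back to the model's plain text reply.
    return choice["message"]["content"]
```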
Running it, we get what is actually just a loading bar that doesn't display properly, and below that we have our HTML: the title, the copy text, and then the image. I couldn't find a good way of showing the image from within Colab, so what you need to do is go to your files over here, download index.html and product.jpg, and then open them locally, and I will show you. So if I open the HTML file locally, we get this: some mango-cutting-machine-like thing in the background, with our title and our copy there as well. Which I think is kind of cool; it was super quick to put that together.
And it's decent. So yeah, that's it for this video. I just wanted to have a quick look at this new function calling feature from OpenAI, which seems pretty interesting. Realistically, we were already able to do this with LangChain, but I think having it directly within the chat completion endpoint is pretty useful for when we just want to keep things simple and lighter in our code. And from what I've seen, it seems to work really well too. I hope all of this has been useful and interesting. Thank you very much for watching, and I will see you again in the next one.
Bye.