
ChatGPT Plugins: Build Your Own in Python!


Chapters

0:00 ChatGPT Plugins
0:53 Plugins or LLM agents?
2:46 First look at ChatGPT plugins
5:02 Using the chatgpt-retrieval-plugin
6:45 How the plugin works
12:06 Deploying the plugin with DigitalOcean
15:15 ChatGPT retrieval plugin environment variables
18:42 Adding LangChain docs to ChatGPT plugin
26:02 Querying the ChatGPT retrieval plugin
27:46 Adding the plugin to ChatGPT
28:52 Setup for ChatGPT plugin manifest file
32:47 Install unverified plugins on ChatGPT
33:41 Handling OpenAPI spec error
37:04 Asking ChatGPT plugin questions
39:45 Final thoughts on ChatGPT plugins

Transcript

OpenAI has just announced the release of custom plugins for ChatGPT. Now, what does that mean exactly? Well, we've come over to ChatGPT, and we can see at the top here we have our model. We can click on plugins, and you can see that we now have this extra little dropdown for plugins.

And there is this LangChain Docs plugin. Now, that's the plugin that I'm going to show you how to build in this video. But I just want to show you the other plugins as well. So you come to this plugin store, and there are basically a ton of these different plugins.

These are the featured plugins, and yeah, there are quite a few there. But you can also see at the bottom you have develop your own plugin, install an unverified plugin, and so on. So you can actually develop your own. Now, what do they actually do? Well, if you've ever heard of agents, then this is a very similar concept.

If not, no worries. I haven't really spoken about it before, so we'll just do a quick 101 on what an agent is. An agent, in the context of large language models, is kind of like a tool. So imagine, as a person, I have a complex math problem in front of me. I'm not just going to rely solely on my brain, which would kind of be the equivalent of a large language model.

In this case, I am probably going to get out the calculator on my phone and type in the calculation; I'm using a tool. Agents are kind of like the same thing, but for large language models, and they can perform very similar functions. So you can get calculator agents.

Large language models are typically not that good at performing complex calculations. So what you can do is tell the large language model: if you need to do a calculation, I want you to use this calculator tool. And to use this calculator tool, you need to write some text, which is going to be something like three plus four to the power of 4.1.

We will take that text that you wrote, we will give it to the calculator, and it will respond back to you with the calculated answer. And then the large language model can relay that back to the user. So it's literally a large language model using a tool.
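
To make that concrete, here is a minimal sketch of the tool-use pattern just described. It is purely illustrative: ChatGPT's internal mechanism isn't public, and the CALCULATOR: convention below is something I've made up for the example.

```python
# A toy "LLM + calculator tool" loop. The CALCULATOR: prefix is an invented
# convention for this sketch, not anything ChatGPT actually uses.

def calculator(expression: str) -> str:
    """The tool: evaluates a plain arithmetic expression like '3 + 4 ** 4.1'."""
    # eval() with no builtins is fine for a toy demo; use a real parser in production.
    return str(eval(expression, {"__builtins__": {}}))

def agent_step(llm_output: str) -> str:
    """If the model asked for the calculator, run it; otherwise pass the answer through."""
    prefix = "CALCULATOR:"
    if llm_output.startswith(prefix):
        return calculator(llm_output[len(prefix):].strip())
    return llm_output

# The model writes the calculation as text, the tool computes it:
print(agent_step("CALCULATOR: 3 + 4 ** 4.1"))  # ~297.07
```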

That is what these plugins in ChatGPT seem to be. I don't know exactly how it's working, because it's OpenAI. As far as I'm aware, there isn't any documentation on how this works, at least not yet. Maybe there will be some coming soon.

I'm not sure. But nonetheless, it seems to work very similarly to an agent. So what can we do? I'm going to show you the LangChain Docs plugin. Let me first disable it, and I'm going to default to my typical example: how do I use the LLMChain in LangChain?

We're going to ask some more complicated questions later, but I just want to start with this one because it's relatively simple; you don't need to understand the answer in detail. Now, you can see here: as of my knowledge cutoff date, September 2021, LangChain is not a specific software library or package that I'm aware of.

I'm not familiar with the term LLMChain, so on and so on. Basically, it's telling you: I have no idea what you're talking about. Okay, fine. Because LangChain is a new library, I'm not surprised by that whatsoever. So let's start a new chat and enable the LangChain Docs plugin.

And let's ask the same question again: what is the LLMChain in LangChain? All right, let's see what happens now. Okay, so it says using LangChain Docs. LangChain is a Python library that facilitates the development of applications using large language models. How cool is that? Using LLMs in isolation may not be sufficient for creating powerful applications, so on and so on.

It introduces the concept of chains, chaining LLMs together, and so on. So it's describing what LLMs are, what these chains are in LangChain, and so on. And then it also gives you some links to documentation, so I can try and click this. All right, cool.

So I get directed straight to that. And we can also see the LLMs documentation; straight away, we get these sorts of things. And if I go on to the next page from the previous one, we actually get this: query an LLM with the LLMChain. So that's pretty cool, right?

Now, how did we do that? Well, we did that with this plugin. And what I want to show you is how to actually build and implement that yourself. So let's jump straight into that. The first thing we're going to start with is getting a template plugin application.

For that, we're going to the OpenAI chatgpt-retrieval-plugin repo, which you can see here. From there, what we want to do is fork this repo, so that you have your own version of it in your own GitHub. Once you have that, you navigate to your repository, click on Code, copy the clone URL, and then head over to a terminal window.

And what we're going to do is just git clone the repo that we just saw. So: git clone chatgpt-retrieval-plugin. Once we've done that, we should see that we have chatgpt-retrieval-plugin, and we navigate into it. So here we can see the directory. Essentially, all this plugin is is a Docker container where we are deploying an API that we will allow ChatGPT to interact with.

So let's go through and try to understand what it is actually doing. Open up the directory here; I want to mainly focus on what we have inside server. So we have the upsert and the query endpoints. Now, what these do is this: let's say we have our API here.

So this is the plugin that we're working with at the moment. On this side, let's say this side is us, we have the upsert and we have the query. Those go into the API; they're the two endpoints that we're going to be interacting with. So here we have the API that we're going to be creating.

It has this upsert and this query. Behind the API, there is going to be a vector database; this is a Pinecone component. We're also going to be taking our LangChain docs, and we're going to be giving our large language model access to those LangChain docs via this whole agent thing that we have going on here.

Alright, so there are a few components here. Then we also have ChatGPT down here. Let's just figure out how they all go together: LangChain docs, Pinecone, ChatGPT, API. What we are going to do is take all of our LangChain docs and run them through a Python notebook here.

We're going to download them, and we're going to throw them into this upsert endpoint here. What that's going to do is come out on the other side and go to an OpenAI embedding model. So, how do we write that?

I'm just going to put "AI Embed". And that is essentially converting the text in those LangChain docs into meaningful numerical representations of that information. They all get stored within our Pinecone vector database here. So this Pinecone vector database is now a potential source of information for ChatGPT down here.

But we still haven't made that link yet; there's no link between ChatGPT and that vector database. That's where the query part comes in. So given a particular query, like the earlier example "what is the LLMChain in LangChain", we'd ask ChatGPT.

ChatGPT is going to come up here. It's going to think about it. It's going to be like: I have no idea what that means, but I have some instructions that tell me if someone asks about LangChain, I should send that query over to this thing over here.

This query endpoint. So then our API gets that question: what is the LLMChain in LangChain? It's going to bring it over here. It's actually also going to take it into that same OpenAI embedding model, and it's going to get what we call a query vector.

So this query vector then goes to Pinecone, and Pinecone says: okay, I've seen things that are similar to this before. In the LangChain docs that we embedded before, there are some documents that seem kind of similar. They mention the LLMChain, they mention LangChain chains, and all these sorts of things.

I'm going to return those to you. So it returns those documents; what we're going to do is return five of those at a time. We then return those back through the API, through the request, and they go back to ChatGPT.

ChatGPT now has information from Pinecone; it has information from the LangChain docs. So it has its query, and it also has this extra information from up here, which we're going to call the context. And now it can actually answer the question. So then it gives that back to the user: it takes the query and context and just completes the answer based on that new context it has.

So it now can actually answer those questions. That is what we're building here. This plugin, this API, is what enables all of this: the interaction between ChatGPT and the outside world. So the first thing that we need to do is make this API accessible. We need to deploy this API.
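
For orientation, here is a heavily simplified sketch of what those two endpoints look like in the plugin's server code. The real server/main.py adds authentication, error handling, and more routes; the request models here are abbreviated.

```python
# A simplified sketch of the plugin's two main endpoints (the real
# server/main.py is more complete; field types here are abbreviated).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class UpsertRequest(BaseModel):
    documents: list  # each document has an id, text, and optional metadata

class QueryRequest(BaseModel):
    queries: list  # each query has a natural-language query string

@app.post("/upsert")
async def upsert(request: UpsertRequest):
    # embed the documents with OpenAI, then store the vectors in Pinecone
    ...

@app.post("/query")
async def query(request: QueryRequest):
    # embed each query, search Pinecone, and return the top matching chunks
    ...
```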

So let's jump straight into that: we are going to deploy it. Obviously, it depends on your own setup, but for the sake of simplicity, what we're going to do here is use DigitalOcean. For DigitalOcean, you have to sign up if you haven't used it before; otherwise, log in.

Once you have signed up, you should be able to make your way to something that looks like this. You might have to click through a few pages in order to get here, but we want to come to this page, and you can see an app that I've already created here.

That is the app that I showed you before; we're going to just recreate a new one. We come down to here, and what we want to do is create a web app. So I'm going to click on create app here, under apps. We're going to create a resource from source code.

We're going to be using GitHub for this. You will probably need to authenticate to allow DigitalOcean to access your GitHub account, and then in your repos here, you're going to need to select the repo that you forked from OpenAI. So I've just selected chatgpt-retrieval-plugin, branch main.

The source directory is just the top-level directory. We can auto-deploy as well, so anytime we make a change to the main branch, it will just be redeployed; that's useful to have. Okay, so we have this dolphin app name; I think that's just kind of like the app which will contain the plugin.

Yeah, I think that all looks good. Right, and then we have environment variables. This is where we do need to add some things, so I'll explain these as we go through. There are two specific environment variables that we need for the app in general.

These are the bearer token and the OpenAI API key; both of those are required by the app. And then there are a few more environment variables that we need, to specify that we're using Pinecone as our retrieval component, which I will explain a little more later on, and to give us access to it.

So we have the datastore, which is going to be pinecone. We're going to need a Pinecone API key, a Pinecone environment, and a Pinecone index. The Pinecone index we can name whatever we want; again, I'm going to explain this in a little more detail very soon.

So I'm just going to call mine "langchain-plugin" (Pinecone index names should stick to lowercase letters, numbers, and hyphens). Right, and then, well, not quite everything is self-explanatory. The bearer token is a secret token that is going to be used by our plugin in order to authenticate incoming requests; we're going to include the bearer token within the headers of our API requests.

The OpenAI API key is so we can encode everything with an OpenAI embedding model. The datastore just says that we'd like to use Pinecone, and again, I'll explain that in a little more detail. The Pinecone API key gives us access to Pinecone, the environment specifies the Pinecone environment that we need to use, and the index specifies the Pinecone index.
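
Putting that together, the six variables look roughly like this. The names match the chatgpt-retrieval-plugin's configuration; the values shown are placeholders:

```
DATASTORE=pinecone
BEARER_TOKEN=<your-secret-token>
OPENAI_API_KEY=<your-openai-api-key>
PINECONE_API_KEY=<your-pinecone-api-key>
PINECONE_ENVIRONMENT=<e.g. us-west1-gcp>
PINECONE_INDEX=langchain-plugin
```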

If none of that makes sense, that's fine; I'm going to explain it as we go through. For now, all I will do is show you where you need to find each one of these items. For the bearer token, as far as I know, you can just put anything you want in here, but the standard practice that a lot of people go with is to use a JSON Web Token.

So JWT, yeah, JSON Web Tokens: you just come down to here and you would put in something. You know, I'm not super familiar with this, so I don't know what the best practice is here. All I'm going to do really is make something up: I'm just going to put in something like my name, James Briggs, and some other random things.

I'm just going to see what I get for the bearer token here. There is probably a best practice for doing this, but I personally don't know it. Now, the other thing is your OpenAI API key. You need to get that from platform.openai.com: you go to your account in the top right, click view API keys, and you just get your API key from there.
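
If you'd rather mint the token locally than use a JWT web tool, here is one hedged way to do it with the PyJWT library. The plugin only checks that incoming requests carry this exact token string, so any hard-to-guess value works; the payload and secret below are made-up examples.

```python
# A hedged local alternative for generating a bearer token with PyJWT
# (pip install pyjwt). Payload and secret are arbitrary examples.
import jwt

token = jwt.encode(
    {"name": "james"},            # arbitrary payload
    "some-long-random-secret",    # signing secret
    algorithm="HS256",
)
print(token)  # use this string as the BEARER_TOKEN environment variable
```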

The other thing is the Pinecone API key; for that, you need to head over to app.pinecone.io. Obviously, you may need to create an account, but everything is free, so you shouldn't need to pay anything, at least not if you're just creating a single index. You just come over to API keys on the left.

You have your environment here, which we need, so let's just copy that now and put it there. Then we also have our API key in here, so we'll just copy that and paste it in there as well. Once you've added all of those, you just press save.

You should have six environment variables there, so we click next. You can change the region depending on where you are and what you're wanting to do. Cool. And then they'll come up with this monthly app cost; I believe this is predicted rather than exact, although I could be wrong there. One thing that you should be aware of here is that you will hopefully get some free credits.

You can use those, so you shouldn't need to pay anything, but don't quote me on that. I'm going to create that resource, and that will take some time to deploy. Our deployment is live now, so what we can do is begin adding data. Let's refer back to our earlier drawing here.

You can see the first step is literally getting the LangChain docs through this Python code to the upsert endpoint of the API, so that the data goes to Pinecone. That's what we now need to do. So we head over to Colab, and there is this notebook here.

There'll be a link to this notebook so you can follow along, but we're not going to run through the whole thing. We are first just going to install the prerequisite libraries; right now, that is just this, and there'll be another one a little bit later. And then there's this whole preparing-data stage.

I don't want to go through all of this; I've actually covered it in another video, for which there'll be a link somewhere at the top of the video right now, or you can just follow along with the notebook if you like. But there's a lot going on there, so let's skip ahead, and what we're going to do is directly load the dataset.

That's the dataset that we built, hosted on Hugging Face Datasets. To do that, we need to pip install datasets. We then load the dataset like so. And then we need to reformat this into the format that is required by the API component. So what does that format look like?

It is somewhere in this notebook here. So we're going to have these three fields: id, text, and metadata. The metadata is going to contain additional information that we might need, the text is going to contain, obviously, the text, and then we have the unique IDs here. Right, so we need to create that format.

Let's have a look at what we have at the moment. So we have id, text, and source. It's very similar: we have id and text, but we need metadata instead of source. So if we just have a look at one of those items, maybe it'll make it clearer what we need to do.

So the id, that's fine; these are unique IDs. The text, that's also fine. The problem is the source at the end here. We basically just need to create a metadata dictionary and put this source inside it. That is actually what we do here: we just rename source to url, and run that.

And now we have the correct format that we need for our API: we have id and text like we did before, but now we have metadata, and in there we have the url with the link. Okay, cool. Now we're ready to move on to that indexing part.
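
As a sketch, the loading and reformatting step looks something like this. The dataset name is a placeholder; use whichever Hugging Face dataset you built during the preparation stage.

```python
# A sketch of loading the prepared docs and moving "source" into a metadata
# dict. The dataset name below is a placeholder for the one you built.
from datasets import load_dataset

data = load_dataset("your-username/langchain-docs", split="train")

documents = [
    {
        "id": row["id"],
        "text": row["text"],
        "metadata": {"url": row["source"]},  # rename source -> metadata.url
    }
    for row in data
]
```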

Right, so because we're going to be interacting with our API, we do now need to include the bearer token. This is basically how we authenticate that we are allowed to interact with this API. So that bearer token that you created earlier, you need to add it in here.

You just replace this bearer token here with your own bearer token, or if you want to set it as an environment variable, you can use this part here. I've set that for mine, so I'm going to move on to the headers. We have the authorization here and the bearer token.

Right, so we're going to add this to our request so that the API knows we're authorized to actually use it. And then here we need to add in our endpoint URL. This is for the previous one that I created; now I need to update it to the new one, which is just here.

Right, so this dolphin app, we can open it. If I open this, we're going to come through to this page here, where it just says "detail not found". That's fine; that's kind of expected. So we copy that web address and just paste it into there. Okay, cool. So what are we doing here?

Let me walk you through that. So here we're setting up a retry strategy. To explain this in the simplest way possible: we could actually just remove all of this, pretend it isn't here, and just put requests.post like that.

And that would post everything: we'd be going through all of our documents in batches of 100 and sending them to our API. The reason we have this other part, this extra little bit here, is that every now and again we might see an error when we send our request.

And we don't want to just cancel everything if we see one error one time, because usually it's just a brief error; you send the request again and it's going to be fine. That's what we're setting up here. We're saying we want to retry the request a total of five times, and, so we don't overload things by sending request after request super quickly, we also have this backoff factor. It's basically going to wait something like a tenth of a second, then a fifth of a second, then progressively longer between attempts.

And then the status codes that we will do this for are all of these; they're internal server errors. That's all we're doing here, and then we're setting that up so that our post requests use this retry strategy. That's all that extra code is doing there.
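
Here is a sketch of that retry setup with requests and urllib3. The endpoint URL and bearer token are placeholders for your own deployment's values.

```python
# A sketch of the retry strategy described above. endpoint_url and the
# bearer token are placeholders for your own deployment.
import os
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

BEARER_TOKEN = os.environ["BEARER_TOKEN"]
endpoint_url = "https://your-app.ondigitalocean.app"

headers = {"Authorization": f"Bearer {BEARER_TOKEN}"}

# Retry up to 5 times on transient server errors, waiting a little longer
# between each attempt (backoff_factor controls that delay).
retries = Retry(
    total=5,
    backoff_factor=0.1,
    status_forcelist=[500, 502, 503, 504],
)
session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retries))
```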

So we can go ahead and run that, and this is just going to send all of our documents to the API. We've used tqdm here, so we need to import that: from tqdm.auto import tqdm. Okay, so this is going to send all those documents to our API.

Our API is going to send those to the OpenAI embedding model, which is text-embedding-ada-002, and it's going to get those embeddings. Those embeddings are then going to be stored in the Pinecone vector database, which we can refer to at a later point when we're making our queries through the ChatGPT plugin.
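
The upsert loop itself looks roughly like this, reusing the session, headers, and documents from the sketches above:

```python
# Send the documents to the /upsert endpoint in batches of 100, with a
# progress bar. Reuses session, headers, endpoint_url, and documents from above.
from tqdm.auto import tqdm

batch_size = 100
for i in tqdm(range(0, len(documents), batch_size)):
    batch = documents[i : i + batch_size]
    res = session.post(
        f"{endpoint_url}/upsert",
        headers=headers,
        json={"documents": batch},
    )
    res.raise_for_status()  # surface any non-retryable errors
```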

Okay, so I'm going to get high latency here, because I think my Colab environment is probably running somewhere in the US, and I set this app up to be in Singapore, which wasn't very intelligent on my part. So I'm going to go and wait for that, and we'll come back when it's ready.

Okay, so that has just finished. What I did was switch the app over; I think it went to San Francisco in the end. So I just created a new app, switched the location, and obviously it ran much quicker, so we got something like two minutes there instead. That will also mean, hopefully, I would expect, that it will run faster with ChatGPT as well.

Okay, cool, so with that we're ready to move on to querying. This is basically what we are going to be implementing inside ChatGPT. So here are a few example queries: what is the LLMChain in LangChain; how do I use Pinecone in LangChain; what is the difference between knowledge graph memory and buffer memory for conversational memory.

All right, so if we run that, we should hopefully see that we get a 200 response. Cool, and then we can print it out and see what is actually being returned here. So for "what is the LLMChain in LangChain", we get these documents from those web pages.
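
A sketch of that query call, again reusing the session and headers from earlier. The response parsing assumes the plugin's standard query response shape, with a list of results per query:

```python
# Query the plugin's /query endpoint directly (a sketch). Each query object
# carries a natural-language query string; the response returns the top
# matching document chunks for each query.
queries = [
    {"query": "What is the LLMChain in LangChain?"},
    {"query": "How do I use Pinecone in LangChain?"},
]

res = session.post(
    f"{endpoint_url}/query",
    headers=headers,
    json={"queries": queries},
)
print(res.status_code)  # expect 200

for result in res.json()["results"]:
    print(result["query"])
    for chunk in result["results"][:5]:  # top five matches per query
        print("-", chunk["text"][:120])
```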

So we have: these are in increasing order of complexity, LLMs and prompts, we have chains, and then another one talking about chains again. How do I use Pinecone in LangChain? We actually get this really short answer; it almost feels sarcastic. It's just: from langchain.vectorstores import Pinecone. Yeah, that is true.

We would hope for a little bit more information, but it works. So, Pinecone contents, and then here it says this page covers how to use the Pinecone ecosystem within LangChain. That should hopefully give us a little more information. And then we come down to here: what's the difference between knowledge graph memory and buffer memory for conversational memory?

Then we get conversation memory, knowledge graph memory contents; it talks about the knowledge graph memory, and I'm hoping at some point it might talk about buffer memory, and so on. So that all looks pretty good. Now, how do we integrate this into ChatGPT? We'll just jump straight into ChatGPT, and we're going to try and add it as is, without any modifications.

So the first thing I'm going to do is uninstall this one: I go to my unverified plugins and click uninstall. And to create our own plugin, we need to go to "develop your own plugin". You say "my manifest is ready". It isn't, but let's just say it is.

"Please provide the domain of your website where the ai-plugin.json file is hosted." Okay, so for that we just need to copy our web address here and place it in here. So, find manifest file, and we're going to see that there are some errors in our manifest.

What does it say? The manifest API URL is yourappurl.com; it says it is not under the root domain of our app, so on and so on, and we can see all this. Okay, so let's try and figure out what is actually going on there. So this here is our app; we saw the main.py before.

This is the directory for our app. We go to this .well-known folder, and we can see that we have this ai-plugin.json, and in here we have all this information about our app. You can see here we have that default web address; we need to fix that.

So I'm just going to copy our app's URL and put it in here. And there are also a few other things that we need to update. We have the name for the model and the name for the human. The name for the model is the name of this app that the model, ChatGPT, will see.

So we're going to change this to reference the LangChain docs database, and the name for the human will just be "LangChain Docs". Right, and then the description for the model. Now, this is important; it is, you know, prompt engineering. This is the description that the model is going to use when deciding whether to use this tool or not.

So we need to specify when the model should use this tool and when it should not. Well, we should say: use this tool to get up-to-date information about the LangChain Python library. Now, that is pretty explicit; we're saying when to use the tool. Sometimes these tools can get used too much by large language models.

So I also just want to be super careful and say: do not use this tool if the user did not ask about LangChain. And that's it; that's all I want to put in there. As we'll see later on, this isn't the only information that ChatGPT gets about this tool, and we'll explain that soon.

So, the description for the human: what should this be? Basically the same as what I said to the model, but a bit shorter and nicer: up-to-date information about the LangChain Python library. That's it. Okay, so that should all be good. We're going to save that, and I'm just going to come to the git repo again.
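
Put together, the edited manifest ends up looking roughly like this. This is an abridged sketch with placeholder URLs; the real file also carries auth settings and other fields.

```json
{
  "schema_version": "v1",
  "name_for_model": "langchain_docs",
  "name_for_human": "LangChain Docs",
  "description_for_model": "Use this tool to get up-to-date information about the LangChain Python library. Do not use this tool if the user did not ask about LangChain.",
  "description_for_human": "Up-to-date information about the LangChain Python library.",
  "api": {
    "type": "openapi",
    "url": "https://your-app.ondigitalocean.app/.well-known/openapi.yaml"
  },
  "logo_url": "https://your-app.ondigitalocean.app/.well-known/logo.png"
}
```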

git status; okay, I'm going to add that, the change we just made, and I'm just going to commit it to the repo. So now, if we go over to /.well-known/ai-plugin.json, we can actually see the manifest file that we just spoke about, and we can see that at the moment it hasn't been updated.

That's because it takes a little bit of time for this change to propagate through everything, and we can see here that it is building the service again, because I just pushed a new deployment. All right, so that is just going to take a moment to finish, so we'll skip ahead.

Okay, so the deployment has just gone live, so if we come here and refresh, we should see those changes in there. Looks good. Now let's try again with this. So we're going to refetch the manifest. The manifest has been validated now. Looks good: we can see we have LangChain Docs, up-to-date information about the LangChain Python library.

We have the OpenAI logo there. We could actually change that if we wanted to, by just switching the logo that we can see here, but for this demo it's fine. So click done. We click over here, go to the plugin store, say install an unverified plugin, and this is where we need to take our URL again.

We paste that into here and click find plugin. They're going to give us this little warning, and then we also need to enter our HTTP access token below. Now, this is going to be the bearer token that we created earlier, so we need to get that bearer token yet again.

And then we can install the plugin. Okay, great, so that's now activated, so we can go ahead and say: what is the LLMChain in LangChain? And we're going to see this: "could not parse OpenAPI spec for plugin". So we're seeing this kind of weird error.

Now, as far as I can tell, this is just an issue with the current version of the OpenAPI spec that is being used, so we're just going to go in and fix that. We have openapi.yaml here, so we go into it, and straight away we can see the URL isn't correct. But as far as I can tell, there are a few other issues with this as well, so I'm just going to go in and we're going to fix those.

So I've just copied and pasted in this new openapi.yaml; this one works for me, and there'll be a link to it in the video, at the top right now, in case you find that the previous one also doesn't work for you. There are a few things that we should change here, like the URL, the description, and so on.

So we're going to say: this API lets you search through the LangChain Python library's documentation. What should we call it? "retrieval-plugin"? No, I don't think so; maybe "langchain-docs-api". And for the URL: it should be the one that we used before. We actually have it in the ai-plugin.json, so we're just going to take this and put it here.

Okay, cool. So I've tested this a little bit, and it seems that ChatGPT is actually also reading this openapi.yaml, using it to inform how it creates the queries to this endpoint. If you're having problems with the query, you should update the spec so that ChatGPT knows what sort of format it should be using.

There's one thing that I would like to update here, which is this description. It's quite long at the moment, so I'm just going to shorten it a little. This is the description telling ChatGPT how to use a query request, and we just say: this is an array of search query objects, each containing a natural language query string ("query") and an optional metadata filter, if you'd like to include that, and it describes how to use the filters and so on.
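
For illustration, that kind of description sits on the request schema in openapi.yaml, along these lines (an abridged, hypothetical excerpt, not the full spec):

```yaml
# An abridged, illustrative excerpt of the request schema in openapi.yaml.
components:
  schemas:
    QueryRequest:
      type: object
      properties:
        queries:
          type: array
          description: >
            An array of search query objects, each containing a natural
            language query string ("query") and an optional metadata filter.
```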

Right, okay, that should be all we need, so I'm going to make those changes: git status, git add, and the commit message is just "update the openapi.yaml", and then we push that. So that is going to build again; we'll wait for it to deploy, and we can also check over here for the yaml file, openapi.yaml.

We can see that we still have the old version there, so we'll just wait a moment; it will update soon. Once it's deployed, we can refresh this page and everything will be there. Okay, so let's use "what is the LLMChain in LangChain".

I'm going to start with this one. We need to make sure we're using the plugins model and that our plugin is active over here, and search. Okay, cool. This is more specifically talking about chains in general, so we could probably modify this to try and return more items, but for now we'll leave it there. Let's just try and ask "how do I use Pinecone in LangChain" and see what this comes up with.

In LangChain, you can use Pinecone as a vector store for tasks such as semantic search or example selection. That's kind of cool; it got both of those uses. To use Pinecone in LangChain, you need to install the Pinecone Python SDK and import it from here: so install pinecone-client, import from this, use this.

Okay, it looks pretty good. We can also click here for the Pinecone documentation, which takes us through a little bit more. But I want to ask one more question, so: new chat, and I'm going to say "what is the difference between knowledge graph memory and buffer memory for conversational memory in LangChain"; we should specify the library.

Okay, let's see. This is a hard question; I'm not really expecting too much here, but we can see. Okay: there are different types of memory, two of which are conversation knowledge graph memory and conversation buffer memory. Let me let it generate, and then I'll come back to it. Okay: conversational knowledge graph memory uses a knowledge graph to create memory. You know, it's actually a pretty good summary. For example, if the user says "I say hi to Sam" and then asks "who is Sam", the model can use the knowledge graph to remember that Sam is a friend. That's kind of cool.

Conversational buffer memory keeps track of a sequence of chat messages; that seems pretty good. In summary: conversation knowledge graph memory uses a knowledge graph structure to organize information in the conversation, while conversation buffer memory maintains a sequence of chat messages for short-term memory. Both types of memory are used to provide context-aware responses based on previous interactions with the user. I think that's all actually pretty accurate, and I'm kind of surprised it got it to that degree; that isn't too bad at all.

Okay, so that's it for this video exploring the new plugins for ChatGPT. There was quite a lot to cover there; to be honest, it's not a super straightforward process, because obviously there are a lot of different components that we need to consider. But that being said, given what you can do with this, I think it's actually not as complicated as you might expect. ChatGPT plugins are brand new, so there are obviously going to be some things that need ironing out and best practices that need figuring out, but I think once people get a little more used to them, this sort of thing will become a lot more natural, especially for people that are developing new things all the time.

So anyway, that's it for this video. I hope all this has been interesting and useful, so thank you very much for watching, and I will see you again in the next one. Bye.