
ChatGPT Plugins: Build Your Own in Python!


Chapters

0:00 ChatGPT Plugins
0:53 Plugins or LLM agents?
2:46 First look at ChatGPT plugins
5:02 Using the chatgpt-retrieval-plugin
6:45 How the plugin works
12:06 Deploying the plugin with DigitalOcean
15:15 ChatGPT retrieval plugin environment variables
18:42 Adding LangChain docs to ChatGPT plugin
26:02 Querying the ChatGPT retrieval plugin
27:46 Adding the plugin to ChatGPT
28:52 Setup for ChatGPT plugin manifest file
32:47 Install unverified plugins on ChatGPT
33:41 Handling OpenAPI spec error
37:04 Asking ChatGPT plugin questions
39:45 Final thoughts on ChatGPT plugins

Whisper Transcript

00:00:00.000 | OpenAI have just announced the release of custom plugins for
00:00:04.320 | ChatGPT. Now, what does that mean exactly? Well, we've come
00:00:09.720 | over to ChatGPT, we can see on the top here, we have our model,
00:00:14.600 | we can go on to here and click on plugins. And you can see now
00:00:19.160 | that we have this extra little drop down for plugins. And there
00:00:22.880 | is this LangChain docs plugin. Now, that's the plugin that I'm
00:00:26.880 | going to show you how to build in this video. But I just want
00:00:30.520 | to show you the other plugins as well. Okay, so you come to this
00:00:33.200 | plugin store. And there are basically a ton of these
00:00:37.280 | different plugins. These are the featured plugins. And yeah,
00:00:41.800 | there's quite a few there. But then you can also see the bottom
00:00:44.560 | you have develop your own plugin, install the unverified
00:00:47.360 | plugin, and so on. So you can actually develop your own. Now,
00:00:51.160 | what do they actually do? Well, if you've ever heard of agents,
00:00:56.040 | then this is a very similar concept. If not, no worries. I
00:01:01.040 | haven't really spoken about it before. So we'll just do a quick
00:01:04.400 | 101 on what an agent is. So an agent, in the world of large
00:01:07.880 | language models, is kind of like a tool, right? So imagine, as a
00:01:12.920 | person, right, I have a complex math problem in front of me.
00:01:18.440 | I'm not just going to rely solely on my brain, which would
00:01:21.920 | kind of be the equivalent of large language model. In this
00:01:24.320 | case, I am probably going to get out a calculator on my phone,
00:01:28.560 | and I'm going to type in the calculation, right, I'm using a
00:01:31.960 | tool, agents are kind of like the same thing, but for large
00:01:34.880 | language models, and they can do very similar functions. So you
00:01:38.440 | can get calculator agents. So large language models are
00:01:43.040 | typically not that good at performing complex calculations.
00:01:48.040 | So what you can do is tell the large language model, if you
00:01:52.680 | need to do a calculation, I want you to use this calculator
00:01:56.160 | tool. And to use this calculator tool, you need to write some
00:01:58.880 | text, which is going to be like three plus four to the power of
00:02:02.280 | 4.1. And we will take that text that you wrote, we will give it
00:02:06.440 | to the calculator, and it will respond back to you with the
00:02:10.480 | calculated answer. And then the large language model can relay
00:02:14.160 | that back to the user. Okay, so it's literally a large language
00:02:18.400 | model using a tool. That is what these plugins in ChatGPT seem
00:02:24.400 | to be, right? I don't know exactly how it's working.
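The tool-use loop described above can be sketched in plain Python. Everything here is a stand-in: `fake_llm` plays the role of the language model deciding whether to use a tool, and the "calculator tool" is just a guarded `eval` on an arithmetic string.

```python
# A minimal sketch of the agent / tool-use loop described above.
# The "LLM" here is a stand-in function; in reality ChatGPT decides
# when to emit a tool call based on its instructions.

def calculator_tool(expression: str) -> float:
    """The 'tool': evaluate an arithmetic expression, e.g. '3 + 4 ** 4.1'."""
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        raise ValueError("unsupported expression")
    return eval(expression)  # fine for this toy example only

def fake_llm(user_input: str):
    """Stand-in LLM: asks for the calculator when it sees a calculation."""
    if any(ch.isdigit() for ch in user_input):
        return {"action": "use_calculator", "input": user_input}
    return {"action": "answer", "input": "I can answer that directly."}

def agent(user_input: str) -> str:
    step = fake_llm(user_input)
    if step["action"] == "use_calculator":
        result = calculator_tool(step["input"])
        # The real agent would pass the tool output back to the LLM;
        # here we just relay it to the user.
        return f"The answer is {result}"
    return step["input"]

print(agent("3 + 4 ** 4.1"))
```

The key idea is the round trip: the model writes the tool input as text, the tool runs it, and the result goes back to the model before the user sees an answer.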
00:02:27.680 | Because I have no idea, it's OpenAI. As far as I'm aware,
00:02:32.680 | there isn't any documentation on how this works. At least not
00:02:36.600 | yet. Maybe there will be some coming soon. I'm not sure. But
00:02:41.800 | nonetheless, it seems to be working very similar to an
00:02:45.400 | agent. So what can we do, I'm going to show you the Langchain
00:02:49.640 | docs plugin. So let me first disable it. And I'm going to
00:02:56.040 | say, I'm going to default to my typical example of how do I use
00:03:04.400 | the LLMChain in LangChain. Okay, and we're going to ask some more
00:03:09.120 | complicated questions later. But I just want to start with this
00:03:11.840 | one because it's relatively simple, but you don't need to
00:03:14.760 | understand it. Anyway. Okay. Now, you can see here, as of my
00:03:23.040 | knowledge cut-off date of September 2021, LangChain is not
00:03:26.680 | a specific software library or package that I'm aware of, I'm
00:03:29.200 | not familiar with the term LLMChain, so on and so on.
00:03:33.360 | Basically, it's telling you I have no idea what you're
00:03:35.560 | talking about. Okay, fine. Because Langchain is a new
00:03:39.040 | library. I'm not surprised by that whatsoever. So let's start
00:03:44.600 | our new chat. Let's enable the Langchain docs plugin. And
00:03:49.880 | let's ask the same question again. So, what is the LLMChain
00:03:57.440 | in LangChain? All right, let's see what happens now. Okay, so
00:04:06.200 | it says using LangChain docs. Okay, LangChain is a Python
00:04:09.720 | library that facilitates the development of applications using
00:04:12.800 | large language models. How cool is that? Using LLMs in
00:04:16.480 | isolation may not be sufficient for creating powerful
00:04:20.840 | applications, so on and so on. Introduces the concept of
00:04:24.840 | chains, chain LLMs together, so on and so on. Right. So it's
00:04:30.880 | kind of describing what LLMs are, what these chains are in
00:04:36.280 | LangChain, and so on. Right. And then it also gives you some
00:04:40.520 | links to documentation. So I can try and click this. All right,
00:04:44.960 | cool. Right. So I get directed straight to that. And we can
00:04:49.560 | also see the LLMs documentation. Okay, straight away, we get these
00:04:54.640 | sort of things. And if I go on to the next page from the
00:04:57.720 | previous one, we actually get this. So query an LLM with the
00:05:01.720 | LLMChain. Right. So that's pretty cool, right? Now, how did we do
00:05:08.320 | that? Well, we did that with this plugin. And what I want to
00:05:11.680 | show you is how to actually build that and implement that
00:05:15.400 | yourself. So let's jump straight into that. So the first thing
00:05:21.280 | we're going to start with is getting a template application
00:05:26.480 | or template plugin application. Now, for that, we're going to be
00:05:31.160 | going to the OpenAI chatgpt-retrieval-plugin repo, which you
00:05:36.920 | can see here. Now, from there, what we want to do is go over
00:05:42.560 | and fork this repo, so that you have your own version of this in
00:05:47.880 | your own GitHub. From there, once you have that you navigate
00:05:51.960 | to your repository, then you click on code, you want to have
00:05:57.120 | clone here, you copy this, and then you want to head over to a
00:06:00.720 | terminal window. And what we're going to do is just git clone
00:06:03.840 | the repo that we just saw. Okay, so git clone chatgpt-retrieval-
00:06:09.960 | plugin. Once we've done that, we should see that we have the
00:06:14.080 | chatgpt-retrieval-plugin directory, and we navigate to that. Okay, so here
00:06:24.120 | we can see the directory. Essentially, all this plugin is
00:06:28.280 | going to be is a Docker container, where we are deploying an
00:06:33.640 | API that we will allow ChatGPT to interact with. So let's go
00:06:41.840 | through and try and understand what it is actually doing. Okay,
00:06:45.440 | so open up the directory here, I want to mainly focus on what we
00:06:52.120 | have inside server here. Okay, so we have the upsert and the
00:07:01.200 | query. Now, what these do is this here, so we have our, let's
00:07:08.320 | say we have our API is going to be here. Okay, so this is a
00:07:13.080 | plugin that we're working with at the moment, right? On this
00:07:19.200 | side, let's say this side is us, we have the upsert and we have
00:07:26.320 | query. Okay, those go into the API, they're our two endpoints
00:07:33.360 | that we're going to be interacting with. Okay, so here
00:07:36.240 | we have the API that we're going to be creating. Okay, it has
00:07:40.080 | this upsert and this query. Behind the API, there is going
00:07:44.240 | to be a vector database. Okay, this is a Pinecone component.
00:07:50.320 | We're also going to be taking our LangChain docs, so
00:07:55.200 | LangChain
00:07:57.760 | docs, and we're going to be giving our large language model
00:08:03.940 | access to those LangChain docs via this whole agent thing that
00:08:08.420 | we have going on here. Alright, so there's a few components
00:08:10.900 | here. Then we also have ChatGPT down here. Alright, let's
00:08:15.540 | just figure out how they all go together. Alright,
00:08:22.020 | LangChain docs, Pinecone, ChatGPT, API. What we are going to
00:08:25.940 | do is we're going to take all of our LangChain docs. We're
00:08:28.740 | going to run them through a Python notebook here. We're
00:08:32.500 | going to download them and we're going to throw them into
00:08:37.300 | this upsert endpoint here. What that's going to do is it's
00:08:41.620 | going to come out on the other side. It's going to go to an
00:08:44.820 | OpenAI embedding model. So we can, how do we write
00:08:50.740 | that? I'm just going to put "AI Embed".
00:08:55.860 | Right. And that is essentially converting the text in those
00:09:02.840 | LangChain docs into meaningful numerical representations of
00:09:08.840 | that information. Okay. And they all get stored within our
00:09:12.760 | Pinecone Vector Database here. Right. So this Pinecone Vector
00:09:16.920 | database is now a potential source of information for
00:09:22.440 | ChatGPT down here. Okay. But we still haven't made
00:09:27.480 | that link yet. There's no link between ChatGPT and that
00:09:30.280 | vector database. That's where the query part comes in. So
00:09:35.640 | given a particular query like the earlier example is what is
00:09:40.600 | the LLMChain in LangChain, we'd ask ChatGPT.
00:09:45.560 | ChatGPT is going to come up here. It's going to think about
00:09:48.440 | it. It's going to be like I have no idea what that means but I
00:09:53.320 | have some instructions that tell me if someone asks about
00:09:56.840 | LangChain, I should send that query over to this thing over
00:10:03.240 | here. This query endpoint. Okay. So then our API gets that
00:10:16.120 | question, what is the LLMChain in LangChain? It's going to
00:10:16.120 | bring it over here. It's actually also going to take it
00:10:18.360 | into that same AI embedding model and it's going to get to
00:10:21.480 | what we call a query vector. Okay. So this query vector
00:10:25.000 | here, this query vector then goes to Pinecone and Pinecone
00:10:30.520 | says, okay, I've seen things that are similar to this
00:10:34.920 | before. Okay. In the LangChain docs that we embedded
00:10:37.960 | before, there are some documents that seem kind of
00:10:43.800 | similar. They mentioned the LLMChain. They mentioned
00:10:47.400 | LangChain chains and all these sorts of things. I'm going to
00:10:47.400 | return those to you. Okay. So it returns those documents.
00:10:51.800 | What we're going to do is return five of those
00:10:54.680 | at a time. Okay. So we then return those back. So through
00:11:00.120 | the API, return those through the request and they go back to
00:11:05.400 | ChatGPT. ChatGPT now has information from Pinecone. It
00:11:10.280 | has information from the LangChain docs. So it has its
00:11:14.760 | query.
00:11:17.240 | And it also has this extra information from up here,
00:11:21.860 | which we're going to call the context. And now it can
00:11:27.300 | actually answer the question. Okay. So then it gives that
00:11:31.220 | back to the user, the query in context and just finishes like
00:11:36.260 | completes the answer based on that new context that it has.
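The whole query flow just described can be sketched without any of the real components. Here a crude word-overlap score stands in for the OpenAI embeddings and Pinecone's vector search, purely to show the shape of query → top-k context → answer; the documents are made up.

```python
# Toy sketch of the query flow: "embed" the query, find similar docs,
# and hand query + context back to the model. Word overlap stands in
# for real embeddings + Pinecone similarity search.

DOCS = {
    "doc1": "The LLMChain combines a prompt template with an LLM in LangChain.",
    "doc2": "LangChain chains let you compose LLM calls together.",
    "doc3": "Pinecone is a vector database used for similarity search.",
}

def score(query: str, text: str) -> int:
    """Crude stand-in for cosine similarity between embeddings."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def retrieve(query: str, top_k: int = 5):
    """Return up to top_k most 'similar' documents, best first."""
    ranked = sorted(DOCS.values(), key=lambda t: score(query, t), reverse=True)
    return ranked[:top_k]

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    # ChatGPT would now complete the answer from query + context;
    # here we just show the prompt it would effectively receive.
    return f"Context:\n{context}\n\nQuestion: {query}"

print(answer("What is the LLMChain in LangChain?"))
```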
00:11:40.100 | So it now can actually answer questions. That is what we're
00:11:44.820 | building here, right? This plugin, this API here is
00:11:50.660 | what enables all of this. What enables the interaction
00:11:54.260 | between ChatGPT and the outside world. So the first thing that
00:11:57.860 | we need to do is make this API accessible. We need to deploy
00:12:02.260 | this API. So let's jump straight into that. So we are
00:12:06.820 | going to deploy it. I mean, obviously it depends on your
00:12:10.820 | own setup, but for the sake of simplicity, what we're going to
00:12:15.780 | do here is use DigitalOcean. So DigitalOcean, you have to
00:12:20.100 | sign up if you haven't used it before. Otherwise, log in. Once
00:12:23.620 | you have signed up, you should be able to make your way to
00:12:26.740 | something that looks like this. You might have to click on the
00:12:29.220 | side or click through a few pages in order to get to this,
00:12:33.700 | but what we can do is we want to come to here and you can see
00:12:39.060 | an app that I've already created here. That is the app
00:12:42.260 | that I showed you before. We're going to just recreate a new
00:12:45.700 | one. Okay, and we can come down to here and what we want to do
00:12:51.620 | is actually create a web app. So I'm going to click on create
00:12:56.500 | app here, apps. We're going to create a resource and source
00:13:00.660 | code. We're going to be using GitHub for this. You will
00:13:02.740 | probably need to authenticate, so to allow DigitalOcean to
00:13:07.300 | access your GitHub account, and then in your repo here, you're
00:13:10.820 | going to need to select the repo that you've forked from
00:13:14.900 | OpenAI. So I've just selected chatgpt-retrieval-plugin,
00:13:19.700 | branch main. The source directory is just a top-level
00:13:23.060 | directory. We can auto deploy as well, so anytime we make a
00:13:26.820 | change to the main branch, it will just be redeployed.
00:13:32.180 | That's useful to have. Okay, so we have this Dolphin app. I
00:13:39.380 | think that's just kind of like the app which will contain the
00:13:43.940 | plugin. Yeah, I think that all looks good. Right, and then we
00:13:50.420 | have environment variables. So this is where we do need to
00:13:53.860 | make some, we need to add some things. Right, so I'll explain
00:13:59.620 | these as we go through. So there are two specific
00:14:04.900 | environment variables that are, we need to use these for the
00:14:09.540 | app in general. Okay, and these are the bearer token and the
00:14:17.700 | OpenAI API key. Okay, both of those are required by the app.
00:14:25.940 | And then there are a few more environment variables that we
00:14:29.540 | need to specify that we're using Pinecone as our
00:14:35.300 | retrieval component, which I will explain a little more later
00:14:40.260 | on, and to give us access to that. So we have Datastore,
00:14:44.820 | that is going to be Pinecone. We're going to need a Pinecone
00:14:48.660 | API key. We're going to need a Pinecone environment.
00:14:54.260 | And we're also going to need a Pinecone index.
00:14:57.540 | Okay, so Pinecone index, we can name this whatever we want.
00:15:02.100 | Again, I'm going to explain a little more detail very soon.
00:15:06.100 | So I'm just going to call mine "langchain-plugin".
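Those six environment variables end up looking something like this. All values are placeholders, and in the DigitalOcean UI you enter them as key/value pairs rather than shell exports:

```shell
# The six environment variables the retrieval plugin expects.
# All values below are placeholders — substitute your own.
export DATASTORE="pinecone"
export BEARER_TOKEN="<your-secret-bearer-token>"
export OPENAI_API_KEY="<your-openai-api-key>"
export PINECONE_API_KEY="<your-pinecone-api-key>"
export PINECONE_ENVIRONMENT="<your-pinecone-environment>"  # e.g. us-west1-gcp
export PINECONE_INDEX="langchain-plugin"
```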
00:15:11.700 | Right, and then everything, well not quite everything.
00:15:15.140 | So the bearer token, this is a secret token that is going to
00:15:20.260 | be used by our plugin in order to authenticate incoming
00:15:23.860 | requests. Right, so we're going to include the bearer token
00:15:27.140 | within the headers of our API requests. OpenAI API key, that is
00:15:33.700 | so we can encode everything with an OpenAI
00:15:38.820 | embedding model. The Datastore is just saying
00:15:42.500 | that we'd like to use Pinecone, and again I'll explain that in a little more
00:15:45.460 | detail. This will give us access to Pinecone,
00:15:48.420 | this will specify the Pinecone environment that we need to use,
00:15:52.900 | and this will specify the Pinecone index. If none of that makes sense, it's fine, I'm
00:15:57.300 | going to explain it as we go through that. For now, all I
00:16:01.620 | will do is just show you where you need to find each one of these
00:16:04.580 | items. So for the bearer token, as far as I know you can just put anything
00:16:08.260 | you want in here, but the sort of standard practice that a
00:16:12.500 | lot of people go with is to use JSON Web Tokens, I think it's
00:16:17.300 | called. So JWT, yeah, JSON Web Tokens,
00:16:21.620 | and you just come down to here and you would put in something.
00:16:26.820 | So you know I'm not super familiar with this,
00:16:31.220 | so I don't know what the best practice is to do here.
00:16:35.380 | All I'm going to do really is kind of make something up, so I'm just going to
00:16:39.460 | put in something like my name, it's going to be James Briggs or
00:16:45.060 | something, and I'm going to put in some other
00:16:47.380 | random things. I'm just going to see what I get for the
00:16:50.900 | bearer token here. There is probably a best practice doing
00:16:54.020 | that, but I personally don't know it. Now the
00:16:57.780 | other thing is your OpenAI API key, you need to get
00:17:00.900 | that from platform.
00:17:03.780 | OpenAI.com, you go to your account in the top right
00:17:08.100 | and then you can click view API keys and you just get your API key from there.
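Circling back to the bearer token for a moment: if you'd rather generate the JWT yourself than use the jwt.io site, here is a stdlib-only sketch of what such sites produce, assuming an HS256-signed token. The payload and signing secret are made up.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_jwt(payload: dict, secret: str) -> str:
    """Build a signed HS256 JSON Web Token from scratch (stdlib only)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (b64url(json.dumps(header).encode()) + "." +
                     b64url(json.dumps(payload).encode()))
    signature = hmac.new(secret.encode(), signing_input.encode(),
                         hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)

# Hypothetical payload and secret — put whatever identifies you here.
token = make_jwt({"name": "James"}, "my-signing-secret")
print(token)  # header.payload.signature — this becomes your bearer token
```

Since the plugin only compares the incoming token against the `BEARER_TOKEN` env variable, any hard-to-guess string works; the JWT shape is just convention.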
00:17:13.220 | Other things, the Pinecone API key, for that you need to
00:17:16.820 | head on over to app.pinecone.io. Obviously you may need to create an
00:17:22.180 | account, but everything is free so you shouldn't
00:17:26.180 | need to pay anything, at least not if you're just creating a
00:17:28.500 | single index here. You just come over to here, so API keys
00:17:32.660 | on the left. You have your environment here, which we need, so let's just copy
00:17:36.980 | that now. Put that there and then we also have in
00:17:41.060 | here our API key, so we will just copy that and then paste
00:17:44.740 | that in there as well. Then once you've added all of those,
00:17:48.420 | you would just press save. You should have six environment
00:17:52.020 | variables there, so we click next. It's good, you can
00:17:56.180 | change the region depending on where you
00:17:59.540 | are, what you're wanting to do.
00:18:03.700 | Cool, all right and then they'll come up with this monthly app cost.
00:18:08.500 | I believe this is predicted rather than specific,
00:18:11.780 | although I could be wrong there, but one thing
00:18:14.900 | that you should be aware of here is that you will
00:18:19.060 | hopefully get some free credits. You can
00:18:22.420 | also use those, so you shouldn't need to pay anything, but
00:18:26.020 | don't quote me on that. I'm going to create that resource
00:18:31.300 | and then that will take some time to deploy. Our
00:18:40.020 | deployment is live now, so what we can do
00:18:44.420 | is begin adding data. We refer back to our
00:18:48.100 | earlier drawing here. You can see we have
00:18:51.620 | the first step, literally the first step, is LangChain docs
00:18:56.020 | through this Python code to the upsert endpoint of the API, so that it goes to Pinecone.
00:19:01.380 | That's what we now need to do. Okay, so we head over to Colab
00:19:05.780 | and there is this notebook here. There'll be a link to this notebook so
00:19:10.820 | you can follow along, but we're not going to run through the
00:19:13.860 | whole thing. We are first just going to install any of the prerequisite libraries
00:19:19.300 | right now. That is just this and there'll be another one a little bit
00:19:24.100 | later. Okay, and there's this whole preparing
00:19:27.060 | data stage. I don't want to go through all of this.
00:19:30.020 | I've actually included this in another video
00:19:32.900 | for which there'll be a link somewhere at the top of the video right now
00:19:38.420 | or you can just kind of follow along with the notebook if you like to,
00:19:42.820 | but there's a lot going on there, so let's skip ahead
00:19:46.500 | and what we're going to do is directly load the data set.
00:19:49.700 | So the data set that we built from Hugging Face data sets.
00:19:54.020 | So to do that we need to pip install data sets.
00:19:57.460 | We then load the data set like so.
00:20:01.700 | Okay, and then we need to reformat this into the format that is required by the
00:20:06.500 | API component. Okay, so what does that format look
00:20:09.700 | like? It is somewhere in this notebook here.
00:20:13.620 | Okay, so we're going to have these three fields id,
00:20:17.220 | text, and metadata. Metadata is going to contain
00:20:21.380 | like additional information that we might need. The text is going to contain
00:20:25.140 | obviously the text, and then we have the unique IDs here.
00:20:28.500 | Right, so we need to create that format. Let's have a look at what we have at the
00:20:32.740 | moment. Okay, so we have id, text, and source.
00:20:37.220 | So it's very similar. We have id and text but now we need metadata.
00:20:41.940 | So if we just have a look at one of those items
00:20:46.500 | maybe it'll make it clearer what we need to do.
00:20:49.460 | So id, that's fine. These are unique IDs. The text,
00:20:52.900 | that's also fine. Problem is the source at the end here.
00:20:56.180 | So we need to put this, we basically need to just
00:20:59.380 | create a metadata dictionary and put this source
00:21:02.980 | inside that. So that is actually what we do here.
00:21:06.740 | We just rename source to url, and run that.
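In plain Python, that reformatting step amounts to moving each record's `source` field into a `metadata` dictionary under the key `url`. A sketch with made-up example records (the notebook does the same thing over the Hugging Face dataset):

```python
# Records as loaded: id, text, source. Example values are made up.
records = [
    {"id": "doc-0",
     "text": "The LLMChain combines a prompt template with an LLM.",
     "source": "https://example.com/langchain/llm_chain"},
]

# Target format for the /upsert endpoint: id, text, metadata.
documents = [
    {"id": r["id"], "text": r["text"], "metadata": {"url": r["source"]}}
    for r in records
]

print(documents[0]["metadata"])
```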
00:21:12.420 | Okay, and now we have the correct format that we need
00:21:16.020 | for our API. Okay, so we have id and text like we did before but now we have
00:21:21.060 | metadata and then in there we have the URL
00:21:23.780 | and the link. Okay, cool. Now we're ready to move on to that
00:21:28.820 | indexing part. Right, so because we're going to be
00:21:33.700 | interacting with our API we do now need to include the bearer token.
00:21:39.540 | So this is basically how we authenticate that we are
00:21:42.900 | allowed to interact with this API. So that bearer token that you create
00:21:46.820 | earlier you need to add it in to here. Okay, so you just put it,
00:21:51.060 | replace this bearer token here with your bearer token or if you want to say as an
00:21:55.140 | environment variable you can use this part here.
00:21:58.100 | Okay, I've set that for mine so then I'm going to move on to the headers.
00:22:03.220 | We have the authorization here and bearer token.
00:22:07.300 | Right, so this is, we're going to add this to our request
00:22:10.740 | so that it knows that we're authorized to actually use the API.
00:22:14.340 | And then here we need to add in our endpoint URL. So this is for the
00:22:19.140 | previous one that I created. Now I need to
00:22:22.500 | update this to the new one which is just here. Right, so this Dolphin app, we
00:22:29.220 | can open it.
00:22:32.420 | So if I open this we're going to come through to this page here where it just
00:22:36.500 | says detail not found. That's fine, that's kind of expected.
00:22:41.940 | So copy that web page and we just paste it into there. Okay,
00:22:49.220 | cool. So what are we doing here? Let me walk you through that.
00:22:56.580 | So here we're setting up a, basically what we could do
00:23:03.140 | if I want to kind of explain this in the simplest way possible
00:23:06.660 | is we could actually just remove all of this, pretend it isn't here,
00:23:10.740 | and we could just put requests.post like that. Right, and that would
00:23:17.540 | post everything so we'd be going through all of our documents
00:23:21.460 | in batches of 100 and we'll be sending them to our API.
00:23:26.100 | Right, the reason that we have this other part,
00:23:29.380 | this extra little bit here, is every now and again what we might see
00:23:34.260 | is we might see an error when we send our request. Okay,
00:23:38.020 | and we don't want to just cancel everything if we see one error one time
00:23:42.740 | because usually it's just like a brief error
00:23:44.980 | then you send the request again and it's going to be fine.
00:23:48.580 | That's what we're setting up here. We're saying we want to retry
00:23:52.820 | the request a total of five times and what we want to do so we don't
00:23:57.860 | overload, so we're not just sending request after request super quickly,
00:24:01.620 | we also have this backoff factor so it's basically it's going to wait like
00:24:06.580 | a tenth of a second then it's going to wait a fifth of a second then a second
00:24:10.900 | and then a few seconds, something like that.
00:24:14.260 | And then the the status codes that we will do this for
00:24:17.940 | are all these, so these are like internal server errors.
00:24:20.980 | Right, that's all we're doing here and then we're setting that up
00:24:24.420 | so that our post requests use this retry strategy.
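The retry setup being described can be sketched with `requests` plus `urllib3`'s `Retry` helper. The endpoint URL in the comment is a placeholder:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry failed requests up to 5 times, backing off between attempts,
# but only for transient server-side errors.
retries = Retry(
    total=5,
    backoff_factor=0.1,                     # sleeps grow roughly 0.1s, 0.2s, 0.4s, ...
    status_forcelist=[500, 502, 503, 504],  # internal server errors
    allowed_methods=["POST"],               # retry POSTs too (off by default)
)

session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retries))

# session.post("https://your-app.ondigitalocean.app/upsert", ...) will now
# automatically retry on those status codes.
```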
00:24:28.260 | That's all that extra code is doing there. Okay,
00:24:31.620 | so we can go ahead and run that and this is just going to send all of
00:24:36.580 | our documents, going to put them in the API. What we've
00:24:40.100 | done here, tqdm, so we need to import that.
00:24:44.260 | So from tqdm.auto import tqdm.
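The upsert loop itself is roughly this: documents go up in batches of 100, with `tqdm` drawing the progress bar. The endpoint URL and token are placeholders, and `post` is made injectable here only so the sketch can run without a live server:

```python
import requests
from tqdm.auto import tqdm

def upsert_all(documents, endpoint_url, headers, post=requests.post):
    """Send documents to the /upsert endpoint in batches of 100."""
    responses = []
    for i in tqdm(range(0, len(documents), 100)):
        batch = documents[i:i + 100]
        res = post(f"{endpoint_url}/upsert",
                   headers=headers, json={"documents": batch})
        responses.append(res)
    return responses

# Example call (placeholder URL and token):
# upsert_all(docs, "https://your-app.ondigitalocean.app",
#            {"Authorization": "Bearer <BEARER_TOKEN>"})
```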
00:24:49.940 | Okay, so this is going to send all those documents
00:24:53.620 | to our API. Our API is going to send those to the OpenAI embedding
00:24:59.860 | model, so that is text-embedding-ada-002, and then it's
00:25:04.580 | going to get those embeddings. Those embeddings
00:25:06.580 | are going to be then stored in the Pinecone vector database
00:25:09.700 | that we can then refer to at a later point when we're making our queries
00:25:13.380 | through the ChatGPT plugin. Okay, so I'm going to get high latency here because I
00:25:18.100 | think my Colab environment is probably
00:25:20.580 | running somewhere in the US and I kind of set this up to be in
00:25:25.780 | Singapore, which wasn't very intelligent on my part.
00:25:28.660 | So I'm going to go and wait for that and we'll come back when it's
00:25:33.860 | ready. Okay, so that has just finished. What I
00:25:38.020 | am doing is switching to, I think it went to San Francisco in the
00:25:42.420 | end. Yeah, so I just created a new app,
00:25:46.420 | switched the location and obviously it ran
00:25:50.580 | much quicker, so we got like two minutes there instead.
00:25:53.860 | Okay, that will also mean, hopefully, I would expect that this will
00:25:58.980 | mean it will run faster with ChatGPT as well. Okay, cool, so
00:26:03.140 | with that we're ready to move on to querying.
00:26:06.500 | So this is basically what we are going to be
00:26:10.580 | implementing inside ChatGPT. Okay, so here are a few examples. We have our
00:26:16.100 | queries, we're going to have a query, what is the LLMChain in
00:26:19.380 | LangChain, how do I use Pinecone in LangChain, what is the difference between
00:26:22.900 | knowledge graph memory and buffer memory for conversational memory.
00:26:26.980 | All right, so if we run that we should hopefully see that we get a 200
00:26:31.620 | response. Cool, and then we can print out and we can
00:26:35.460 | see, you know, what is actually being returned
00:26:37.780 | here. So, see, what is the LLMChain in LangChain, and we get like these
00:26:42.980 | documents from those web pages. Okay,
00:26:47.300 | so these are in increasing order of complexity: LLMs and
00:26:51.380 | prompts, we have chains, and then another one is talking about chains again.
00:26:56.820 | How do I use Pinecone in LangChain? We actually get this really short one, it
00:27:03.540 | almost feels sarcastic. It's just like from
00:27:06.740 | langchain.vectorstores import Pinecone. Yeah, that is true. We would hope for a
00:27:13.540 | little bit more information, but it works. So Pinecone contents, and
00:27:18.180 | then here it says this page covers how to use the Pinecone
00:27:20.660 | ecosystem within LangChain. So that should hopefully give us a
00:27:23.940 | little more information and then we come down to here, what's
00:27:28.820 | the difference between knowledge graph memory and buffer memory
00:27:31.460 | conversational memory. Then we get conversation memory,
00:27:34.660 | graph memory contents and talks about the
00:27:37.620 | knowledge graph memory and then I'm hoping at some point it might talk about
00:27:41.620 | buffer memory and so on and so on. So that all looks pretty good.
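The query requests we just ran have this shape: the /query endpoint takes a list of queries, each with an optional `top_k`. The URL and token are placeholders, and `post` is injectable here only so the sketch can run without a live server:

```python
import requests

queries = [
    {"query": "What is the LLMChain in LangChain?", "top_k": 5},
    {"query": "How do I use Pinecone in LangChain?", "top_k": 5},
]

def run_queries(endpoint_url, bearer_token, queries, post=requests.post):
    """POST the queries to /query; returns the HTTP response."""
    return post(
        f"{endpoint_url}/query",
        headers={"Authorization": f"Bearer {bearer_token}"},
        json={"queries": queries},
    )

# res = run_queries("https://your-app.ondigitalocean.app",
#                   "<BEARER_TOKEN>", queries)
# On success res.status_code is 200 and res.json()["results"]
# holds the matching documents for each query.
```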
00:27:46.820 | Now how do we integrate this into ChatGPT?
00:27:51.620 | So we'll just jump straight into ChatGPT and we're going to
00:27:55.140 | try and add it as is without any modifications. So first I'm going to do
00:28:00.660 | is just, I'm going to uninstall this one. So I'm
00:28:03.700 | going to go to my unverified plugins. I'm going to click
00:28:06.580 | uninstall. Okay and to create our own plugin we need to go to develop
00:28:12.020 | your own plugin. You say my manifest is ready. Let's just,
00:28:16.260 | it isn't but let's say it is. Okay please provide the domain of your website where
00:28:20.420 | the AI plugin json file is hosted. Okay so for
00:28:24.580 | that we just need to copy our web address
00:28:27.220 | here and we place it in here. Okay so find
00:28:32.020 | manifest file and we're going to see okay there's some
00:28:35.380 | errors in our manifest. What does it say? Manifest API URL
00:28:40.660 | yourappurl.com. Okay it says it is not under the root domain lobster app
00:28:45.620 | so on and so on. Okay and we can see all this. So
00:28:49.700 | okay let's try and figure out what is actually going on there.
00:28:52.660 | Okay so this here is our, we saw the main.py before.
00:28:57.220 | Right, this is our directory for our app. We go to this .well-known folder and we can see
00:29:00.580 | that we have this ai-plugin.json, and in here we have all this
00:29:05.140 | information about our app. Right so you can see here we have that
00:29:09.460 | default web address here. We need to fix that.
00:29:12.260 | So I'm just going to copy this and I'm going to
00:29:15.540 | put it in here.
00:29:20.260 | And there are also a few other things that we need to update.
00:29:23.860 | So we have the name for the model and name for the human. Okay so the name for
00:29:27.060 | the model is the name of this app that the model,
00:29:30.500 | ChatGPT, will see. So we're going to change this
00:29:34.660 | to langchain_docs, and the name for the human
00:29:42.660 | will just be LangChain Docs.
00:29:46.820 | Right and then the description for the model. Now this is important.
00:29:51.060 | It is you know it is prompt engineering right.
00:29:54.340 | This is the description that the model is going to use when deciding
00:29:57.620 | if to use this tool or not. Okay so we need to specify when the model
00:30:02.740 | should use this tool and when should it use this tool. Well we
00:30:05.940 | should say use this tool to get
00:30:10.980 | up-to-date information about the LangChain
00:30:18.820 | Python library. Okay.
00:30:22.420 | Now that is pretty explicit. We're saying when to use the tool.
00:30:26.180 | Sometimes these tools can get used too much by large language models.
00:30:31.220 | Okay so I also just want to be super careful and say
00:30:35.220 | do not use this tool if the user did not
00:30:41.940 | ask about LangChain. Okay, and that's it. That's all I want to put in there.
00:30:46.980 | As we'll see later on, this isn't the only information that ChatGPT gets
00:30:51.540 | about this tool and we'll explain that soon.
00:30:55.060 | So the description for the human. What should this be?
00:30:58.660 | I want to say basically the same as what I said to the model but
00:31:02.900 | a bit shorter and nicer. I want to say up-to-date
00:31:06.500 | information about the langchain Python library.
00:31:13.300 | That's it. Okay so that should all be good. We're going to save that
00:31:19.460 | and I'm just going to come to the git repo again.
00:31:24.020 | Git status. Okay, I'm going to add that, staging the change we just made,
00:31:31.300 | and I'm just going to commit it to the repo.
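The manifest fields edited above can be sketched as a small script. This is a minimal sketch following OpenAI's ai-plugin.json manifest format as I understand it; the URLs, email, and auth values are placeholders I've assumed, not the ones from the video.

```python
import json

# A minimal ai-plugin.json manifest reflecting the fields edited above.
# All URLs and the contact email are placeholders (assumptions).
manifest = {
    "schema_version": "v1",
    "name_for_model": "langchain_docs_database",
    "name_for_human": "LangChain Docs",
    "description_for_model": (
        "Use this tool to get up-to-date information about the langchain "
        "Python library. Do not use this tool if the user did not ask "
        "about langchain."
    ),
    "description_for_human": "Up-to-date information about the langchain Python library.",
    "auth": {"type": "user_http", "authorization_type": "bearer"},
    "api": {
        "type": "openapi",
        "url": "https://your-app.example.com/.well-known/openapi.yaml",  # placeholder
    },
    "logo_url": "https://your-app.example.com/logo.png",  # placeholder
    "contact_email": "you@example.com",  # placeholder
    "legal_info_url": "https://example.com/legal",  # placeholder
}

# Write it out the way the plugin serves it from /.well-known/ai-plugin.json.
print(json.dumps(manifest, indent=2))
```

The `description_for_model` is the prompt-engineering part: it is what ChatGPT reads when deciding whether to call the tool.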
00:31:34.740 | Okay so now if we just go over to /.well-known/ai-plugin.json we can
00:31:47.700 | actually see the manifest file that we just
00:31:52.340 | spoke about. Okay and we can see that at the moment
00:31:54.580 | it hasn't been updated. That's because it takes a little bit of time for this
00:31:57.940 | change to propagate through everything and we can see here
00:32:01.620 | that it is building the service again because I just pushed a new deployment.
00:32:05.700 | All right so that is just going to take a moment
00:32:08.740 | to finish so we'll just skip ahead to that.
00:32:13.540 | Okay so the deployment has just gone live so if we come to here
00:32:17.220 | refresh we should see those changes in there.
00:32:20.820 | Okay looks good. Now let's try again with this. So we're going to try and refetch
00:32:25.620 | the manifest. Okay the manifest has been validated now.
00:32:31.060 | Looks good, we can see you have LangChain Docs, up-to-date information about the
00:32:34.500 | langchain Python library. We have the OpenAI logo there. We can
00:32:38.420 | actually change that if we want to by just switching the logo that we can
00:32:43.220 | see here but for this demo it's fine. So click
00:32:47.140 | done. Okay we click over here we go to
00:32:50.420 | plugin store, we say install an unverified plugin, and this is
00:32:56.100 | where we need to take our url again.
00:33:00.980 | We paste that into here click find plugin.
00:33:04.580 | They're going to give us this little warning
00:33:09.060 | and then we also need to enter your HTTP
00:33:12.580 | access token below. Now this is going to be our bearer token that we
00:33:16.980 | created earlier, so we need to get that bearer token yet again.
00:33:22.100 | Okay and then we can install the plugin. Okay
00:33:25.300 | great, so that's now activated, so we can go ahead
00:33:31.300 | and we can say what is the LLMChain
00:33:36.660 | in LangChain, and we're going to see this.
00:33:43.300 | Okay, could not parse OpenAPI spec for plugin.
00:33:47.780 | Okay so we're seeing this kind of weird error.
00:33:51.140 | Now as far as I can tell this is just an issue with the current version of the
00:33:54.820 | OpenAPI spec that is currently being used so
00:33:59.300 | we're just going to go in and fix that. Okay so we have
00:34:03.140 | OpenAPI.yaml here. We go into here first, and okay, straight away we can see
00:34:10.340 | that the URL isn't correct. But as
00:34:14.340 | far as I can tell there are a few other issues with this,
00:34:16.820 | so I'm just going to go in and we're going to fix those.
00:34:20.740 | Okay so I've just copy and pasted this new OpenAPI.yaml in this one works
00:34:28.100 | for me there'll be a link to this in the video
00:34:31.060 | and at the top right now in case you find that the previous one
00:34:34.980 | also doesn't work for you you can just use this one so there are a
00:34:39.140 | few things that we should change here like the URL the description
00:34:42.340 | so on. Okay, so we're going to say this API lets you search
00:34:46.580 | through the langchain Python library
00:34:54.180 | documentation. Should we call it retrieval plugin?
00:34:58.580 | No, I don't think we should call it that. Maybe
00:35:02.820 | langchain-docs-api. And for the URL, the URL should be
00:35:09.620 | the one that we used before so we actually have it in the
00:35:13.300 | AI plugin so we're just going to take this
00:35:17.860 | and we're going to put it here.
00:35:20.820 | Okay cool so I've tested this a little bit and it seems that
00:35:26.340 | ChatGPT is actually also reading this OpenAPI.yaml
00:35:30.180 | and it seems to be using that in order to inform how it's creating
00:35:34.100 | the queries to this. If you're having problems with the query, you should
00:35:38.260 | update it so that it knows what sort of format it should be using.
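Since the spec error above traced back to a stale default URL, one sanity check worth sketching is that the servers URL in openapi.yaml agrees with the API URL in ai-plugin.json. A minimal sketch, assuming both files are already parsed into dicts (the field names follow the two formats; `check_urls_match` is a hypothetical helper, and the hostnames are placeholders):

```python
from urllib.parse import urlparse

def check_urls_match(manifest: dict, spec: dict) -> bool:
    """Return True if the manifest's API URL and the spec's first
    server URL point at the same host."""
    api_url = manifest["api"]["url"]
    server_url = spec["servers"][0]["url"]
    return urlparse(api_url).netloc == urlparse(server_url).netloc

# Placeholder dicts standing in for the parsed ai-plugin.json / openapi.yaml.
manifest = {"api": {"url": "https://my-app.example.com/.well-known/openapi.yaml"}}
spec = {"servers": [{"url": "https://my-app.example.com"}]}
print(check_urls_match(manifest, spec))
```

If this returns False after a deployment, the spec is likely still pointing at the template's default address.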
00:35:42.500 | Okay so there's one thing that I would like to update here which is this
00:35:46.100 | description. Okay it's quite long at the moment so
00:35:50.020 | I'm just going to shorten it a little. So I'm saying this is the description
00:35:54.420 | that is telling it how to use a query request,
00:35:56.980 | and we just say this is an array of search query objects,
00:36:00.340 | each containing a natural language query string
00:36:04.020 | and an optional metadata filter, if you'd like to
00:36:07.380 | include that, and it describes how to use the filters and so on.
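That shortened description is pointing ChatGPT at the retrieval plugin's query request body. As a concrete sketch of what that looks like (following the chatgpt-retrieval-plugin's /query schema as I understand it; the `filter` field and `top_k` value here are illustrative, not required):

```python
import json

# A query request body for the retrieval plugin's /query endpoint:
# an array of search query objects, each with a natural language
# `query` string plus an optional metadata `filter`.
payload = {
    "queries": [
        {
            "query": "How do I use Pinecone in langchain?",
            "filter": {"source": "file"},  # optional; illustrative field
            "top_k": 3,
        }
    ]
}

body = json.dumps(payload)
print(body)
```

Sending this to the deployed endpoint would also need the `Authorization: Bearer <token>` header with the bearer token created earlier.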
00:36:11.060 | Right okay that should be all we need so I'm going to
00:36:14.420 | make those changes, so git status, git add,
00:36:19.940 | and the commit message is just "update the OpenAPI.yaml",
00:36:26.100 | and then we just push that.
00:36:32.660 | Okay so that is going to build again we'll wait for it to deploy
00:36:37.700 | and we can also check over here for the yaml file, so it's
00:36:42.180 | OpenAPI.yaml. Okay, we can see that we still
00:36:50.420 | have the old version there, so we'll just wait a moment, it will
00:36:55.780 | update soon. Okay once it's deployed we can refresh
00:37:00.500 | this page everything will be there now and okay
00:37:04.740 | so let's use "what is the LLMChain in
00:37:09.460 | LangChain". I'm going to start with this one here. We need to make sure we're
00:37:13.140 | using the plugins make sure our plugin is active over here
00:37:17.380 | and search.
00:37:20.020 | Okay cool.
00:37:26.180 | Okay this is more specifically talking about chains in general so we can
00:37:33.540 | probably modify this to try and return more items but for now we'll
00:37:38.980 | leave it there, and let's just try and ask how do I use
00:37:43.140 | Pinecone in LangChain, see what this comes up with.
00:37:47.460 | In LangChain, you use Pinecone as a vector store for tasks such as semantic search
00:37:54.980 | or example selection. That's kind of cool, it got both of those.
00:37:59.380 | To use Pinecone in LangChain you need to install the Pinecone
00:38:02.900 | Python SDK and import it. From here, so install pinecone-client, from this,
00:38:09.060 | use this. Okay, it looks pretty good. We can also click here for the Pinecone
00:38:14.580 | documentation which also takes us through a little
00:38:17.300 | bit more but I want to ask one more question so
00:38:20.260 | new chat and I'm going to say what is the difference between
00:38:24.180 | knowledge graph memory and buffer memory for conversational memory
00:38:27.620 | in LangChain, we should specify.
00:38:31.220 | Okay let's see this is a hard question I'm not
00:38:36.660 | really expecting too much here but we can see.
00:38:40.980 | Okay different types of memory two of which are conversation knowledge graph
00:38:44.820 | memory and conversation buffer memory. Okay, I'll
00:38:47.700 | let it generate and then I'll come back to it.
00:38:52.500 | Okay, conversation knowledge graph memory uses a knowledge graph to recreate
00:38:57.700 | memory. You know, it's actually a pretty good
00:39:00.500 | summary. For example, if the user says "I say hi to Sam" and then asks "who is Sam",
00:39:05.700 | the model can use a knowledge graph to remember that Sam is a friend,
00:39:09.380 | okay that's kind of cool. Conversational buffer memory keeps track of a sequence
00:39:14.980 | of chat messages okay that seems pretty good in summary
00:39:19.940 | conversation knowledge graph memory uses a knowledge graph structure to
00:39:22.820 | organize information in the conversation,
00:39:24.820 | while conversation buffer memory maintains a sequence
00:39:28.820 | of chat messages for short-term memory. Both types of memory are used to provide
00:39:33.380 | context-aware responses based on previous interactions with the user I
00:39:36.100 | think that's all actually pretty accurate and
00:39:38.260 | I'm kind of surprised it got it to that degree. That
00:39:44.260 | isn't too bad at all. Okay, so that's it for this video
00:39:49.540 | exploring the new plugins for ChatGPT. There's quite a lot to cover there, to be
00:39:55.780 | honest. It's not a super
00:39:58.980 | straightforward process, because obviously there's a lot of
00:40:02.340 | different components to this that we need to consider
00:40:05.460 | but that being said given what you can do with this I think
00:40:12.260 | it's actually not as complicated as you might expect,
00:40:16.100 | so I think particularly as people get more used
00:40:19.540 | to ChatGPT plugins, like, it's brand new, so
00:40:24.020 | there's obviously going to be some things that need ironing out and
00:40:26.980 | best practices that need figuring out and so on,
00:40:30.260 | but I think once people get a little more used to it this sort of thing will
00:40:33.940 | become a lot more natural especially for people that are kind of
00:40:36.900 | developing new things all the time. So anyway
00:40:40.820 | that's it for this video I hope all this has been interesting and useful
00:40:45.220 | so thank you very much for watching and I will see you again in the next one.
00:41:02.180 | (soft music)