
NEW RAG Framework: Canopy


Chapters

0:00 Canopy RAG Framework
1:31 Canopy Setup
4:23 Data Input to Canopy
7:04 Upserting with Canopy
8:24 Chatting with Canopy CLI
10:21 RAG vs. LLM only
12:24 Handling Complex Queries in RAG

Transcript

Today, we're going to be exploring the new Canopy framework. This is a framework that has been developed by the GenAI team at Pinecone. And the idea is essentially to help us build better RAG pipelines without needing to get into all of the details of how to build a RAG pipeline.

Because it's very easy to just build a very simple RAG pipeline, but it's very hard to build a good one. And it also comes with a lot of nice little features. One that I really like is the ability to just chat within the terminal and see the difference between a RAG output and a non-RAG output, so that you can very quickly evaluate how well your RAG pipeline is performing.

Now, all of this has been wrapped up into a very easy to use framework. So let's jump into it and see how we can use it. Okay, so we can see the GitHub repo here. And yeah, just a short description. And we can kind of come to this visual here.

It gives us sort of a rough idea of kind of what is going on there. And if we come down to here, we can see the different components of Canopy. I'm really gonna be focusing on the Canopy CLI down here, just to show you how to get started with it.

So we're mostly gonna be using everything through CLI. Okay, and if we come down to the setup here, we have, okay, you can create a virtual environment, you can go ahead and do that, it's fine. I'm not going to in this case. But then what we do want is we want to install the package.

So I'm actually just gonna copy this. And I'm gonna come over to my terminal window here. All right, so I'm gonna pip install. I'm just gonna add an upgrade flag here. And yes, I will let that install. I've already installed it, so yeah. Once it has installed, we should be able to just run Canopy.
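In case it helps to see it written out, this is roughly what that install step looks like in the terminal. I'm assuming here that the package is the one published on PyPI as canopy-sdk, which is what I installed; treat this as a sketch rather than the definitive command.

    # install (or upgrade) the Canopy package
    pip install --upgrade canopy-sdk

    # running the CLI with no arguments prints the message mentioned below
    canopy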

And we'll get this error message to begin with. And that's because we haven't set a few environment variables, but we do know from this that it is installed. So to deal with this, we need to set some environment variables. So we have the Pinecone API key and the Pinecone environment. There's also the OpenAI API key that we should add in there as well.

So I'm gonna go ahead and do that. I'm gonna run vim, and I'm just gonna add all of these into an environment variables file. Okay, so I'm doing this on Mac. So I'm gonna do export PINECONE_API_KEY and put my API key in there. I'm gonna do export PINECONE_ENVIRONMENT.

And also put that in there. And then I'm gonna do export OPENAI_API_KEY and put that in there. So for the Pinecone API key and environment, we go to app.pinecone.io. We go to API keys, and I'm just gonna copy this, and I'm gonna take note of my environment as well.

So us-west1-gcp, and come back over here, and I'm just gonna put it into this file here. So you can try and steal my API keys if you like. And for the OpenAI API key, you want to go to platform.openai.com. We go to API keys at the top here. And I already created one, but I'm gonna create a new one.

So canopy-demo2, I create my secret key. And again, I'm just gonna go put it in here. Great, so those are all in, and now I can just go ahead and source that file. Now with that done, let's try and run Canopy again. And we should get something that looks like this.
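For reference, here's a minimal sketch of what that environment file could contain, assuming the standard Canopy variable names PINECONE_API_KEY, PINECONE_ENVIRONMENT and OPENAI_API_KEY. The file name mac_env is just a placeholder, and the values obviously aren't real keys.

    # saved to something like mac_env, then loaded with: source mac_env
    export PINECONE_API_KEY="<your-pinecone-api-key>"
    export PINECONE_ENVIRONMENT="us-west1-gcp"
    export OPENAI_API_KEY="<your-openai-api-key>"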

Now what we can do is create a new index. Now to create a new index, you'd run canopy new and then your index name. I'm gonna call mine canopy-101, but I already actually created canopy-101, so I'm just gonna call it 101a for now. Okay, so I confirm. Okay, and then from there, what we do want to do is actually add our data to this index.
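Written out, that step is just one command; the index name here is the one used in the video, and as mentioned, you're asked to confirm before the index is actually created.

    # create a new Canopy index (use whatever name you like)
    canopy new canopy-101a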

Now, let me jump across to a notebook and I'll show you how we can create data in the correct format for Canopy. Okay, so we're gonna work through this notebook very quickly. There'll be a link to this notebook at the top of the video right now. So we're gonna take this dataset that I scraped from arXiv.

It's just a load of AI arXiv papers. I've used either this version or the chunked version of this dataset a few times in recent videos. But if we just take a quick look at what is in there, we see that we basically have, okay, there's this 'Understanding HTML with Large Language Models' paper, a summary, and then we have the content.

The content is kind of the bit we care most about. Now, the content in there is fairly long and typically what we do to handle that is we have to chunk it up into smaller parts. So let me just take the length of that. Okay, so yes, quite a few characters there.

That wouldn't all fit into the context window of an LLM, or it may fit in, but the whole 400 arXiv papers definitely wouldn't. And when we are feeding knowledge into an LLM, we also want to be feeding that knowledge in as smaller chunks, so that we're not filling that context window and we don't run into LLM recall issues.

So to avoid that, yeah, we use chunking. And fortunately, that's built into Canopy, so we don't even need to care about it; it's gonna be done automatically. All we need to do is set up this data format here. So we have ID, text, and source, which tells us where the text is coming from.

You don't have to pass that; you can just leave it blank, it's fine. And then metadata, which is just a dictionary containing any relevant information that you may want attached to your vectors. Again, you don't need to put anything in here. Okay, so we run this.

This is just transforming our Hugging Face dataset into this format and removing the columns that we don't want. Then what I'm going to do is convert this into a JSON lines file. Okay, and then we should be able to take a look at that over here. And yeah, we can see all of this.
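Just to make the target format concrete, a single record in that JSON lines file ends up looking roughly like this: one JSON object per line with the id, text, source and metadata fields described above. The values here are made up for illustration.

    {"id": "2210.00001", "text": "Understanding HTML with Large Language Models ...", "source": "https://arxiv.org/abs/2210.00001", "metadata": {"title": "Understanding HTML with Large Language Models"}}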

Okay, so with that done, we can move on to actually putting all of this into our index using Canopy. Okay, once we have our dataset, we can go ahead and run canopy upsert, and it would be in here. So this is where I saved my data, in the same directory I'm in now.

And actually, you know, we can just see that quickly. Yeah, okay, so I'm going to upsert this. So canopy upsert, there we go. Now, when we try and do that, we're actually gonna get this error, and that's because we also need an index name environment variable. So we'll go ahead and set that as well.

You can also set the index name here within the command, but I'm going to do it via the environment variable. Okay, and I want the 101a index to start with. And do the upsert. It'll ask us to confirm that everything is correct. So just, you know, a quick check, it looks pretty good. Say yes, and we continue.
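As a sketch, that step looks something like the following in the terminal. I'm assuming INDEX_NAME is the environment variable Canopy reads for this, and data.jsonl is a placeholder for wherever you saved the JSON lines file.

    # point Canopy at the index created earlier
    export INDEX_NAME="canopy-101a"

    # upsert the JSON lines file; Canopy asks you to confirm before writing anything
    canopy upsert data.jsonl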

And then, yeah, we're gonna get this loading bar. It's gonna just show us the progress of our upsert. But I've already created my index, doing this exact same process. So I'm gonna actually cancel that. And what I'm going to do is change my index name to that other index.

And then I'm going to start Canopy. Okay, so I'm gonna do canopy start. And what this is going to do is start up the Canopy server, the API, okay? So from here, I could go to localhost:8000 and go to the docs, and I can see, if I zoom in a little bit, that it has some documentation.

We have all the endpoints and stuff in here that we can use. Now, I actually want to use the CLI. Now, the CLI requires that you have the Canopy server running in the background. So I'm gonna switch across to a new terminal window. I'm going to activate my ML environment.
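To recap that bit as commands: in the first terminal, the server is started like this (port 8000 is just what it reported here), and everything from this point on happens in the second terminal window.

    # start the Canopy server and leave it running in this terminal
    canopy start

    # the interactive API docs are then available at:
    #   http://localhost:8000/docs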

I'm going to run source on my Mac env file, and I'm going to export my index name. Then what I want to do is run canopy chat. So you can run canopy chat without any arguments, and that will, you know, it's like you're chatting with your LLM, and it's doing RAG in the background, and you're getting your responses.

But I also actually want to run it with the --no-rag flag. What --no-rag will do is show us a comparison of the LLM response with and without RAG. So this is incredibly useful for just evaluating what RAG is actually doing for you. So yeah, let's take a look at this, and we should see some pretty interesting results.
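In the second terminal, that whole sequence looks roughly like this. The mac_env file name and the index name are just the placeholders from earlier, and --no-rag is the flag that switches on the side-by-side comparison.

    # load the API keys again in this terminal, and point at the existing index
    source mac_env
    export INDEX_NAME="canopy-101"

    # chat with RAG only:
    canopy chat

    # chat showing both the RAG and the plain LLM response for comparison:
    canopy chat --no-rag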

Okay, cool. So we get a nice little note up there: this is a debugging tool, not to be used for production. That's cool, 'cause we're just testing it. So, 'hello there'. With that, I press escape and enter to send my query. Okay, so with this query we literally get the same response with context (RAG) and without, because, you know, it doesn't really matter whether we're using RAG or not for general chat.

But what if we have something like an actual query that is relevant to the dataset that we put behind this? So our dataset contains information about Llama 2, the large language model, because this is an arXiv dataset on, like, AI. So I can ask it something like that. I can ask it, "Can you tell me about Llama 2?" So obviously, with context, Llama 2 is a collection of pre-trained and fine-tuned large language models, ranging in scale from 7 billion to 70 billion parameters, so on and so on, right?

That's cool. Then, no RAG: I apologize, but I'm not aware of any specific entity called Llama 2. Okay, so this LLM just doesn't know anything about Llama 2, because its training data cutoff was, like, September 2021. So, yeah, it cannot know about that. So, I don't know, let's continue the conversation.

Like, okay, fascinating. Can you tell me more about when I might want to use Llama? Okay, let's see what we get. Okay, cool. So, with context (RAG), we have Llama 2, specifically the fine-tuned LLMs optimized for dialogue use cases, found to outperform open-source chat models on most benchmarks that were tested, so on and so on.

Okay, so it also gives us a source document, which is pretty nice. Now, without context, okay, llamas can serve various purposes and be useful in different situations. You can use them as pack animals, therapy animals, guard animals, apparently, I didn't know that. And, okay, maybe, and in sustainable agriculture. So, obviously, one of those answers is a little bit better than the other, at least for our use case.

Now, let's ask it a slightly more complicated question. So, can you tell me about Llama 2 versus DistilBERT? Now, this is the sort of question where a typical RAG pipeline, if not built well, will probably struggle, because there are actually kind of two search queries in here. We want to be searching for Llama 2, and we also want to be searching for DistilBERT, which appear in different papers.

But, typically, the way that RAG would be implemented, at least, you know, in your first versions and whatever else, that's probably gonna get passed to your vector database as a single query. The good thing about Canopy is that it will handle this: it'll actually split this up into multiple queries, so we're doing multiple searches, getting results from the DistilBERT paper and the Llama 2 paper, and then it's gonna provide us, hopefully, with a good comparison between the two.

All right, so Llama 2 is a collection of pre-trained and fine-tuned large language models, so on and so on. Cool. DistilBERT is a smaller, faster, and lighter version of the BERT language model. In summary, Llama 2 is specifically optimized for dialogue use cases, whilst DistilBERT is a more efficient version of the BERT model that can be used for various natural language processing tasks.

Okay, I think, you know, it's a good comparison. Without context, obviously, it doesn't know what Llama 2 is. So, yeah, it's like, okay, it's not a known entity or term in the realm of NLP or AI. However, DistilBERT refers to a specific model architecture used for various NLP tasks.

So, actually, it can tell us a little bit about DistilBERT, because this is an older model, so it does know about that, but it can't give us a good comparison. So, that's a very quick introduction to the Canopy framework. I think, from this, you can very clearly see what the pros of using something like this are.

Of course, this is just the CLI. There's also the Canopy server and the actual framework itself, which you can obviously go ahead and try out. But for now, that's it for this video. Hope all this has been useful. So, thank you very much for watching, and I will see you again in the next one.

Bye.