NEW RAG Framework: Canopy
Chapters
0:00 Canopy RAG Framework
1:31 Canopy Setup
4:23 Data Input to Canopy
7:04 Upserting with Canopy
8:24 Chatting with Canopy CLI
10:21 RAG vs. LLM only
12:24 Handling Complex Queries in RAG
Today, we're going to be exploring the new Canopy framework.
without needing to get into all of the details
And it also comes with a lot of nice little features.
So let's jump into it and see how we can use it.
we can see the different components of Canopy.
I'm really gonna be focusing on the Canopy CLI down here,
So we're mostly gonna be using everything through CLI.
we have, okay, you can create a virtual environment,
But then what we do want is we want to install the package.
And I'm gonna come over to my terminal window here.
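At the time of recording, Canopy is published on PyPI as `canopy-sdk` (the CLI it installs is invoked as just `canopy`), so the install step looks roughly like this:

```shell
# Install the Canopy SDK; the PyPI package name is canopy-sdk,
# but the CLI command it provides is "canopy"
pip install canopy-sdk

# Running the CLI before configuring anything is what produces the
# error message about missing environment variables
canopy
```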
And we'll get this error message to begin with.
And that's because we haven't set a few environment variables
but we do know from this that it is installed.
We go to API keys and I'm just gonna copy this
and I'm gonna take note of my environment as well.
So you can try and steal my API keys if you like.
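The environment variables being set here were, in the Canopy version around this recording, roughly the following; the names may differ in later versions, and every value below is a placeholder:

```shell
# Placeholders only -- substitute your own keys; PINECONE_ENVIRONMENT
# was needed for pod-based Pinecone projects at the time
export PINECONE_API_KEY="<your-pinecone-api-key>"
export PINECONE_ENVIRONMENT="<your-pinecone-environment>"
export OPENAI_API_KEY="<your-openai-api-key>"
export INDEX_NAME="<your-index-name>"
```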
And I already created one, but I'm gonna create a new one.
Now with that done, let's try and run Canopy again.
And we should get something that looks like this.
Now to create a new index, you'd run canopy new
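The index-creation step is a single command, as a sketch:

```shell
# Create a new Pinecone index set up for Canopy; the CLI prepends
# "canopy--" to the index name you configured
canopy new
```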
Okay, and then from there, what we do want to do
Okay, so we're gonna work through this notebook
I've used either this version or the chunked version,
But if we just take a quick look at what is in there,
there's this understanding HTML with large language models,
The content is kind of the bit we care most about.
is we have to chunk it up into smaller parts.
That wouldn't all fit into the context window of an LLM,
but the whole 400 arXiv papers definitely wouldn't.
And when we are feeding knowledge into an LLM,
so that we're not filling that context window
And fortunately, that's kind of built in to Canopy.
All we need to do is set up this data format here.
And then metadata, which is just a dictionary
Again, you don't need to put anything in here.
This is just transforming our Hugging Face dataset
Okay, and removing the columns that we don't want.
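As a sketch of the record shape Canopy's upsert expects, an id, the text, a source, and an optional metadata dictionary, here is a hand-written one-record JSONL file; all of the values are made up for illustration:

```shell
# Hypothetical single-record dataset in the id/text/source/metadata shape
cat > data.jsonl <<'EOF'
{"id": "doc-0", "text": "Llama 2 is a collection of pretrained and fine-tuned LLMs.", "source": "example-paper", "metadata": {"title": "Example"}}
EOF

# Sanity-check the file
head -n 1 data.jsonl
```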
we can move on to actually putting all of this
And actually, you know, we can just see that quickly.
You can also set index name here within the command,
It'll ask us to confirm that everything was correct.
So just, you know, quick check, it looks pretty good.
And then, yeah, we're gonna get this loading bar.
It's gonna just show us the progress of our upsert.
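The upsert step itself is one CLI call; `data.jsonl` below is a placeholder for whatever file holds your prepared records:

```shell
# Chunks, embeds, and upserts the records into the configured index;
# asks for confirmation first, then shows a progress bar
canopy upsert data.jsonl
```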
And what I'm going to do is change my index name
I could go to my localhost 8000 and go to the docs,
Now, the CLI requires that you have the Canopy server
So I'm gonna switch across to a new terminal window.
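Starting the server is one command; by default it listens on port 8000, which is why the interactive API docs are at localhost:8000/docs:

```shell
# Terminal 1: run the Canopy REST server
# (defaults to http://localhost:8000; interactive docs at /docs)
canopy start
```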
And so you can run Canopy chat without any arguments,
But I also actually want to do it with no-RAG.
So this is incredibly useful for just evaluating
and yeah, we should see some pretty interesting results.
This is a debugging tool, not to be used for production.
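The two chat invocations used here run in a second terminal while the server is up; to the best of my recollection, the `--no-rag` flag also shows the model's answer without retrieval, for comparison:

```shell
# Terminal 2: interactive chat through the Canopy server (RAG responses)
canopy chat

# Also show what the LLM says *without* retrieval, side by side
canopy chat --no-rag
```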
whether we're using RAG or not for general chat.
But what if we have something like an actual query
that is relevant to the dataset that we put behind this?
because this is an arXiv dataset on, like, AI.
I can ask it, "Can you tell me about LLAMA2?"
ranging in scale from 7 to 70 billion parameters,
but I'm not aware of any specific entity called LLAMA2.
because its training data cutoff was, like, September 2021.
So, I don't know, let's continue the conversation.
Can you tell me more about when I might want to use LLAMA?
on most benchmarks that were tested, so on and so on.
Can use them as pack animals, therapy animals,
guard animals, apparently, I didn't know that.
And, okay, maybe, and in sustainable agriculture.
Now, let's ask it a slightly more complicated question.
So, can you tell me about LLAMA2 versus Distilbert?
where a typical RAG pipeline, if not built well,
because there's actually kind of two search queries in here.
and we also want to be searching for Distilbert,
But, typically, the way that RAG would be implemented,
at least, you know, your first versions and whatever else,
that's probably gonna get passed to your vector database
The good thing about Canopy is that it will handle this,
and it'll actually split this up into multiple queries,
All right, so LLAMA2 is a collection of pre-trained
and fine-tuned large language models, so on and so on.
Distilbert is a smaller, faster, and lighter version
In summary, LLAMA2 is a specialized collection of language models,
whilst Distilbert is a more efficient version
for various natural language processing tasks.
Okay, I think, you know, so it's a good comparison.
So, yeah, it's like, okay, it's not a known entity or term
However, Distilbert refers to a specific model architecture
So, actually, it can tell us a little bit about Distilbert,
what the pros of using something like this are.
which you can obviously go ahead and try out.