Pinecone Assistant
Chapters
0:00 AI Assistants
0:41 Pinecone Assistants in Python
1:19 Building an AI Research Assistant
2:11 Assistant Message and Chat
3:05 Adding Files to the Assistant
5:30 Chatting with our Assistant
7:23 Assistant Chat History
10:47 Asking about Mamba 2
12:11 Wrapping up with Assistants
Now, Pinecone Assistant allows us to build AI assistants 00:00:21.800 |
that can also answer questions about knowledge 00:00:25.520 |
specific to our own use cases or our own organizations, 00:00:29.960 |
by simply providing them with the source of that knowledge 00:00:37.800 |
Now, in this video, we're going to take a look at Pinecone Assistants and how we can use them in Python. 00:00:41.240 |
Now, we're gonna be working through this notebook here. 00:00:43.640 |
There will be a link to this in the comments below. 00:00:59.520 |
I'm also going to be using Pinecone Notebooks, 00:01:04.840 |
which lets me get my Pinecone API key from within the notebook, 00:01:08.360 |
So I've already run it, so it's not gonna do it again. 00:01:11.380 |
But basically, the API key is stored in the PINECONE_API_KEY environment variable 00:01:15.640 |
So now I can just initialize my client as per usual. 00:01:42.280 |
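As a rough sketch (assuming the Pinecone Python SDK with the assistant plugin is installed), initializing the client looks something like this:

```python
import os
from pinecone import Pinecone

# Assumes PINECONE_API_KEY was already set, e.g. by the Pinecone Notebooks auth step
api_key = os.environ.get("PINECONE_API_KEY")
pc = Pinecone(api_key=api_key)
```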
It's optional, so I can actually just remove it. 00:01:56.200 |
Cool, and we can see that it has been created. 00:02:03.520 |
So I can pass in the name of my assistant to describe_assistant. 00:02:18.760 |
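A minimal sketch of creating and then describing the assistant might look like this; the assistant name is just a placeholder, and the optional keyword arguments (like metadata) are the ones you can simply leave out:

```python
# Create the assistant (optional kwargs such as metadata can be omitted)
pc.assistant.create_assistant(assistant_name="ai-research-assistant")  # placeholder name

# Check its details/status by name
print(pc.assistant.describe_assistant(assistant_name="ai-research-assistant"))

# Grab a handle we can use later for uploads and chat
assistant = pc.assistant.Assistant(assistant_name="ai-research-assistant")
```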
because we need to provide it with some knowledge 00:02:22.760 |
But I do want to just go over what we are doing here. 00:02:26.040 |
Okay, so we also have this new message object. 00:02:33.320 |
and allows us to specify whether it is us talking, 00:02:36.240 |
i.e. the user, or whether it is the assistant talking. 00:02:46.160 |
And I'm going to hit the chat completions method here 00:03:02.120 |
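Roughly, the message object and the chat completions call look like this (a sketch; the Message import path and the OpenAI-style response shape reflect the assistant plugin at the time and may differ in newer SDK versions):

```python
from pinecone_plugins.assistant.models.chat import Message

# role is either "user" (us talking) or "assistant" (the assistant talking)
msg = Message(role="user", content="Can you tell me about this model?")

# OpenAI-style chat completions call against our assistant
resp = assistant.chat_completions(messages=[msg])
print(resp.choices[0].message.content)  # assumed OpenAI-style response shape
```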
So we need to add some files to our assistant 00:03:19.640 |
And basically within this repo, there are just loads of PDF files. 00:03:37.600 |
Cool, so we have, I don't remember how many we have here. 00:03:48.160 |
So I'm going to upload all those to our assistant. 00:03:56.880 |
and then it's going to send it over to Pinecone 00:04:10.200 |
and then get a response back from Pinecone. 00:04:19.720 |
Or what I am going to do is use minus one, 00:04:28.240 |
because I just want to like send as many PDFs as I can. 00:04:52.160 |
Then of course, Pinecone has started processing the document 00:04:54.840 |
and then it returns the status of that document immediately 00:05:01.160 |
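As a sketch, the upload step could look something like this; the papers directory is a placeholder for wherever the repo's PDFs were downloaded to:

```python
from pathlib import Path

# Hypothetical local folder holding the PDF papers pulled from the repo
pdf_paths = sorted(Path("papers").glob("*.pdf"))

uploaded = []
for path in pdf_paths:
    # Sends the file to Pinecone; the returned object includes the file ID and status
    file_info = assistant.upload_file(file_path=str(path))
    uploaded.append(file_info)
    print(file_info.id, file_info.status)
```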
we're going to see that all of these are processing. 00:05:11.120 |
So we just call describe file and we pass the file ID. 00:05:15.000 |
And now we can see that at least this first document 00:05:19.040 |
And I'm going to run this little for loop here 00:05:40.200 |
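That little wait loop could be sketched like this, reusing the uploaded list from the sketch above (the "Available" status string is an assumption based on the notebook output):

```python
import time

# Re-check every uploaded file until nothing is still processing
while True:
    statuses = [assistant.describe_file(file_id=f.id).status for f in uploaded]
    if all(s == "Available" for s in statuses):  # assumed status value
        break
    time.sleep(5)
```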
because the assistant will reply in Markdown. 00:06:00.560 |
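Since the reply comes back as Markdown, it can be rendered in the notebook, for example:

```python
from IPython.display import Markdown, display

# Render the assistant's Markdown reply nicely inside the notebook
display(Markdown(resp.choices[0].message.content))  # assumed response shape as above
```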
about the Mixtral model and see what it says. 00:06:12.720 |
And well, I mean, kind of the whole point of, 00:06:15.920 |
or one of the main points of Pinecone Assistant 00:06:15.920 |
is that everything is grounded in like actual knowledge. 00:06:36.840 |
And it gives us all this information about it, 00:06:44.160 |
So we can see here that we have reference one. 00:06:54.920 |
like we've used in order to get that information. 00:06:59.640 |
Okay, so to basically construct this paragraph here, 00:07:17.120 |
and it brings us through to just seeing the PDF here, 00:07:22.640 |
And then obviously we can refer to our citations 00:07:28.280 |
in what the assistant is telling us, which is nice. 00:07:34.920 |
But now I want to actually chat with the assistant. 00:07:39.200 |
that will allow us to do that a bit more easily. 00:07:41.560 |
So first thing we need is a list of our chat history. 00:07:49.080 |
with the first message that I sent asking about Mixtral 00:07:52.440 |
and the response from our assistant, which is here, okay? 00:08:00.080 |
at the output there so that you can see what I'm doing 00:08:10.240 |
and then we also have the role, which is assistant. 00:08:15.080 |
and creating a message object using those two values. 00:08:19.120 |
And then I'm going to create this chat function, 00:08:21.440 |
which is just gonna consume a message right from me. 00:08:27.320 |
It's going to format my input into a message object. 00:08:32.320 |
We are going to get the response from our assistant. 00:08:42.440 |
And then I'm going to add both my initial message, 00:08:50.120 |
and the message or response from the assistant 00:08:54.600 |
So we're going to be adding to the chat history over time. 00:08:57.280 |
And then I'm gonna return the markdown formatted response 00:09:01.800 |
so that we can actually see what it is saying. 00:09:06.280 |
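Putting that together, the chat helper might look roughly like this (a sketch; chat_history and chat are just the names used in the walkthrough, and the response shape is again assumed to be OpenAI-style):

```python
from pinecone_plugins.assistant.models.chat import Message
from IPython.display import Markdown

chat_history = []  # in the video this starts with the earlier question/answer pair

def chat(user_input: str):
    # Format my input into a Message object and add it to the history
    chat_history.append(Message(role="user", content=user_input))
    # Get the response from the assistant, given the full history
    resp = assistant.chat_completions(messages=chat_history)
    answer = resp.choices[0].message.content  # assumed OpenAI-style shape
    # Add the assistant's response to the history as well
    chat_history.append(Message(role="assistant", content=answer))
    # Return the Markdown-formatted response so we can see what it is saying
    return Markdown(answer)
```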
So the first one is I'm going to ask a little bit more 00:09:09.480 |
about what is a sparse mixture of experts model. 00:09:20.920 |
So we have this sparse mixture of experts model, 00:09:29.160 |
And we can actually see that the reference here is different. 00:09:33.600 |
It's not actually coming from the same paper. 00:09:35.240 |
It's coming from another paper that we have in there. 00:09:39.400 |
And we see, okay, this paper is literally talking about 00:09:43.840 |
or to some degree about SMOE, which is pretty cool. 00:09:53.800 |
where it's talking about basically the drawbacks of SMOE. 00:09:59.880 |
we'll see that this is being pulled in as well, 00:10:05.120 |
it's showing us all like the most important information 00:10:07.920 |
or in my opinion, some of the most interesting information. 00:10:11.720 |
So, okay, that's cool, but I have no idea what this means. 00:10:25.440 |
Okay, so we're pulling from the same paper again. 00:10:28.040 |
And it said, okay, it's detrimental for several reasons: 00:10:28.040 |
suboptimal performance, inefficiency in learning 00:10:43.600 |
We learned about Mixtral and SMOE a little bit. 00:10:43.600 |
I, you know, let's say I don't have a clue what Mamba 2 is. 00:10:52.520 |
I just want like a really nice little overview 00:11:06.200 |
Cool, so Mamba 2 is a type of deep learning model 00:11:06.200 |
It builds on top of the original Mamba model. 00:11:18.720 |
And it helps to process sequences more efficiently 00:11:27.640 |
And we can see, okay, we've got reference one here, 00:11:30.040 |
but in this output, we actually have two references, 00:11:33.680 |
or at least two different documents that it's pulling from. 00:11:37.080 |
And yeah, we can go ahead and have a look at both of those. 00:11:52.720 |
And let's have a look at what the other one is. 00:11:57.320 |
Okay, so it's pulling information from both of those 00:12:07.440 |
for just keeping relatively up to date with what is going on. 00:12:14.200 |
continue talking to your assistant for as long as you like, 00:12:29.880 |
And you can see down here, I have this storage. 00:12:35.120 |
So I'm going to just go ahead and delete my assistant, 00:12:46.160 |
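Cleaning up is just one call, something like this (same placeholder name as earlier):

```python
# Delete the assistant (and its uploaded files) once we're done
pc.assistant.delete_assistant(assistant_name="ai-research-assistant")
```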
And yeah, with that, we are done with this walkthrough. 00:13:00.400 |
that is able to ground its answers in knowledge 00:13:07.800 |
for providing more trustworthy outputs from our assistant. 00:13:16.080 |
I hope all of this has been useful and interesting,