FIRST Look at Pinecone Serverless!
00:00:18.040 |
for the vast majority of use cases on Pinecone. 00:00:21.060 |
So just to point out the cost savings of this, 00:00:25.440 |
I want to take a look at the pricing calculator. 00:00:35.560 |
and I come down to, they explain everything here. 00:00:45.880 |
so there is no longer any such thing as pods, 00:00:49.400 |
but instead, you're paying based on the amount 00:00:53.480 |
that you're storing and the amount that you're querying. 00:00:55.880 |
So you have a separation between storage and queries, 00:01:03.900 |
because that's now on storage-optimized hardware 00:01:23.480 |
like 5 million records is a lot for most RAG-style use cases. 00:01:28.480 |
Honestly, I think you're probably gonna be using less. 00:01:32.160 |
But anyway, let's just leave 5 million for now. 00:01:43.680 |
but I think most of the things that I have built 00:02:05.320 |
It depends on how you're structuring everything. 00:02:08.320 |
And then namespaces, again, that's gonna depend. 00:02:15.160 |
you will probably have quite a lot of namespaces, 00:02:32.520 |
you're probably gonna be looking at like 500,000 maybe, 00:02:37.240 |
maybe a million, you know, it varies a lot, right? 00:02:49.240 |
you just have to pay for a pod, like P1 or S1, 00:02:52.320 |
and that's gonna cost $70, just every month, right? 00:03:05.440 |
If, you know, if you're doing less queries per month, 00:03:12.520 |
Now, if we decrease the number of namespaces, 00:03:16.480 |
you just have one namespace, it goes up a little bit. 00:03:19.480 |
But, you know, it's still $10 compared to the $70 00:03:23.560 |
that we would have had before, which is pretty good. 00:03:31.280 |
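The comparison above is simple arithmetic. As a quick sketch, using only the two monthly figures mentioned here (the $70 pod and the roughly $10 serverless estimate from the calculator, both of which depend on the inputs you enter):

```python
# Monthly figures quoted in the walkthrough; the serverless number is an
# estimate from the pricing calculator and will vary with your own inputs.
pod_monthly = 70.00
serverless_monthly = 10.00

savings = pod_monthly - serverless_monthly
savings_pct = 100 * savings / pod_monthly

print(f"Saves ${savings:.2f}/month ({savings_pct:.0f}% less)")
# prints: Saves $60.00/month (86% less)
```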
at how we'd actually use new Pinecone serverless 00:03:36.760 |
So I'm gonna come over to the examples of Pinecone, 00:03:41.000 |
and I'm just gonna do, we can do semantic search for now. 00:03:45.120 |
Okay, so semantic search, I will open this in Colab, 00:03:58.520 |
So you should see 3.0.0 for the Pinecone client, 00:04:17.400 |
Now, the reason that we're using this dataset, 00:04:20.200 |
Pinecone datasets, is because we already have 00:04:37.720 |
like 80,000 records there, and yeah, it's super quick. 00:04:45.800 |
We're gonna decide whether we want to use serverless 00:04:54.360 |
Okay, so with the new Python client, it supports both. 00:04:57.480 |
If you wanna use pods, you'd set that to false. 00:05:03.440 |
And then we have our API keys, environment variables. 00:05:07.640 |
So for serverless, we don't need the environment anymore. 00:05:25.560 |
So I'm gonna go over to my Pinecone project here. 00:05:35.640 |
So this does need to be a serverless project. 00:05:39.160 |
Right now, with serverless, there is not a free tier, 00:05:44.160 |
as we have with the pod-based architecture in Pinecone. 00:05:54.120 |
that you can claim and just use serverless for that. 00:06:07.640 |
So that is coming, it's just not quite there yet. 00:06:24.260 |
So this is a new object we have that just defines your, 00:06:32.920 |
I am using serverless, of course, so I'm using this one. 00:06:40.200 |
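A minimal sketch of the serverless-versus-pods switch described here, assuming the v3 Python client (`pinecone-client`), which provides `Pinecone`, `ServerlessSpec` and `PodSpec`. The `cloud` and `region` values, the environment variable names and the fallback environment are assumptions for illustration and are not taken from the video.

```python
import os

use_serverless = True  # set to False to target a pod-based index


def spec_settings(use_serverless: bool) -> dict:
    """Return the keyword arguments for the chosen spec type."""
    if use_serverless:
        # Assumed cloud/region; serverless no longer needs an environment.
        return {"cloud": "aws", "region": "us-west-2"}
    # Pod-based indexes still need an environment.
    # The variable name and the default value are hypothetical.
    return {"environment": os.environ.get("PINECONE_ENVIRONMENT", "gcp-starter")}


def build_spec(use_serverless: bool):
    """Build the spec object (needs the v3 client installed)."""
    from pinecone import PodSpec, ServerlessSpec  # imported lazily

    settings = spec_settings(use_serverless)
    return ServerlessSpec(**settings) if use_serverless else PodSpec(**settings)


def make_client():
    """Create the client; PINECONE_API_KEY is an assumed variable name."""
    from pinecone import Pinecone  # imported lazily

    return Pinecone(api_key=os.environ["PINECONE_API_KEY"])
```

With the client installed, `spec = build_spec(use_serverless)` and `pc = make_client()` would give the two objects used in the rest of the walkthrough.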
that is currently supported, as far as I'm aware. 00:07:00.020 |
we need to go through, because when we list indexes, 00:07:05.500 |
So we just need to do this to return the indexes, 00:07:13.400 |
If you do have indexes, you can also use this, I believe. 00:07:31.040 |
The spec here is the serverless spec that you saw before. 00:07:48.740 |
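One way to sketch the "check the existing indexes, then create" step, assuming the v3 client, where `pc.list_indexes().names()` returns the index names and `pc.create_index(...)` takes a `spec`. The helper name, the index name and the dimension in the usage note are hypothetical.

```python
def ensure_index(pc, name: str, dimension: int, spec, metric: str = "cosine") -> bool:
    """Create the index only if it does not already exist.

    `pc` is a pinecone.Pinecone client (v3). Returns True if an index
    was created, False if one with that name was already there.
    """
    if name in pc.list_indexes().names():
        return False
    pc.create_index(name=name, dimension=dimension, metric=metric, spec=spec)
    return True
```

A call might look like `ensure_index(pc, "semantic-search", 384, spec)`, where the dimension has to match whatever embedding model produced your vectors.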
So this now shows as having some vectors in there. 00:07:53.740 |
You should see zero if this is your first time 00:07:59.100 |
So I'm going to just upsert all of my vectors. 00:08:04.500 |
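A rough sketch of upserting in batches, assuming `index` is the object returned by `pc.Index(name)` in the v3 client and that each vector is an `(id, values, metadata)` tuple. The helper names and the batch size of 100 are illustrative choices, not necessarily what the notebook uses.

```python
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def upsert_in_batches(index, vectors, batch_size=100):
    """Send vectors to the index one batch at a time; return the count sent."""
    total = 0
    for batch in batched(vectors, batch_size):
        index.upsert(vectors=batch)
        total += len(batch)
    return total
```

Batching keeps each request small, which matters when you are pushing tens of thousands of records as in this example.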
Now, while that is running, 00:08:07.860 |
let's have a look at how much money we'd be saving on this 00:08:15.160 |
So we have, what is it, like 80,000 vectors, I think. 00:08:26.300 |
Let's say I'm gonna get, be optimistic and say, 00:08:34.740 |
And let's say I'm gonna write another 20,000 vectors a month. 00:08:54.960 |
So that's gonna cost me a grand total of $3.69 a month, 00:09:12.680 |
Now, I'll fast forward to when our upload is complete. 00:09:17.440 |
Okay, so that has finished and we can go ahead 00:09:25.560 |
as what we've had before with the pod-based approach. 00:09:29.200 |
So we can say, which city has the highest population 00:09:34.240 |
So we're gonna just see the results that we get. 00:09:38.760 |
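The query step can be sketched like this, assuming the v3 client's `index.query(vector=..., top_k=..., include_metadata=True)` call. The `encode` argument is a hypothetical stand-in for whatever embedding function you use, and it must return a plain list of floats.

```python
def search(index, encode, text, top_k=5):
    """Embed the query text and return (score, metadata) pairs for the matches."""
    xq = encode(text)
    res = index.query(vector=xq, top_k=top_k, include_metadata=True)
    return [(match["score"], match["metadata"]) for match in res["matches"]]
```

Because the query is embedded rather than keyword-matched, rephrasing the question with different words should still surface similar results.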
Okay, it says, I think I formatted it a little nicer here. 00:09:54.200 |
I can say, which metropolis has the highest number of people 00:09:59.760 |
And yeah, again, we get what is the biggest city 00:10:04.720 |
So yeah, semantic search, everything checks out there. 00:10:13.520 |
we just wanna save resources and just delete that index. 00:10:21.000 |
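The cleanup step, as a small sketch assuming the v3 client's `pc.delete_index(name)` and `pc.list_indexes().names()`. The helper name is hypothetical.

```python
def delete_index_if_present(pc, name: str) -> bool:
    """Delete the named index if it exists. Returns True if it was deleted."""
    if name in pc.list_indexes().names():
        pc.delete_index(name)
        return True
    return False
```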
So that's a very fast introduction to Pinecone serverless. 00:10:40.720 |
I hope all of this has been interesting and useful.