Steerable AI with Pinecone + Semantic Router

Today we're going to be taking a look at a new integration in the Semantic Router library, which is with Pinecone. Now naturally I'm very involved with both Semantic Router and Pinecone, so it seems logical that those two will have come together at some point and now they have. The purpose behind pulling these two technologies together is primarily that of scalability.

So with Pinecone, you obviously have huge potential scale in what you can do and that translates over to Semantic Router as well. The number of utterances and routes that you can sort in Semantic Router at the moment is still pretty high because we're sorting everything locally, but with Pinecone you can just go incredibly high scale, which is exciting for many reasons, but for me mostly to see what sort of use cases people come up with.

Now for sure I think you can easily create tons of routes and tons of utterances and get relatively high scale from that, but I'm sure there are many other use cases out there as well that I have just not even thought about yet. So that is scale. Now the other one is kind of like ease of use and persistence.

So with Pinecone, everything is within your Pinecone index, so you can then begin loading up your route layers from Pinecone, which makes moving your route layers across different places and just even from one session into another session much easier than when you're doing it locally with the local index.

So it's also very exciting. Now let's jump straight into it. So I'm going to go to the Semantic Router library. We have the docs here, we have, well, there's a few places we can learn how to do this. So I'm going to first, if you want to just like a very basic example, you can come into here.

So the indexes Pinecone example here, or the one we're going to walk through is this Pinecone and scaling example, which just has a little bit more in it and a few more routes. It's not, it isn't very high scale. I need to add more to this, but it should be able to scale pretty high.

So I'm going to open the Colab notebook and here we are. So as I mentioned in this small intro here, you could literally scale this to thousands or even millions of routes if you wanted to. And I think for a lot of use cases that probably won't be necessary, but what I have noticed is that Semantic Router can be used for a lot more than what I originally thought it could be used for.

I've seen it be used for something that we've been building is a semantic splitter for more intelligent chunking of documents and conversations. And once you start chunking conversations, you can do kind of interesting things. I've also, and then this is something I'll talk about very soon, seeing that we can also apply this splitting technique to video.

So you can chunk video frames based on what is within the video, which is pretty interesting and exciting. And another thing that we're going to cover very soon is basically content moderation for images. So you can kind of, you can imagine the route that I might be going down there, but we're going to talk about that pretty soon.

And in those use cases, it might make sense sometimes to use larger scale, and I'm sure there are many other use cases out there that I just haven't come across yet. So we're going to start here. We're going to install these. So we're using the HuggingFace datasets library, and we're also installing semantic router, the local extras, and also the Pinecone extra.

Okay. And we're going to come down to here. So I'm going to download this dataset from HuggingFace datasets, and it's just a dataset containing some routes that we're going to be using. So I think there's something like a hundred and no, 150 utterances maybe. Okay. So there's 50 routes that we have here and roughly, yes, three in each route.

So about 150 just over. So you can see here, we have one route, it's the ones that we've been seeing before. And then basically generate a few more with GPT. So there's a few routes in there. Now to generate or to take this dataset and convert those, like basically this into a route, all we need is this, it's pretty, yeah, it's pretty straightforward.

So we now have like 50 routes, they look kind of like this, and we can go ahead and, well, we need to initialize a route layer and to initialize a route layer, what we typically need is a encoder and our routes. So we can get our encoder here. We're going to use the local hugging face encoder.

And with that, we have both our encoder and routes, but we're also using a Pinecone index. So we also need to initialize that. And we will also initialize our route layer with the Pinecone index. So it's something slightly different here. And this is a new feature within the library as well.

You can also, if you want to, initialize it with the local index, but by default, it will initialize with local index. So yeah. So we do need to get a Pinecone API key. This does need to use Pinecone serverless. So I'm going to go and get that. So you need to go to app.pinecone.io.

Okay. And we're going to go to API keys and just copy that. And I'll enter in here. Cool. So you will get this warning saying the index could not be initialized. That's fine. It's because we were initializing the index without any routes being attached. It will be initialized correctly when we run this.

Okay. So because we're using Pinecone, it will take a moment for the index to be created. And then the embeddings will need to be created and sent across the Pinecone, which actually did not take long at all. Let's very quick, let me double check if we come into here and just refresh.

Cool. So we can see down here, we have our semantic router index. We can go in and it's just a 384 dimensional vectors because we're using a mini LM model with the hugging face encoder. And we only have 154 vectors here. Very small amount, but good enough for just showing you how to use it.

So with that in place, whenever we call our route layer, it's actually going to be going via Pinecone rather than the local index now. So we can run this. Okay. And we can see, okay, that triggers the chitchat route. Now, one of the benefits to Pinecone is that we have this persistent index, okay?

We have all that routes that are persisted within Pinecone. And well, what we can do is just load everything up from Pinecone, wherever we are from a new environment, and it will just work, which is pretty nice. So I'm going to go ahead and do that. I'm going to go ahead and delete the route layer, the index and the routes that we created.

So that we can't cheat and use those. And then I'm going to initialize a new Pinecone index. Now you can see I'm also passing in the index name here. By default, the index name is this, right? But I just want to show you how you can initialize it with custom index name if you prefer.

So you could put anything you want here. So I could call this like the Pinecone demo, for example. I think it would have to be like this. But I've already created it, and it's called index, so I'm going to use that again. So once I have done that and connected to my index, I can get the previous routes that are within the index already.

So I just run this, and that will get all of them. So the format that it provides them back to us in is actually more of -- it's on more of an utterance level. So we need to convert -- you can see there's two utterances here for the single chit chat route.

So we need to convert this into a format that we can use to reinitialize our route layer. And for that, we need routes. So we're going to go ahead, we're going to create a routes dictionary, loop through this and create, well, a ton of dictionary versions of these routes that we can then use to initialize a list of routes.

So I'm going to do that, and take a look at the routes dictionaries. So you can see, if we go up to the top of it, you see here there's this route, cybersecurity, and then we have the utterances. We have chit chat here, right, and so on and so on.

Cool. Jokes. We're going to use that in a moment. So we're going to come to here and transform these into a list of route objects. Okay, so I'm just iterating through the routes dictionary and pulling out the route name and the utterances and mapping them to route objects. Okay, so now we get, you know, the same again.

Now what we want to do is initialize our route layer. So again, we're just, at this point, it's basically the same as what we did before. I'm showing that it actually works. So we have our route layer, we have an encoder, the new routes that we loaded from Pinecone, and obviously our Pinecone index, and now we can test it again.

So we'll, well, you can see it already works here. I'll rerun them anyway. So yeah, you get these. It correctly identifies this as a joke, this one as a joke, and this one as chit chat. Okay. So yeah, I mean, that is it. It's very, yeah, very simple integration.

We tried to make it as simple as possible, to be honest. But as I said, it unlocks a lot of scale use cases and the just ability to persist your route layers, you know, wherever you are, which is a nice little thing to have. So yeah, that's it for this video.

I hope this has been useful and interesting. So thank you very much for watching and I will see you again in the next one. Bye. you you

Steerable AI with Pinecone + Semantic Router

Chapters

Transcript