What's next for Semantic Router (v1 update)

Today I wanted to give you an update, mainly on the progress of Semantic Router, or rather the V1 major release of Semantic Router. Now I know I've been a little quieter on YouTube recently. A lot of that is due in part to Semantic Router and this new release, and in part to a lot of other projects that are coming, but I'm not quite ready to start talking about those just yet.

But the V1 of Semantic Router plus these other projects has made me pretty busy and just unable to do all that much in terms of videos. But when I do get to release V1, that will be huge and I'll explain why in a minute. And also when I get to actually finally start talking about these other projects, they are going to be pretty exciting as well.

And altogether they're probably going to change a lot. In fact, they've already changed a lot for me in terms of how I'm building AI stuff, like really completely changed everything. And so yeah, with that, I am pretty excited to actually start talking about that very soon. Anyhow, I want to talk about the V1 release of Semantic Router.

Since the inception of Semantic Router, a lot of what has been built into the library has been driven by requirements, whether that's my requirements, my team's requirements, or the community's requirements for particular features that we've needed to add. That can be things like encoders, indexes, routers, but also a lot of other smaller things like the dynamic routes, for example, where we trigger a route and then generate essentially like a tool use with an LLM.

So a lot of that has been very organic. And that has been quite useful in building Semantic Router out into a library that does things that you need it to do and not adding a load of fluff, which I think is important. Even though I do think we have a little more fluff than I would like, and part of the V1 is actually stripping that away.

But anyhow, there has been a lot of organic growth. And because of that, I find that there's a lot of things now that have needed to be refactored and cleaned up so that Semantic Router continues to be this, you know, very concise code base that is very sensible, very easy to use.

And I want us to really get to a point where if you want to go and add an index, a new index to Semantic Router, you only have to modify a few methods, you know, maybe just two call methods, your synchronous call method, your async call method. You modify both of those, you probably set up the connection to whatever index that is.

And then you just add a line within the testing scripts to say, okay, there's a new index, point to it. Of course, you also add your new dependency groups within the Poetry files. And then that is it, right? That's all you should really need to do to be able to add a new index.
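Here's a rough sketch of what that ideal could look like. To be clear, this is not the library's actual code: `BaseIndex`, `InMemoryIndex`, `query`, and `aquery` are hypothetical names chosen just for illustration. The point is only that a new index fills in one synchronous and one asynchronous method, and everything else is shared.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseIndex(ABC):
    """Hypothetical base class: a new index only fills in these two methods."""

    @abstractmethod
    def query(self, vector: list[float], top_k: int = 1) -> list[tuple[str, float]]:
        """Synchronous lookup: return (route_name, score) pairs."""

    @abstractmethod
    async def aquery(self, vector: list[float], top_k: int = 1) -> list[tuple[str, float]]:
        """Asynchronous lookup with the same contract as query()."""


class InMemoryIndex(BaseIndex):
    """Toy index; a real one would set up its remote connection in __init__."""

    def __init__(self) -> None:
        self.records: list[tuple[str, list[float]]] = []

    def add(self, route: str, vector: list[float]) -> None:
        self.records.append((route, vector))

    def query(self, vector, top_k=1):
        # Score every record by dot product and keep the best top_k.
        scored = [
            (route, sum(a * b for a, b in zip(vector, stored)))
            for route, stored in self.records
        ]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

    async def aquery(self, vector, top_k=1):
        # A remote index would await a network call here instead.
        return self.query(vector, top_k)


index = InMemoryIndex()
index.add("chitchat", [1.0, 0.0])
index.add("politics", [0.0, 1.0])
sync_result = index.query([0.9, 0.1])
async_result = asyncio.run(index.aquery([0.9, 0.1]))
```

Both calls return the same best match, so a test suite could run the same checks against every index just by pointing at the new class.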

At least in many cases. Of course, some cases are a bit more complicated, but that's the ideal. And a big part of V1 is marching towards that and just making it much simpler to extend the library. And we are getting super close to that now. So I'll share my screen and this article, which kind of just outlines where we are with everything, okay?

Like, what are the main focus points, the objectives of this V1 release? And where are we in achieving those objectives? The majority of these, again, like I said, we're getting pretty close to. So I can just mention these, actually. So, the modular routers, encoders, and indexes.

That is a big one, right? So that's just what I just said, right? You want to add a new index, super easy. You want to add a new encoder. New encoders are actually already relatively easy to add, but we want to make it easier. And routers. Routers are the big one, actually.

So the routers were previously called route layers. And the default route layer was the only router that was actually truly supported for everything that the library does. Everything was kind of built around that core. We now call it the semantic router. But there are other routers. Well, there's one other router, which is the hybrid router.

That's been in there for a long time, like really since very close to the beginning of the library. But it was left behind, because routers were just not abstracted enough. They were not modular. So actually the PR that I'm working on now is to do that. And I'm getting very close to that.

Actually, they basically already are within that PR. It's already abstracted, everything is there. I'm working through some testing issues, which are more to do with the tests than the actual routers themselves. So that is getting pretty close. And one other big part that I do mention here is that we do plan to have many other routers in the future.

Like, there's a lot of stuff we can do with routers. I would say it's probably one of the parts of the library with the highest potential. And having these modular routers is going to unlock the ability for us to begin adding many other routers very easily. But, and this is another very important part, we don't just add things for the sake of adding things, because you just get bloated, right?

We don't want any of that. A big part of this is keeping it as simple as possible. My goal with this is to remove more code than I add, at least within the actual code base. And even within the tests, I would like to try and reduce the amount of code there.

But that doesn't mean test less. Test more, but with less code. So yeah, that is the goal there.

Synchronization logic: this has been a big one, and this is something that came from us implementing Semantic Router in production, right? So in production, a lot of the time we're using, for example, the Pinecone index, which is a remote index, and there is a problem with remote indexes in Semantic Router.

They're a bit harder to use, because as soon as you have a remote index, you have this remote instance of your data, and some of the metadata for that data that you have remotely does need to be in memory locally. And that can cause issues in synchronizing between those instances, particularly if you are treating one of those instances as like a source of truth.

So for example, if you have your remote index, you put all your data in there, and then you have multiple deployments where you have multiple local instances, they all need to be sharing that same local metadata that aligns with that remote index. And with the Pinecone index in particular, that's not hard to do, but it can be slow. If you go through and look at every single record and make sure it's all synchronized, that's slow.

And you don't want things to be slow. You can deal with it, we've been dealing with it, and it will work, but it's not ideal. So actually a lot of the work I've been doing is on the synchronization component. That is almost all done. I would say the final things that I really need to do specifically on synchronization are, one, I actually need to update this PR here, which links everything in here as well, and then the final one is route-level synchronization.

So synchronization now works, and you can synchronize asynchronously. We have locks between the remote and local instances. So say synchronization is happening in your remote instance, and let's say we have those multiple local instances. That synchronization might be happening between local instance one and the remote.
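To make the synchronization problem a bit more concrete, here's a minimal sketch of comparing what a local instance holds against what's in the remote index. This is purely illustrative and not how the library does it: `diff_routes` is a hypothetical helper, and the routes are plain dictionaries mapping a route name to its utterances.

```python
def diff_routes(
    local: dict[str, list[str]], remote: dict[str, list[str]]
) -> dict[str, list[tuple[str, str]]]:
    """Return the (route, utterance) pairs that exist on only one side."""
    local_pairs = {(name, utt) for name, utts in local.items() for utt in utts}
    remote_pairs = {(name, utt) for name, utts in remote.items() for utt in utts}
    return {
        "local_only": sorted(local_pairs - remote_pairs),
        "remote_only": sorted(remote_pairs - local_pairs),
    }


local = {"chitchat": ["hi", "how are you"], "politics": ["who won"]}
remote = {"chitchat": ["hi"], "politics": ["who won", "latest polls"]}
diff = diff_routes(local, remote)
```

Note that this compares every single record, which is exactly the slow path when the remote side has to be fetched over the network.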

And then local instance two is like, okay, now I need to sync, I need to sync this new data I've just got. Right, it's not going to instantly do that, it's going to see, actually, your remote instance is locked, so it's not going to be allowed to just jump into that synchronization immediately.
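As a toy sketch of that locking idea (the names `RemoteIndex`, `acquire`, `release`, and `sync` are hypothetical, and everything lives in one process here, which is an assumption made only to keep the example runnable):

```python
from typing import Optional


class RemoteIndex:
    """Toy stand-in for a remote index that carries a sync lock flag."""

    def __init__(self) -> None:
        self.locked_by: Optional[str] = None
        self.routes: dict[str, list[str]] = {}

    def acquire(self, owner: str) -> bool:
        # Only one local instance may hold the lock at a time.
        if self.locked_by is not None:
            return False
        self.locked_by = owner
        return True

    def release(self, owner: str) -> None:
        # Only the instance holding the lock can release it.
        if self.locked_by == owner:
            self.locked_by = None


def sync(remote: RemoteIndex, owner: str, local_routes: dict[str, list[str]]) -> bool:
    """Push local routes to the remote; return False if the remote is locked."""
    if not remote.acquire(owner):
        return False
    try:
        remote.routes = {name: list(utts) for name, utts in local_routes.items()}
        return True
    finally:
        remote.release(owner)


remote = RemoteIndex()
remote.acquire("instance-1")  # instance one is in the middle of syncing
blocked = sync(remote, "instance-2", {"chitchat": ["hi"]})
remote.release("instance-1")  # instance one finishes
allowed = sync(remote, "instance-2", {"chitchat": ["hi"]})
```

Instance two is turned away while instance one holds the lock and gets through once it is released. With a genuinely remote index, the acquire step would need to be atomic on the remote side, which this sketch doesn't attempt.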

Which is actually important, especially as we begin to use Semantic Router in larger projects with multiple instances all over the place. Async support for synchronizing is basically done. I thought there was one other method that I needed to do, but I think that's actually already done. It's not ready for the hybrid router yet, though.

So that will actually be coming here, which is the current PR I'm working on, right? We also have this as well, which can probably be pretty useful if you want to go through it: this syncing routes notebook, which explains everything. I will say, all of this is in a dev version of the library, so if you do want to install this and start using it, you would use this.

Right now we're on dev five. Very soon, once I get that hybrid router PR out, it will be dev six, so just be aware of that. If you do a direct pip install, because these are dev versions, pip will not install this for you, right? pip will go ahead and install this 0.0.72, which is already kind of old.

Okay, so that is synchronization. Next, full async support. This is just super important for AI applications. With AI in particular, if you're using state-of-the-art language models, you spend a lot of your time waiting for API responses. And if you're writing synchronous code, your code is just waiting, right, most of that time. So your Python code gets to the point where it sends a request to your LLM, and then you're waiting like three seconds to get everything back and start going again.

With async code, when done properly, you send your request to your LLM, and then your Python code is basically free to do whatever it needs to do during those three seconds before it gets a response, right? So your Python code can go and start doing other stuff during that time. And of course, when you think about scaling AI applications, I think you need async. I mean, unless you're doing a ton of multiprocessing for everything all the time, which just isn't efficient, you do need async. I don't know how else you would do it. So, yeah, full async support is one thing that we've been very keen to get, and it is in there. It's in the semantic router, and it's in the Pinecone index, which is one of the important ones, one of the main remote indexes. So that is basically there. I'm just adding that support to the hybrid router as well now.
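A small sketch of that difference, using only the standard library. The `fake_llm_call` function is made up for illustration, and `asyncio.sleep` stands in for waiting on a real API response:

```python
import asyncio
import time


async def fake_llm_call(prompt: str, delay: float = 0.2) -> str:
    # asyncio.sleep stands in for waiting on a network response.
    await asyncio.sleep(delay)
    return f"response to {prompt!r}"


async def run_concurrently(prompts: list[str]) -> list[str]:
    # All requests wait at the same time instead of one after another.
    return await asyncio.gather(*(fake_llm_call(p) for p in prompts))


start = time.perf_counter()
responses = asyncio.run(run_concurrently(["a", "b", "c"]))
elapsed = time.perf_counter() - start
```

Three calls that each wait 0.2 seconds finish in roughly the time of one, because the waiting overlaps. Run one after another, the same calls would take about 0.6 seconds.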

Then we have upgrading the hybrid router. I kind of mentioned this, yeah, it's in progress. That's been a fairly big one. And aligning routers, right, so there were a lot of methods that were specific to either the hybrid router or the semantic router. They've mostly been cleaned up now.

I think basically the final thing that I have on this PR is just getting the fit and the eval methods working for the hybrid router. Everything else is basically there. Again, there are test things I'm working through, and they're an absolute nuisance, but we're getting there. And then one of the things that has suffered, I will say, with the focus on V1 is that there are a few PRs out there that have just kind of been out there for a while, and basically we just need to get to those as soon as V1 is ready.

That does mean, for example, especially with the Milvus index, there's just going to be a bit of work on my side to get all of that up to date with the new V1 setup. But that is the priority as soon as V1 is basically ready. I may even release V1 before I integrate this, so it may just be that this is going to be part of like a V1.01 or something. I'm not 100% sure, we'll see.

So just getting these into the library, there's a few others as well. There's like a Yandex index, I think, or Yandex GPT. I honestly have never used those services, I never even heard of them before I saw that PR. Yeah, if anyone is using those, let me know, just so I understand how popular those actually are.

Then testing and docs, right? This is actually such a boring one, but I think it's just so important, like incredibly important. So obviously docs, that's self-explanatory, right? People need to know how to use your library. And this is something that we've been relatively poor on, to be fair, other than maybe these videos. We do have a lot of notebook examples within the repo, which I think are pretty useful. If someone asks me how to do something, I usually send them a notebook and then they know how to do it, which I think is great.

But I think you need more structure in finding that information, and also a doc site, so that you can just go into Google and say, okay, how do I do this? So that is something that we've been doing as well. So we do actually have a doc site that is there, it just needs more work, that's all, right?

So yeah, you have a few sort of almost like article type things here, we have synchronizing the route layer with indexes, which kind of covers a lot of stuff I mentioned, and then you have like the API reference, which goes through everything. Again, more work is needed there, but it's definitely a step in the right direction.

Then yes, merge your routes, that's what I'm doing now in that PR. Update all Jupyter notebooks for V1, also in progress. And then a full mock of the Pinecone index. The Pinecone index is just such a nightmare in the tests. And it's also just hard to fully mock Pinecone.

So that's just, that's its own thing that needs doing. That is going to make a big difference as well. So I would say a lot of the testing stuff has been done to some degree already. A lot of the modularity of tests, a lot of the cleanup of tests has been done.

Still more to do. I would say also some stuff that I'm missing here is probably mocking other encoders and indexes. I think that needs to be done a little bit, but for the most part they're ready. And yeah, I mean, that is it for the current state of V1 of Semantic Router.

As I mentioned, and as you can probably see, there's a lot of stuff in there that's coming. You can already test it out, and there are some docs that have been updated already for V1. You can see those in here. So for example, this one, Pinecone Sync Routes, and all of this code here should work with the latest version of the library or the V1 version of the library.

So you can actually start going through that and testing it as well, but it's still in the dev state. There's still work being done. So be aware of that. There will be some things that are weird, but for the most part, we're very close. I think my goal is to have this done within the next few weeks, and then I can start talking about it and sharing some of the cool new stuff that we have in there.

So yeah, I mean, that is it for the update. I won't go on any more about it. If anyone has any input on any of this stuff, anything that you feel like we're missing or that we should cover in the V1 release or beyond, that's completely fine as well. Do let me know, but for now I'll leave it there.

So thank you very much for watching. I hope it's been useful, especially for those of you that are using Semantic Router. And yeah, thank you. I'll see you next time. Bye.
