What's next for Semantic Router (v1 update)


Chapters

0:00 Semantic Router
1:13 Keeping Semantic Router Lightweight
3:32 Current State of SR v1
3:58 Modular Routers Encoders and Indexes
6:27 Semantic Router Synchronization
10:30 Full Async Support
12:07 HybridRouter Upgrades
12:46 New Semantic Router Integrations
13:55 Testing and Doc Upgrades
16:04 Getting Started with v1

00:00:00.000 | Today I wanted to give you an update, mainly on the progress of Semantic Router, or the V1 major
00:00:08.480 | release of Semantic Router.
00:00:09.780 | Now I know I've been a little quieter on YouTube recently.
00:00:14.440 | A lot of that is in part due to Semantic Router, this new release.
00:00:19.020 | In part due to a lot of other projects that are coming, but I'm not quite ready to start
00:00:24.680 | talking about those just yet.
00:00:26.680 | But the V1 of Semantic Router plus these other projects has made me pretty busy and just
00:00:32.920 | unable to do all that much in terms of videos.
00:00:36.640 | But when I do get to release V1, that will be huge and I'll explain why in a minute.
00:00:44.240 | And also when I get to actually finally start talking about these other projects, they are
00:00:48.360 | going to be pretty exciting as well.
00:00:50.960 | And altogether they're probably going to change a lot.
00:00:55.160 | In fact, they've already changed a lot for me in terms of how I'm building AI stuff,
00:01:01.200 | like really completely changed everything.
00:01:04.680 | And so yeah, with that, I am pretty excited to actually start talking about that very
00:01:09.000 | soon.
00:01:10.000 | Anyhow, I want to talk about the V1 release of Semantic Router.
00:01:14.000 | Since the inception of Semantic Router, a lot of what has been built into the library
00:01:22.360 | has been driven by requirements, whether that's my requirements, my team's requirements, or
00:01:29.240 | the community's requirements for particular features that we've needed to add.
00:01:34.680 | That can be things like encoders, indexes, routers, but also a lot of other smaller things
00:01:41.920 | like the dynamic routes, for example, where we trigger a route and then generate essentially
00:01:49.280 | like a tool use with an LLM.
00:01:52.640 | So a lot of that has been very organic.
00:01:55.480 | And that has been quite useful in building Semantic Router out into a library that does
00:02:01.200 | things that you need it to do and not adding a load of fluff, which I think is important.
00:02:06.260 | Even though I do think we have a little more fluff than I would like, and part of the V1
00:02:10.480 | is actually stripping that away.
00:02:12.400 | But anyhow, there has been a lot of organic growth.
00:02:16.520 | And because of that, I find that there's a lot of things now that have needed to be refactored
00:02:23.040 | and cleaned up so that Semantic Router continues to be this, you know, very concise code base
00:02:29.480 | that is very sensible, very easy to use.
00:02:35.120 | And I want us to really get to a point where if you want to go and add an index, a new
00:02:40.300 | index to Semantic Router, you only have to modify a few methods, you know, maybe just
00:02:48.120 | two call methods, your synchronous call method, your async call method.
00:02:52.320 | You modify both of those, you probably set up the connection to whatever index that is.
00:02:57.960 | And then you just add a line within testing scripts, say, okay, there's a new index point
00:03:02.880 | to it.
00:03:03.880 | Okay.
00:03:04.880 | Of course, add your new dependency groups within the poetry files.
00:03:09.280 | And then that is it, right?
00:03:11.520 | That's all you should really need to do to be able to add a new index.
00:03:15.880 | In many cases, of course, some cases are a bit more complicated, but that's the ideal.
00:03:22.320 | And a big part of V1 is kind of marching towards that and just make it much simpler to extend
00:03:28.280 | the library.
00:03:29.280 | And we are getting super close to that now.
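The "two call methods" idea can be sketched roughly as follows. This is a minimal illustration, not the library's actual code: the class names `BaseIndex` and `InMemoryIndex` and the method names `query` and `aquery` are assumptions made for the example, and the toy index scores by plain dot product.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseIndex(ABC):
    """Hypothetical base class: a new index only fills in these two methods."""

    @abstractmethod
    def query(self, vector: list[float], top_k: int = 1) -> list[tuple[str, float]]:
        """Synchronous lookup: return (route_name, score) pairs."""

    @abstractmethod
    async def aquery(self, vector: list[float], top_k: int = 1) -> list[tuple[str, float]]:
        """Asynchronous lookup with the same contract as query()."""


class InMemoryIndex(BaseIndex):
    """Toy index: stores one vector per route and scores by dot product."""

    def __init__(self) -> None:
        self.records: dict[str, list[float]] = {}

    def add(self, route: str, vector: list[float]) -> None:
        self.records[route] = vector

    def query(self, vector, top_k=1):
        scored = [
            (route, sum(a * b for a, b in zip(vector, stored)))
            for route, stored in self.records.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]

    async def aquery(self, vector, top_k=1):
        # A real remote index would await a network call here.
        return self.query(vector, top_k)


index = InMemoryIndex()
index.add("politics", [1.0, 0.0])
index.add("chitchat", [0.0, 1.0])
best_sync = index.query([0.9, 0.1])
best_async = asyncio.run(index.aquery([0.1, 0.9]))
```

Everything outside the two abstract methods (connection setup, test registration, dependency groups) would live elsewhere, which is the point of the design: the surface a contributor has to touch stays small.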
00:03:32.120 | So I'll share this, I'll share my screen, this article, which kind of just outlines
00:03:40.840 | where we are with everything, okay?
00:03:42.800 | Like what are the main focus points, the objectives of this V1 release?
00:03:48.880 | And where are we in achieving those objectives?
00:03:52.520 | For the majority of these, again, like I said, we're getting pretty close.
00:03:56.760 | So I can just mention this actually.
00:03:58.720 | So the modular routers, encoders, and indexes.
00:04:01.760 | That is a big one, right?
00:04:02.760 | So that's just what I just said, right?
00:04:05.640 | You want to add a new index, super easy.
00:04:07.960 | You want to add a new encoder.
00:04:09.520 | New encoders are actually already relatively easy to add, but we want to make it easier.
00:04:16.040 | And routers.
00:04:17.680 | Routers are the big one, actually.
00:04:19.940 | So the routers were previously called route layers.
00:04:24.000 | And the default route layer was the only router that was actually truly supported for everything
00:04:33.640 | that the library does.
00:04:35.840 | Everything was kind of built around that core.
00:04:38.320 | We now call it a semantic router.
00:04:40.520 | But there are other routers, well, there's one other router, which is a hybrid router.
00:04:45.160 | That's been in there for a long time, like really since very close to the beginning of
00:04:49.160 | the library.
00:04:50.160 | It was left behind.
00:04:51.820 | Because routers were just not abstracted enough.
00:04:55.120 | They were not modular.
00:04:57.440 | So actually the PR that I'm working on now is to do that.
00:05:02.400 | And I'm getting very close to that.
00:05:04.740 | Actually they basically are already within that PR.
00:05:09.600 | It's already abstracted, everything is there.
00:05:12.640 | It's working through some testing issues, which are more to do with the test rather
00:05:16.880 | than the actual routers themselves.
00:05:19.060 | So that is getting pretty close.
00:05:23.240 | And one also big part that I do mention here is that we do plan to have many other routers
00:05:28.160 | in the future.
00:05:29.160 | Like, there's a lot of stuff we can do with routers.
00:05:33.000 | Actually probably, I would say it's probably one of the parts of the library with the highest
00:05:37.840 | potential.
00:05:40.040 | And having this, these modular routers, that is going to unlock the ability for us to begin
00:05:48.000 | adding many other routers very easily without, this is also another very important part,
00:05:56.560 | we don't just add things for the sake of adding things, because you just get bloated, right?
00:06:01.520 | We don't want any of that.
00:06:02.520 | A big part of this is keeping it as simple as possible.
00:06:06.040 | My goal with this is to remove more code than I add, at least within the actual code base,
00:06:11.360 | within the tests.
00:06:12.360 | I know even within the tests, I would like to try and reduce the amount of code there.
00:06:20.060 | But that doesn't mean test less.
00:06:22.360 | Test more, but with less code.
00:06:24.460 | So yeah, that is the goal there.
00:06:27.940 | Synchronization logic, this has been a big one, and this is something that came from
00:06:30.920 | us implementing semantic router in production, right?
00:06:34.960 | So in production, a lot of the time we're using, for example, the Pinecone index, which
00:06:39.360 | is a remote index, and there is a problem with remote indexes in semantic router.
00:06:47.080 | They're a bit harder to use, because as soon as you have a remote index, you have this
00:06:52.920 | remote instance of your data, and some of the metadata for that data that you have remotely
00:06:58.220 | does need to be in memory locally.
00:07:00.680 | And that can cause issues in synchronizing between those instances, particularly if you
00:07:09.000 | are treating one of those instances as like a source of truth.
00:07:12.360 | So for example, if you have your remote index, you put all your data in there, and then you
00:07:20.520 | have multiple deployments where you have multiple local instances, they all need to be sharing
00:07:28.760 | that same local metadata that aligns with that remote index.
00:07:34.980 | And particularly, like, for example, the Pinecone index, that can be kind of hard to, not hard
00:07:41.080 | to do, it's not hard, but it can be slow to, if you go through and look at every single
00:07:47.340 | record and make sure it's all synchronized, that's slow.
00:07:54.400 | And you can't, you don't want things to be slow.
00:07:57.400 | You can deal with it, we've been dealing with it, and it will work, but it's not ideal.
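To make the cost of checking every record concrete, here is a small sketch of a record-by-record comparison. The function name `diff_routes` and the `(route, utterance)` tuple layout are assumptions for illustration only; the library's own sync logic is not shown in this video.

```python
def diff_routes(local, remote):
    """Compare every (route, utterance) pair held locally and remotely."""
    local_set = set(local)
    remote_set = set(remote)
    return {
        "local_only": sorted(local_set - remote_set),
        "remote_only": sorted(remote_set - local_set),
    }


local = [
    ("politics", "who should I vote for"),
    ("chitchat", "lovely weather today"),
]
remote = [
    ("politics", "who should I vote for"),
    ("chitchat", "how are things"),
]
result = diff_routes(local, remote)
```

The comparison itself is cheap, but it requires every remote record to be fetched first, so the time taken grows with the size of the index.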
00:08:02.200 | So actually a lot of the work I've been doing is on the synchronization component.
00:08:06.840 | That is almost all done.
00:08:08.440 | I would say the final thing that I really need to do specifically on synchronization
00:08:14.880 | is one, I actually need to update this PR here, it links everything in here as well,
00:08:21.480 | but then the final one is route level synchronization.
00:08:25.400 | So synchronization now works, you can synchronize asynchronously.
00:08:29.600 | We have locks between the remote and local instances, so that if, okay, synchronization
00:08:36.640 | is happening in your remote index, remote instance, and let's say we have those multiple
00:08:43.280 | local instances, that synchronization might be happening between, like, instance, local
00:08:48.800 | instance one, and the remote.
00:08:51.800 | And then local instance two is like, okay, now I need to sync, I need to sync this new
00:08:55.560 | data I've just got.
00:08:57.880 | Right, it's not going to instantly do that, it's going to see, actually, your remote instance
00:09:03.360 | is locked, so it's not going to be allowed to just jump into that synchronization immediately.
00:09:11.680 | Which is actually important, especially as we begin to use semantic router in larger
00:09:18.600 | projects, right, that's important with multiple instances all over the place.
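The locking behaviour described here can be sketched with a toy model. All names below (`RemoteIndex`, `sync`, the `locked` flag) are hypothetical, and the "remote" is just an in-process object; a real remote lock would need an atomic check-and-set on the remote side.

```python
import asyncio


class RemoteIndex:
    """Stand-in for a remote index that carries a single lock flag."""

    def __init__(self) -> None:
        self.locked = False
        self.utterances: set[str] = set()


async def sync(name, remote, new_utterances, log, retries=50, wait=0.01):
    """Push new utterances, but only while holding the remote lock."""
    for _ in range(retries):
        if not remote.locked:
            remote.locked = True  # take the lock before touching remote data
            try:
                await asyncio.sleep(0.02)  # pretend to upload records
                remote.utterances |= set(new_utterances)
                log.append(f"{name} synced")
            finally:
                remote.locked = False  # always release, even on failure
            return True
        log.append(f"{name} waiting")
        await asyncio.sleep(wait)  # back off, then check the lock again
    return False


async def main():
    remote, log = RemoteIndex(), []
    results = await asyncio.gather(
        sync("local-1", remote, ["hello"], log),
        sync("local-2", remote, ["goodbye"], log),
    )
    return remote, log, results


remote, log, results = asyncio.run(main())
```

Here the second local instance sees the lock, waits, and only syncs once the first has released it, so the two syncs never interleave.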
00:09:23.400 | Async support for synchronizing, that's basically done, although there is, I think there was
00:09:28.800 | one, there's probably one other method that I do need to, might actually already be done,
00:09:34.760 | I think it is already done, sorry, but it's not ready for the hybrid router yet.
00:09:40.320 | So that will actually be coming here, which is the current PR I'm working on, right?
00:09:44.720 | We also have this as well, this can be probably pretty useful if you want to go through it,
00:09:50.280 | so we have this syncing routes notebook, which explains everything.
00:09:56.880 | I will say all of this as well, it's in a dev version of the library, so if you do want
00:10:02.000 | to install this and start using it, right, you would use this.
00:10:05.720 | Right now we're on dev five, very soon once I get that hybrid router PR out, it will be
00:10:11.000 | in dev six, so just be aware of that.
00:10:14.800 | If you do like a direct pip install, because these are dev versions, pip will not install
00:10:19.560 | this for you, right, pip will go ahead and install this 0.0.72, which is already kind
00:10:26.720 | of old.
00:10:27.720 | Okay, so that is synchronization, full async support, this is just super important for
00:10:34.040 | AI applications.
00:10:35.800 | With AI in particular, if you're using state-of-the-art language models, you spend a lot of your time
00:10:41.360 | waiting for API responses, and if you are writing your code, if you're writing synchronous
00:10:46.840 | code, your code is just waiting, right, most of that time, so your Python code, okay, gets
00:10:53.600 | to the point, it sends a request to your LLM, and then you're waiting like three seconds
00:10:57.560 | to get everything back and start going again.
00:11:00.300 | With async code, when done properly, you send your request to your LLM, and then your Python
00:11:07.980 | code is basically free to do whatever it needs to do during those three seconds before it
00:11:12.620 | gets a response again, right, so your Python code can go start doing other stuff during
00:11:18.180 | that time, and of course, when you think about scaling AI applications, I think you need
00:11:26.500 | async, like I don't, I mean, unless you're doing a ton of multi-processing for everything
00:11:32.700 | all the time, which just isn't efficient, you do need async, I don't know how else you
00:11:42.380 | would do it, so, yeah, full async support is one thing that we've been very keen to
00:11:49.060 | get, and it's there, like, it is in there, it is, it's in the semantic router, it's in
00:11:55.520 | the Pinecone index, one of the important ones, it's one of the main remote indexes, so that
00:12:02.240 | is basically there, I'm just adding that support to the hybrid router as well now.
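The difference async makes while waiting on API responses can be shown with the standard library alone. In this sketch `fake_llm_call` is a made-up stand-in for a real LLM request, and the 0.2 second sleep plays the role of the roughly three second wait mentioned above.

```python
import asyncio
import time


async def fake_llm_call(prompt: str, latency: float = 0.2) -> str:
    """Stand-in for an LLM API request; the sleep plays the network wait."""
    await asyncio.sleep(latency)
    return f"response to: {prompt}"


async def main(prompts):
    # All requests are in flight at once, so the waits overlap.
    return await asyncio.gather(*(fake_llm_call(p) for p in prompts))


prompts = ["route this", "and this", "and this too"]
start = time.perf_counter()
responses = asyncio.run(main(prompts))
elapsed = time.perf_counter() - start
```

Run one after another, the three calls would take about 0.6 seconds in total; run concurrently, the total stays close to the latency of a single call.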
00:12:07.300 | Then we have upgrading the hybrid router, I kind of mentioned this, yeah, in progress,
00:12:13.100 | that's been a fairly, fairly big one, and aligning routers, right, so there were a lot
00:12:18.740 | of methods that were specific to either hybrid router or semantic router, they've mostly
00:12:24.820 | been cleaned up a lot now. There is also one, I think basically the final thing that I have
00:12:31.460 | on this PR is just getting the fit and the eval methods working for the hybrid router,
00:12:36.780 | everything else is basically there. Again, there's test things I'm working through, they're
00:12:43.340 | absolute nuisance, but we're getting there. And then one of the things that has suffered,
00:12:49.360 | I will say, with the focus on V1 is that there are a few PRs out there that have just kind
00:12:57.140 | of been out there for a while, and basically we just need to get to those as soon as V1
00:13:03.660 | is ready. That does mean, for example, especially with the Milvus index, there's just going to
00:13:09.500 | be a bit of work on my side to get all of that kind of up to date with the new V1 setup,
00:13:18.700 | but that is the priority, as soon as V1 is basically ready, I may even release V1 before
00:13:25.940 | I integrate this, so it may just be that this is going to be part of like a V1.01 or something,
00:13:32.580 | I'm not 100% sure, we'll see. So just getting these into the library, there's a few others
00:13:38.720 | as well. There's like a Yandex index, I think, or Yandex GPT. I honestly have never used
00:13:47.140 | those services, I never even heard of them before I saw that PR. Yeah, if anyone is using
00:13:51.740 | those, let me know, just so I understand how popular those actually are. Then testing and
00:13:57.300 | docs, right? This is actually such a boring one, but I think just so important, like incredibly
00:14:06.220 | important, so obviously docs, that's self explanatory, right? People need to know how
00:14:11.980 | to use your library, right? And this is something that we've been relatively poor on, to be
00:14:16.380 | fair, other than maybe these videos, and we do have a lot of like notebook examples within
00:14:21.700 | the repo, which I think are pretty useful, like if someone wants to know how to, if someone
00:14:25.460 | asks me how to do something, I usually send them a notebook and then they know how to
00:14:29.660 | do it, which I think is great. But I think you need more structure in finding that
00:14:36.060 | information and also a doc site so that you can just go into Google and say, okay, how
00:14:39.580 | do I do this? So that is something that we've been doing as well. So we do actually have
00:14:45.780 | a doc site that is there, it just needs more work, that's all, right? So yeah, you have
00:14:53.780 | a few sort of almost like article type things here, we have synchronizing the route layer
00:14:59.540 | with indexes, which kind of covers a lot of stuff I mentioned, and then you have like
00:15:04.580 | the API reference, which goes through everything. Again, more work is needed there, but it's
00:15:10.380 | definitely a step in the right direction. Then yes, merge your routes, that's what I'm
00:15:16.300 | doing now in that PR. Update all Jupyter notebooks for V1, also in progress. And then full mock
00:15:24.460 | of Pinecone index. Pinecone index is just such a nightmare in the test. And it's also
00:15:30.340 | just hard to, it's hard to fully mock Pinecone. So that's just, that's its own thing that
00:15:37.460 | needs doing. That is going to make a big difference as well. So I would say a lot of the testing
00:15:43.500 | stuff has been done to some degree already. A lot of the modularity of tests, a lot of
00:15:48.420 | the cleanup of tests has been done. Still more to do. I would say also some stuff that
00:15:53.380 | I'm missing here is probably mocking other encoders and indexes. I think that needs to
00:15:57.900 | be done a little bit, but for the most part they're ready. And yeah, I mean that is it
00:16:03.500 | for the current state of V1 of Semantic Router. As I mentioned, and as you can probably see,
00:16:10.220 | there's a lot of stuff in there that's coming. You can already test it out, and there are
00:16:15.020 | some docs that have been updated already for V1. You can see those in here. So for example,
00:16:24.060 | this one, Pinecone Sync Routes, and all of this code here should work with the latest
00:16:28.980 | version of the library or the V1 version of the library. So you can actually start going
00:16:32.420 | through that and testing it as well, but it's still in the dev state. There's still work
00:16:38.780 | being done. So be aware of that. There will be some things that are weird, but for the
00:16:44.020 | most part, we're very close. I think my goal is to have this done within the next few weeks,
00:16:51.820 | and then I can start talking about it and sharing some of the cool new stuff that we
00:16:55.020 | have in there.
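On the point about fully mocking the Pinecone index in tests, the general idea can be sketched with `unittest.mock` from the standard library. The function `top_route` and the response shape are assumptions made for the example, not the real client's schema or the library's test code.

```python
from unittest.mock import MagicMock


def top_route(index_client, vector):
    """Code under test: ask the index for the best match and return its route."""
    result = index_client.query(vector=vector, top_k=1)
    return result["matches"][0]["metadata"]["route"]


# The mock replaces the remote client, so the test needs no network or API key.
fake_client = MagicMock()
fake_client.query.return_value = {
    "matches": [{"score": 0.92, "metadata": {"route": "politics"}}]
}

route = top_route(fake_client, [0.1, 0.2, 0.3])

# Verify the code under test called the client exactly as expected.
fake_client.query.assert_called_once_with(vector=[0.1, 0.2, 0.3], top_k=1)
```

The hard part in practice is making the fake behave like the real service across every method the index uses, which is why a full mock is its own piece of work.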
00:16:56.020 | So yeah, I mean that is it for the update. I won't go on any more about it. If anyone
00:17:04.100 | has any input on any of this stuff, anything that you feel like we're missing, we should
00:17:08.940 | cover in the V1 release or beyond, that's completely fine as well. Do let me know, but
00:17:17.860 | for now I'll leave it there. So thank you very much for watching. I hope it's been useful,
00:17:22.100 | especially for those of you that are using Semantic Router. And yeah, thank you. I'll
00:17:27.740 | see you next time. Bye.
00:17:29.300 | [MUSIC]