NEW AI Framework - Steerable Chatbots with Semantic Router
Chapters
0:00 New Python Library for Better AI
1:57 Using Semantic Router
2:26 Semantic Router in Python
3:08 Defining Guardrails and Routes for LLMs
4:34 Initializing a RouteLayer
7:39 Using the Router with LangChain Agents
11:47 What else can Semantic Router do
12:40 Final Notes on the Library
00:00:00.000 |
Today, I finally get to talk to you about something that I and others have been working 00:00:06.660 |
That something is one of the secrets to how I build good AI assistants, agents, and simply 00:00:17.520 |
more controllable, deterministic dialogue with AI. 00:00:21.640 |
That something is what we call a semantic router. 00:00:25.320 |
Now a semantic router is something that you can think of as an almost fuzzy but deterministic 00:00:32.000 |
layer on top of your chatbots or really anything that is processing natural language. 00:00:39.240 |
The main purpose of the semantic router library is to act as a super fast decision making 00:00:47.600 |
So rather than saying to an LLM, you know, which tool should I use as we do with agents, 00:00:54.260 |
With semantic router, it's almost instant. 00:00:59.860 |
And the way that we set it up is more deterministic because we can provide a list of queries that 00:01:05.680 |
should trigger a particular response or a particular tool usage or anything we can imagine. 00:01:11.800 |
And at the same time, that list of responses is represented within semantic vector space. 00:01:18.780 |
So it's deterministic in that it will trigger if we hit one of those responses. 00:01:23.720 |
We will also reach the responses around those queries or we call them utterances that we 00:01:30.880 |
And I've been using this specific library for chatbots and agents for the past 00:01:37.120 |
And honestly, the thought for me of agents and chatbots being deployed without this layer 00:01:45.140 |
to have more deterministic control over the chatbots, I think is a little bit crazy. 00:01:51.180 |
And I just would not ever put a chatbot out there without having a semantic routing layer. 00:01:57.820 |
So with all that in mind, let's have a look at how we actually use this library. 00:02:04.520 |
So to get started with this semantic router library, we can first check out the repo. 00:02:14.560 |
And this gives you everything that you need to get started. 00:02:20.040 |
But if you really just want to jump straight into it, you can go to our introduction notebook 00:02:29.040 |
So to get started, we just pip install the library. 00:02:33.280 |
So right now we're on 0.0.14, which is basically one of the earliest versions. 00:02:39.800 |
There's a lot of cool things that we'll be adding soon. 00:02:48.400 |
Now, one thing that we have, particularly when using it with Google Colab at the moment, 00:02:54.600 |
is that we'll have this annoying little thing that happens where we will need to restart 00:03:08.920 |
And what I'm first going to do is define some routes that we're going to use and we're going 00:03:15.080 |
So the first one of those is going to be a protective route. 00:03:17.760 |
So this is where you would probably want to add some guardrails to your chatbots or agents. 00:03:27.380 |
So maybe one of those would be you don't want it to begin talking about politics. 00:03:32.380 |
So if a user asks a question that we would define as politics, we want it to trigger 00:03:39.000 |
And we can protect against that and we can return a specific predefined response or just 00:03:45.880 |
remind the LLM to tell the user that you cannot talk about politics. 00:03:51.880 |
And then we'll just define another one so we can see kind of how they interact. 00:03:55.760 |
This one's going to be a general sort of chit chat, small talk route. 00:04:05.840 |
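A route, as described here, boils down to a name plus a list of example utterances that should trigger it. The following is a minimal sketch of that shape in plain Python. The `Route` dataclass and the utterance strings are illustrative assumptions, not the library's own class or the notebook's exact examples.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    """Hypothetical stand-in for a route: a name and its example utterances."""
    name: str
    utterances: list[str] = field(default_factory=list)


# Protective route: queries close to these should trigger the guardrail.
politics = Route(
    name="politics",
    utterances=[
        "what do you think about the president",
        "which political party should I vote for",
        "tell me your political opinions",
    ],
)

# Small-talk route, defined so we can see how two routes interact.
chitchat = Route(
    name="chitchat",
    utterances=[
        "how is the weather today",
        "how are things going",
        "lovely weather we are having",
    ],
)

routes = [politics, chitchat]
```

Adding or removing utterances is how you would widen or narrow the region of queries that a route catches.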
So what we want to do is initialize an embedding model. 00:04:12.720 |
As I know many of you will be using OpenAI, we'll stick with OpenAI here. 00:04:17.280 |
But I would actually recommend trying out Cohere's embedding models. 00:04:21.440 |
They do work a little better in most use cases, at least that's what I've found. 00:04:27.440 |
So for Cohere, you would go to dashboard.cohere.com. 00:04:30.560 |
For OpenAI, we naturally go to platform.openai.com. 00:04:36.600 |
And I'm just going to run this cell and it will pop up with a little input box to tell 00:04:45.840 |
Now we're ready to initialize what's called a route layer. 00:04:50.680 |
So a route layer is essentially a layer containing different routes. 00:04:56.000 |
And it handles the decision making process as to whether we should go with one route 00:05:03.000 |
There are currently two route layers available in the library. 00:05:09.020 |
This is based on the idea of a pure semantic search. 00:05:14.080 |
We also have the hybrid route layer that we're still working on, still improving. 00:05:18.680 |
But that will allow us to use both semantic space and also a more term-based traditional 00:05:28.420 |
So that might be particularly useful for specific terminology like in the medical domain, in 00:05:35.200 |
the finance domain, and other places as well. 00:05:38.760 |
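To make the hybrid idea concrete, here is a rough sketch of blending a semantic similarity with a term-based score. How the library's hybrid layer actually scores is not stated in this video, so the Jaccard term overlap, the linear blend, the `alpha` weight, and the hard-coded semantic value of 0.6 are all assumptions chosen for illustration.

```python
import re


def term_overlap(query: str, utterance: str) -> float:
    """Jaccard overlap of lowercase word sets: a crude term-based score."""
    q = set(re.findall(r"[a-z0-9]+", query.lower()))
    u = set(re.findall(r"[a-z0-9]+", utterance.lower()))
    return len(q & u) / len(q | u) if q | u else 0.0


def hybrid_score(semantic: float, term: float, alpha: float = 0.5) -> float:
    """Blend the two scores; alpha is the weight given to the semantic side."""
    return alpha * semantic + (1 - alpha) * term


# Exact domain terms ("metformin") boost the term score even when the
# embedding similarity (a made-up 0.6 here) is only moderate.
term = term_overlap("metformin dosage", "what is the metformin dosage")
score = hybrid_score(semantic=0.6, term=term, alpha=0.5)
print(term, score)
```

Lowering `alpha` leans on exact terminology, which is the situation described for medical or finance vocabulary.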
For now, let's stick with the standard route layer and we can test it. 00:05:43.600 |
So I'm going to run these three and let's see what we get. 00:05:52.000 |
Our route choice, so this is the route that has been chosen, is the politics route. 00:05:56.920 |
This function call, that's related to our dynamic routes. 00:06:03.920 |
Now this, what we have here with the function call equal to none is what we call a static 00:06:15.340 |
And then, "I'm interested in learning about Llama 2." 00:06:17.440 |
It's not really related to either of the routes that we've defined. 00:06:26.320 |
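The behaviour of the three test queries can be imitated with a toy route layer. This is a sketch of the general technique (embed the utterances, compare the query by cosine similarity, apply a threshold), not the library's implementation. The word-count `embed` function stands in for a real embedding model, and `ToyRouteLayer`, the utterances, and the 0.5 threshold are assumptions.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real encoder returns dense vectors)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyRouteLayer:
    """Returns the best-matching route name, or None if nothing is close enough."""

    def __init__(self, routes: dict[str, list[str]], threshold: float = 0.5):
        self.threshold = threshold
        # Embed every utterance once, up front.
        self.index = [(name, embed(u)) for name, us in routes.items() for u in us]

    def __call__(self, query: str):
        q = embed(query)
        best_name, best_score = None, 0.0
        for name, vec in self.index:
            score = cosine(q, vec)
            if score > best_score:
                best_name, best_score = name, score
        return best_name if best_score >= self.threshold else None


layer = ToyRouteLayer({
    "politics": ["don't you love politics", "what do you think about the president"],
    "chitchat": ["how is the weather today", "lovely weather we are having"],
})

print(layer("don't you love politics?"))                  # politics
print(layer("how's the weather today?"))                  # chitchat
print(layer("I'm interested in learning about Llama 2"))  # None
```

With word counts the "fuzziness" only covers shared words. A real embedding model is what lets nearby phrasings with different wording land on the same route.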
Maybe I want to ask about the agent's opinions on a particular political party. 00:06:31.720 |
That's something that we don't want people doing in most cases. 00:06:36.440 |
So I can say, okay, what do you think about the, in England we have the Labour Party, 00:06:58.440 |
And then what we do obviously is actually, let me get the route choice from that. 00:07:04.240 |
So our route, what we would do is we just do some, if else logic. 00:07:10.160 |
So if route name equals politics, we would return, hi, sorry, I can't talk about politics. 00:07:25.000 |
Please go away or something along those lines, right? 00:07:31.040 |
And obviously we just have like this, if else logic that does different things. 00:07:36.220 |
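That if/else logic might look something like the sketch below. The `respond` function, the `llm` callable, and the small-talk instruction are hypothetical. Only the politics reply is taken from what is said above.

```python
from typing import Callable, Optional


def respond(query: str, route_name: Optional[str], llm: Callable[[str], str]) -> str:
    """Branch on the chosen route before (or instead of) calling the LLM."""
    if route_name == "politics":
        # Guardrail: predefined response, the LLM is never called.
        return "Hi, sorry, I can't talk about politics."
    elif route_name == "chitchat":
        # Assumed behaviour: nudge the LLM towards light small talk.
        return llm(f"Keep it light and brief. {query}")
    else:
        # No route matched: pass the query through unchanged.
        return llm(query)


def echo_llm(prompt: str) -> str:
    """Stub standing in for a real LLM call."""
    return f"LLM: {prompt}"


print(respond("what do you think about the Labour Party?", "politics", echo_llm))
print(respond("I'm interested in learning about Llama 2", None, echo_llm))
```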
But then, okay, this is a very basic, this is the introduction. 00:07:40.040 |
Let me just show you very quickly how we might integrate this with a LangChain agent. 00:07:44.960 |
So returning to our docs, we have this number three basic LangChain agent. 00:07:49.200 |
Of course, we also have these as well, if you want to check them out, but let's go to 00:07:57.140 |
So we have this notebook and I'll go through it in a little more detail later, but we have 00:08:05.520 |
You're a helpful personal trainer, so on and so on. 00:08:08.520 |
He also acts like a noble British gentleman. 00:08:15.400 |
Remember to read the system notes provided with your user queries. 00:08:18.380 |
This is where I'm going to be inputting the logic from our semantic router. 00:08:24.640 |
This is just one of the ways that I quite like, to be honest, it's almost like you add 00:08:29.160 |
a suggestive layer to your agents based on the semantic router. 00:08:34.380 |
So I've defined an agent here and I'm going to input my query. 00:08:38.640 |
I want to know, should I buy Optimum Nutrition whey or MyProtein? 00:08:45.600 |
So I'm talking about whey protein and you can see up. 00:08:48.800 |
Well, it depends, you prefer your whey with a side of Optimum Nutrition or MP. 00:08:53.880 |
So I don't think it even knows what, Oh, okay. 00:09:00.520 |
But I'm, you know, I want this assistant to act like a personal trainer that has their 00:09:06.560 |
So what I've done is I've created a semantic router augmented query. 00:09:12.760 |
What this has done, it's taken this query, processed it through the semantic router. 00:09:16.240 |
And then we've added this, uh, extra logic layer based on what the semantic router says, 00:09:21.400 |
which adds different prompts to the user query based on that via the system note. 00:09:27.880 |
So in this one, I've added one that talks about, okay, different types of proteins and 00:09:34.800 |
And what it does is it says, okay, remember you are not affiliated with any supplement 00:09:40.120 |
We have our own brand BigAI that sells the best products like P100 whey protein. 00:09:49.920 |
So then the output becomes, why not try the BigAI P100 whey protein. 00:10:05.600 |
And then I have, I should show you the routes actually. 00:10:08.960 |
So we have this get time route, which triggers a function supplement brand, which is the 00:10:16.760 |
And one of those, obviously it's the time route where it's getting the current time 00:10:22.280 |
So without the semantic router, we're just putting this query in. 00:10:28.360 |
Then we go through our semantic router layer and it augments our query with this. 00:10:33.960 |
So then if we go with just the plain query, put that in, we got, it's generally recommended 00:10:39.760 |
to allow at least 48 hours of rest and so on and so on. 00:10:42.840 |
It's not specific to the current time. With this semantic router powered augmentation, 00:10:48.800 |
Why not train again at exactly eight zero two tomorrow? 00:10:52.240 |
That's the time that I asked this question, uh, but the day before that way you give your 00:10:56.880 |
body a good rest, unless you're into those 24 hour gym life goals, which is a bit cringey. 00:11:02.720 |
But anyway, so you see that through the semantic router, we're able to suggest to 00:11:08.200 |
our agent to take or, or to get this additional information like we have done here, or to 00:11:13.720 |
suggest to the agent to act in a particular way. 00:11:17.080 |
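Both kinds of suggestion, a fixed reminder and freshly fetched information, can be sketched as a mapping from route name to a function that builds the system note. The names `HANDLERS` and `augment`, the `(SYSTEM NOTE: ...)` wrapper format, and the wording of the notes are assumptions for illustration. The notebook's own prompts may differ.

```python
from datetime import datetime
from typing import Callable, Dict, Optional


def supplement_brand() -> str:
    """Static reminder, paraphrasing the system note read out in the video."""
    return ("Remember you are not affiliated with any supplement brand; "
            "recommend our own brand's whey protein instead.")


def get_time(clock: Callable[[], datetime] = datetime.now) -> str:
    """Dynamic note: looks up the current time when the route fires."""
    return f"The current time is {clock():%H:%M}; use it in your response."


# Hypothetical mapping from route name to the function that builds its note.
HANDLERS: Dict[str, Callable[[], str]] = {
    "supplement_brand": supplement_brand,
    "get_time": get_time,
}


def augment(query: str, route_name: Optional[str]) -> str:
    """Append a system note when a route fired; otherwise pass the query through."""
    handler = HANDLERS.get(route_name) if route_name else None
    if handler is None:
        return query
    return f"{query} (SYSTEM NOTE: {handler()})"


print(augment("should I buy ON whey or MP?", "supplement_brand"))
print(augment("when should I train again?", "get_time"))
print(augment("what is a good warm-up?", None))
```

The agent's system prompt then only needs the one instruction mentioned earlier: read the system notes provided with the user queries.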
And then we have these other ones, you know, "Do you do training sessions?" Without 00:11:21.760 |
the augmentation, there's nothing relevant here. 00:11:26.480 |
It's generally recommended, uh, actually, where is it? 00:11:29.120 |
I'm here to provide guidance and support, not personal training sessions. With the augmentation: 00:11:33.160 |
Why, of course we offer these premium training sessions at just $700 per hour, which is what 00:11:41.600 |
Now that's an example of what semantic router can do. 00:11:45.720 |
There are many different things that you can do with this. 00:11:47.800 |
What I've just shown you there is using these routes to remind the agent of particular information 00:11:52.200 |
or to, you know, call a function, but we can also use it to protect against certain queries. 00:11:59.320 |
We can use it to basically do function calling without the super slow agent processing time 00:12:07.260 |
And we can also use this, and this is one of the things that I use it for a lot as another 00:12:13.640 |
You know, in the past I've spoken about this naive RAG, which is where you're performing 00:12:17.360 |
a search every query, you have the agent-based RAG, which is slower, but it can usually do 00:12:25.600 |
We also have this, which is kind of like the semantic router RAG or semantic RAG, but it 00:12:31.960 |
It can be very powerful like your agent, but it can also be very fast like your naive RAG. 00:12:36.640 |
So it really gets the best of both and it's generally my preferred way of doing it. 00:12:45.240 |
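One way to picture this semantic RAG pattern: the router, rather than an agent's reasoning step, decides whether a search is worth doing. The sketch below assumes a hypothetical `knowledge_base` route and uses stub `search` and `llm` functions, so it shows only the control flow and is not the setup used in the video.

```python
from typing import Callable, List, Optional


def answer(
    query: str,
    route_name: Optional[str],
    search: Callable[[str], List[str]],
    llm: Callable[[str], str],
) -> str:
    """Retrieve only when the router flags the query as needing the knowledge base."""
    if route_name == "knowledge_base":
        context = "\n".join(search(query))
        return llm(f"Context:\n{context}\n\nQuestion: {query}")
    return llm(query)  # no search performed for this query


# Stubs so the sketch runs on its own.
calls: List[str] = []


def fake_search(q: str) -> List[str]:
    calls.append(q)  # record which queries actually hit the search
    return ["Rest 48 hours between sessions."]


def fake_llm(prompt: str) -> str:
    return f"LLM saw: {prompt}"


answer("hello there", None, fake_search, fake_llm)
answer("how long should I rest?", "knowledge_base", fake_search, fake_llm)
print(calls)  # only the routed query triggered a search
```

Naive RAG would have searched for both queries. An agent would have spent an LLM call deciding. Here the decision costs a single routing lookup.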
As I said, I and my team have been implementing this across many projects. 00:12:52.120 |
So we, you know, we've been implementing it, seeing what works, seeing what doesn't work 00:12:58.760 |
And I think what we have here is the first version, okay, 100%, this is still a very 00:13:08.640 |
It's truly getting us that final 20% of the AI behaviors that we need in order to make 00:13:15.840 |
our agents something that we can actually go ahead and use in production, which is very 00:13:22.040 |
And we want other people to be able to use this as well, which is why you're seeing this 00:13:28.480 |
Personally, I'm very excited about releasing this. 00:13:31.480 |
So I hope that this is exciting for at least a few of you. 00:13:36.120 |
I hope some of you get to use it and, you know, please let me know what you think. 00:13:41.160 |
If you are interested in contributing, it's all open source so you can, and I'll be doing 00:13:46.840 |
a few more videos for sure on how we use this, how to make the most of the semantic router 00:13:53.600 |
and especially the other features that I haven't spoken about yet, such as dynamic routing, 00:13:57.960 |
the hybrid layer, those are all very exciting things and we have many more exciting things 00:14:05.200 |
So I hope all of this has been exciting and interesting, but for now I will leave it there. 00:14:11.340 |
So thank you very much for watching and I will see you again in the next one.