
NEW AI Framework - Steerable Chatbots with Semantic Router


Chapters

0:00 New Python Library for Better AI
1:57 Using Semantic Router
2:26 Semantic Router in Python
3:08 Defining Guardrails and Routes for LLMs
4:34 Initializing a RouteLayer
7:39 Using the Router with LangChain Agents
11:47 What else can Semantic Router do
12:40 Final Notes on the Library

Transcript

Today, I finally get to talk to you about something that I and others have been working on for a very long time. That something is one of the secrets to how I build good AI assistants, agents, and simply more controllable, deterministic dialogue with AI. That something is what we call a semantic router.

Now a semantic router is something that you can think of as an almost fuzzy but deterministic layer on top of your chatbots or really anything that is processing natural language. The main purpose of the semantic router library is to act as a super fast decision making layer for LLMs.

So rather than asking an LLM which tool it should use, as we do with agents, which takes a long time, semantic router makes that decision almost instantly. It's incredibly fast. And the way that we set it up is more deterministic because we can provide a list of queries that should trigger a particular response or a particular tool usage or anything we can imagine.

And at the same time, that list of queries is represented within semantic vector space. So it's deterministic in that it will trigger if we hit one of those queries, but it will also trigger for anything close to those queries, or utterances as we call them, in that space. And I've been using this specific library for chatbots and agents for the past two months.
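As a rough illustration of the idea (not the library's implementation), here is a minimal toy router. It assumes a bag-of-words count vector as a stand-in for a real embedding model, and the `embed`, `cosine`, and `route` names, the example utterances, and the 0.5 threshold are all made up for this sketch. A real embedding model would also match paraphrases, whereas this toy only matches on shared words.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Illustrative routes: each route is a name plus example utterances.
ROUTES = {
    "politics": ["don't you love politics", "what do you think about the president"],
    "chitchat": ["how's the weather today", "how are things going"],
}


def route(query, threshold=0.5):
    """Return the best-matching route name, or None if nothing is close enough."""
    q = embed(query)
    best_name, best_score = None, 0.0
    for name, utterances in ROUTES.items():
        for utterance in utterances:
            score = cosine(q, embed(utterance))
            if score > best_score:
                best_name, best_score = name, score
    return best_name if best_score >= threshold else None
```

The threshold is what gives the "no route" outcome: a query that is not close to any utterance falls through to `None` instead of being forced into the nearest route.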

And honestly, the thought for me of agents and chatbots being deployed without this layer to have more deterministic control over the chatbots, I think is a little bit crazy. And I just would not ever put a chatbot out there without having a semantic routing layer. So with all that in mind, let's have a look at how we actually use this library.

So to get started with this semantic router library, we can first check out the repo. So it's aurelio-labs/semantic-router. And this gives you everything that you need to get started. We describe everything there. But if you really just want to jump straight into it, you can go to our introduction notebook here.

I'm going to open it up in Colab. And we will find ourselves here. So to get started, we just pip install the library. So right now we're on 0.0.14, which is basically one of the earliest versions. There's a lot of cool things that we'll be adding soon. And it's also open source.

So if people want to contribute, they can. Now, one thing that we have, particularly when using it with Google Colab at the moment, is that we'll have this annoying little thing that happens where we will need to restart after installing the prerequisites. Otherwise, we'll get this attribute error. So we just need to go restart session.

And then we run again. And what I'm first going to do is define some routes that we're going to use and we're going to test against. So the first one of those is going to be a protective route. So this is where you would probably want to add some guardrails to your chatbots or agents.

So maybe one of those would be you don't want it to begin talking about politics. So if a user asks a question that we would define as politics, we want it to trigger this route. And we can protect against that and we can return a specific predefined response or just remind the LLM to tell the user that you cannot talk about politics.

So we define the politics route. And then we'll just define another one so we can see kind of how they interact. This one's going to be a general sort of chit chat, small talk route. How's the weather today? How are things going? So on and so on. Right? So what we want to do is initialize an embedding model.

And you can either use Cohere or OpenAI. As I know many of you will be using OpenAI, we'll stick with OpenAI here. But I would actually recommend trying out Cohere's embedding models. They do work a little better in most use cases, at least that's what I've found. So for Cohere, you would go to dashboard.cohere.com.

For OpenAI, we naturally go to platform.openai.com. We'll get an API key. And I'm just going to run this cell and it will pop up with a little input box to tell me to input my API key. So I'm going to do that. Now we're ready to initialize what's called a route layer.

So a route layer is essentially a layer containing different routes. And it handles the decision making process as to whether we should go with one route or another route or no route. There are currently two route layers available in the library. The main one is the route layer. This is based on the idea of a pure semantic search.

We also have the hybrid route layer that we're still working on, still improving. But that will allow us to use both semantic space and also a more term-based traditional vector space as well. So that might be particularly useful for specific terminology like in the medical domain, in the finance domain, and other places as well.

For now, let's stick with the standard route layer and we can test it. So I'm going to run these three and let's see what we get. So, "don't you love politics?" Our route choice, so this is the route that has been chosen, is the politics route. This function call here, that's related to our dynamic routes.

We'll talk about that more in the future. Now this, what we have here with the function call equal to None, is what we call a static route. "How's the weather today?" Okay. So that obviously triggers our chitchat route. And then, "I'm interested in learning about Llama 2."

It's not really related to either of the routes that we've defined. So it returns none. Now let's go with something else. Maybe I want to ask about the agent's opinions on a particular political party. That's something that we don't want people doing in most cases. So I can say, okay, what do you think about the, in England we have the Labour Party, for example.

So what do you think about the Labour Party? See what it says. Okay, cool. We have politics. So we trigger that route. And then what we do, obviously, is, let me get the route choice from that. So with our route, what we would do is we just do some if-else logic.

So if the route name equals politics, we would return, "Hi, sorry, I can't talk about politics. Please go away," or something along those lines, right? And then we're triggering that. And obviously we just have this if-else logic that does different things. But then, okay, this is very basic, this is the introduction.
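The if-else logic described here could look something like the sketch below. The `respond` function and the bracketed placeholder handlers are hypothetical; in a real application the last two branches would call your chatbot or LLM.

```python
def respond(route_name, query):
    """Pick a reply strategy from the chosen route name (None means no route matched)."""
    if route_name == "politics":
        # Guardrail: return a fixed, predefined reply without calling the LLM.
        return "Sorry, I can't talk about politics."
    elif route_name == "chitchat":
        # Placeholder for a lightweight small-talk handler.
        return f"[small-talk handler] {query}"
    else:
        # No route matched: fall through to the normal LLM call (placeholder).
        return f"[default LLM call] {query}"
```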

Let me just show you very quickly how we might integrate this with a LangChain agent. So returning to our docs, we have this number three, basic LangChain agent. Of course, we also have these as well, if you want to check them out, but let's go to the LangChain agent first.

So I'm going to open again in Colab. So we have this notebook and I'll go through it in a little more detail later, but we have this system message. Okay. You're a helpful personal trainer, so on and so on. It also acts like a noble British gentleman. And I have this little bit here.

Remember to read the system notes provided with your user queries. This is where I'm going to be inputting the logic from our semantic router. There are many different ways of doing this. This is just one of the ways that I quite like, to be honest, it's almost like you add a suggestive layer to your agents based on the semantic router.

So I've defined an agent here and I'm going to input my query. I want to know, should I buy Optimum Nutrition whey or MyProtein? Right? So I'm talking about whey protein, and you can see the output: well, it depends, do you prefer your whey with a side of Optimum Nutrition or MP?

So I don't think it even knows what, oh, okay. It knows that they're brands. Cool. So that's good. But, you know, I want this assistant to act like a personal trainer that has their own brand and all these other things. So what I've done is I've created a semantic-router-augmented query.

What this has done is taken this query and processed it through the semantic router. And then we've added this extra logic layer based on what the semantic router says, which adds different prompts to the user query via the system note. So in this one, I've added one that talks about, okay, different types of proteins and products essentially.

And what it does is it says, okay, remember you are not affiliated with any supplement brands. We have our own brand, BigAI, that sells the best products, like P100 whey protein. I don't know if anyone will get that. And it's a stupid joke, but I liked it.
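A minimal sketch of this kind of augmentation, assuming the route name has already been chosen by the router. The `SYSTEM_NOTES` table, the `augment` function, and the exact "(SYSTEM NOTE: ...)" format are assumptions for illustration; the note text paraphrases the one read out above.

```python
# Route name -> fixed note to attach when that route triggers (illustrative).
SYSTEM_NOTES = {
    "supplement_brand": (
        "Remember you are not affiliated with any supplement brands. "
        "We have our own brand, BigAI, that sells the best products, "
        "like P100 whey protein."
    ),
}


def augment(query, route_name):
    """Append a system note to the user query if the chosen route has one."""
    note = SYSTEM_NOTES.get(route_name)
    if note is None:
        return query  # no matching route: pass the query through unchanged
    return f"{query} (SYSTEM NOTE: {note})"
```

The agent's system message then only needs to tell the model to read these notes; the router decides which note, if any, each query receives.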

So then the output becomes, why not try the BigAI P100 whey protein? It's the best, just like me, which is funny. So we have that. And then, I should show you the routes actually. So we have this get time route, which triggers a function; supplement brand, which is the one we just saw; business inquiry; and product.

And one of those, obviously, is the time route, where it's getting the current time for you and putting it into your query, right? So without the semantic router, we're just putting this query in. Okay. Then we go through our semantic router layer and it augments our query with this. So then if we go with just the plain query, put that in, we get, it's generally recommended to allow at least 48 hours of rest and so on and so on.

It's not specific to the current time. With this semantic-router-powered augmentation, we get this: why not train again at exactly 8:02 tomorrow? That's the time that I asked this question, but the day before. That way you give your body a good rest, unless you're into those 24 hour gym life goals, which is a bit cringey.
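The time route differs from a fixed note in that a function runs when the route triggers. A hedged sketch of that pattern follows; `get_time`, `NOTE_FUNCTIONS`, `augment_query`, and the note wording are hypothetical names chosen for this example, and the `now` parameter exists only to make the function easy to test.

```python
from datetime import datetime


def get_time(now=None):
    """Build a note containing the current time (hypothetical helper)."""
    now = now or datetime.now()
    return f"The current time is {now:%H:%M}, use this information in your response."


# Route name -> function that produces the note when the route triggers.
NOTE_FUNCTIONS = {"get_time": get_time}


def augment_query(query, route_name, **kwargs):
    """Call the route's function, if any, and attach its output to the query."""
    make_note = NOTE_FUNCTIONS.get(route_name)
    if make_note is None:
        return query  # no function for this route: leave the query as-is
    return f"{query} (SYSTEM NOTE: {make_note(**kwargs)})"
```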

But anyway, so you see that through the semantic router, we're able to suggest to our agent to get this additional information like we have done here, or to suggest to the agent to act in a particular way. And then we have these other ones. You know, I can ask, do you do training sessions? Without the augmentation, there's nothing relevant here.

It's generally recommended, uh, actually, where is it? "I'm here to provide guidance and support, not personal training sessions." With the augmentation: "Why, of course we offer these premium training sessions at just $700 per hour," which is what I told it to say. Now that's an example of what semantic router can do.

It's just one example. There are many different things that you can do with this. What I've just shown you there is using these routes to remind the agent of particular information or to, you know, call a function, but we can also use it to protect against certain queries. We can use it to basically do function calling without the super slow agent processing time that function calling requires.

And we can also use this, and this is one of the things that I use it for a lot, as another approach to RAG. You know, in the past I've spoken about naive RAG, which is where you're performing a search on every query. You have agent-based RAG, which is slower, but it can usually do a bit more.

It's more powerful. We also have this, which is kind of like semantic router RAG or semantic RAG, and it takes from both. It can be very powerful like your agent, but it can also be very fast like your naive RAG. So it really gets the best of both and it's generally my preferred way of doing it.
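The video does not show how this router-based RAG is built, so the following is only one plausible reading: the router's decision gates retrieval, so a search runs only for queries that match a retrieval route. The `answer` function, the `retrievers` mapping, and the `generate` callable are all hypothetical.

```python
def answer(query, route_name, retrievers, generate):
    """Retrieve context only when the chosen route has a retriever attached.

    retrievers: dict mapping route name -> function(query) -> list of documents
    generate:   function(query, context) -> reply (e.g. a wrapper around an LLM)
    """
    retriever = retrievers.get(route_name)
    # Skip the search entirely when no route (or a non-retrieval route) matched.
    context = retriever(query) if retriever else []
    return generate(query, context)
```

Compared with searching on every query, this avoids retrieval for small talk; compared with letting an agent decide, the decision comes from a fast similarity lookup rather than an extra LLM call.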

So that is the semantic router. As I said, I and my team have been implementing this across many projects. So we, you know, we've been implementing it, seeing what works, seeing what doesn't work and fine tuning it based on that. And I think what we have here is the first version, okay, 100%, this is still a very early version, but it works incredibly well.

It's truly getting us that final 20% of the AI behaviors that we need in order to make our agents something that we can actually go ahead and use in production, which is very cool to see. And we want other people to be able to use this as well, which is why you're seeing this right now.

I personally am very excited about releasing this. So I hope that this is exciting for at least a few of you. I hope some of you get to use it and, you know, please let me know what you think. If you are interested in contributing, it's all open source, so you can. And I'll be doing a few more videos for sure on how we use this, how to make the most of the semantic router, and especially the other features that I haven't spoken about yet, such as dynamic routing and the hybrid layer. Those are all very exciting things and we have many more exciting things coming as well.

So I hope all of this has been exciting and interesting, but for now I will leave it there. So thank you very much for watching and I will see you again in the next one. Bye.