back to index

Faster LLM Function Calling — Dynamic Routes


Chapters

0:0 Fast LLM Function Calling
0:56 Semantic Router Setup for LLMs
2:20 Function Calling Schema
4:4 Dynamic Routes for Function Calling
5:51 How we can use Faster Agents

Whisper Transcript | Transcript Only Page

00:00:00.000 | Today, we're going to take a look at dynamic routes
00:00:03.060 | in the semantic router library.
00:00:05.540 | Now, dynamic routes expand what we can do
00:00:08.940 | with this library by quite a lot.
00:00:11.300 | Unlike a static route, a dynamic route is able
00:00:14.340 | to dynamically generate the parameters based
00:00:18.260 | on a particular input that can then be taken
00:00:21.100 | into whatever you want to do with those parameters.
00:00:23.940 | So the main use case here is function calling.
00:00:27.340 | Now, the good thing with dynamic routes is
00:00:30.180 | that they can generate this output,
00:00:32.500 | but they're still very fast, just like our static routes.
00:00:36.300 | So they are fundamentally the same object.
00:00:38.900 | And I think what would be best is to just take a look
00:00:41.980 | at how they differ, which is not by a huge amount.
00:00:46.980 | Okay, so we're going to start in the docs
00:00:49.180 | of the semantic router library.
00:00:51.140 | I'm going to go to dynamic routes,
00:00:53.340 | and I'm just going to open that notebook in Colab.
00:00:56.900 | It's now on version 0.015.
00:01:00.780 | This is actually no longer necessary,
00:01:02.900 | so I need to remove that.
00:01:04.380 | So I'm going to install the library first.
00:01:09.220 | Then I'm going to come down to here,
00:01:10.800 | and I'm going to initialize a static route.
00:01:13.940 | Now, these are just basic static routes.
00:01:16.620 | And the reason we're loading those first is
00:01:18.600 | because we want to see what the difference is
00:01:20.040 | between these and a dynamic route.
00:01:23.060 | So yes, we initialize those.
00:01:26.020 | And then we're going to initialize our route layer.
00:01:28.900 | Now, the initialization of a route layer,
00:01:31.460 | whether you have dynamic or static routes or both,
00:01:34.180 | is exactly the same.
00:01:35.220 | It doesn't change.
00:01:36.580 | And again, we can use Cohere, we can use OpenAI.
00:01:39.360 | There's also now a new FastEmbed encoder as well
00:01:42.580 | if you want to run the embedding part locally.
00:01:46.160 | I'm going to use OpenAI because we will also want
00:01:48.740 | to use the OpenAI LLM as well.
00:01:51.580 | So API key, enter this.
00:01:55.540 | And there we go, okay?
00:01:58.980 | We do also support the Cohere LLM as well.
00:02:02.500 | And soon enough, you will also have local LLMs.
00:02:05.820 | But for now, I'm just going to use OpenAI.
00:02:09.540 | It's the easiest.
00:02:11.260 | Okay, so we can test that.
00:02:13.020 | It's working, and this is purely static routes.
00:02:15.400 | Let's see how we might create a dynamic route.
00:02:19.260 | So here is how we would set up our dynamic route.
00:02:23.620 | We don't need to do it like this directly.
00:02:26.820 | You can actually, sorry, this is the actual definition
00:02:30.500 | of our dynamic route.
00:02:31.820 | What I'm doing before here is creating this,
00:02:35.840 | the function schema that is required for our dynamic route.
00:02:40.000 | So the function schema, I can show you
00:02:43.260 | what that looks like maybe quickly.
00:02:45.980 | So if I run this, it looks like this.
00:02:48.540 | This is our function schema.
00:02:50.460 | Now, what I'm doing here with the get schema function here
00:02:55.460 | is I am taking an existing function
00:03:00.900 | and I'm formatting it in a particular way.
00:03:03.420 | So we're using the sphinx.string format here.
00:03:07.380 | And we're adding a lot of description
00:03:09.180 | as to how exactly this function should actually be used.
00:03:14.180 | So finds the current time in a specific time zone.
00:03:18.500 | Okay, that's like the description.
00:03:19.860 | Okay, what does this function actually do?
00:03:23.420 | We need this for our dynamic route to understand
00:03:26.740 | what this does and how it should be used.
00:03:29.100 | Then we specify, okay, we have our time zone.
00:03:31.660 | The type of our time zone is a string.
00:03:35.060 | And the description for it is this.
00:03:37.100 | Okay, so the time zone to find the current time in.
00:03:40.100 | It should be a valid time zone
00:03:41.420 | from the IANA time zone database.
00:03:45.300 | And then we get some examples.
00:03:47.460 | And then we specify, do not put the place name
00:03:50.380 | like Rome or New York.
00:03:52.700 | You must provide this particular IANA format.
00:03:56.300 | So we do that.
00:03:57.140 | That is then going to, okay, we provide this format
00:04:01.140 | and it's gonna give us a time in that particular place.
00:04:03.660 | Now, we run that function, get time,
00:04:08.260 | that we just created here.
00:04:10.060 | We put that through our get schema function here.
00:04:13.500 | We get our function schema output.
00:04:17.300 | And then this is what defines the difference
00:04:20.540 | between a static route and a dynamic route.
00:04:23.900 | We simply pass in this function schema
00:04:27.580 | to our route definition.
00:04:30.260 | So if I remove that, this is now a static route.
00:04:33.220 | If I add that back in, it's a dynamic route, okay?
00:04:39.020 | And that's all there is to it.
00:04:40.860 | So we have our new dynamic route.
00:04:43.980 | I'm gonna add it to our route layer.
00:04:46.980 | And then I'm going to ask a time-related question, okay?
00:04:51.860 | And that should trigger the time
00:04:54.280 | or the get time dynamic route.
00:04:56.900 | And it should hopefully get the right inputs
00:04:59.780 | for that route, okay?
00:05:03.980 | And we see that it does.
00:05:05.100 | So we have function call
00:05:06.380 | and we have these inputs for our function.
00:05:10.480 | So then I can connect this up
00:05:12.460 | to the function that we created.
00:05:15.500 | So say out equals this.
00:05:18.580 | And I want to say, get time.
00:05:21.900 | And it's out and it's the function call.
00:05:26.420 | And then obviously there's the, this.
00:05:32.280 | Okay, let's see what we get.
00:05:35.280 | Okay, six, 16.
00:05:38.980 | And basically you can expect this to work
00:05:43.940 | with any function call that you'd expect an LLM
00:05:47.500 | to normally be able to handle
00:05:49.460 | because we're using an LLM here.
00:05:51.340 | So what we're really doing
00:05:53.940 | is we're setting up that kind of like agentic workflow
00:05:58.940 | where an agent will decide what to do
00:06:02.180 | and then generate the input for whatever it decides to do.
00:06:06.420 | We're taking away the decision part
00:06:09.220 | on which route to take or which tool to use.
00:06:13.460 | And we're using semantics to make that decision,
00:06:18.020 | but we still rely on the LLM
00:06:21.340 | to generate the function call itself,
00:06:24.460 | which we've seen it does and it does pretty quickly.
00:06:28.340 | Now, that's it for this video.
00:06:31.460 | I hope this has been useful and interesting.
00:06:34.060 | So thank you very much for watching
00:06:36.540 | and I will see you again in the next one.
00:06:39.500 | (upbeat music)
00:06:42.940 | (soft music)
00:06:45.620 | (soft music)
00:06:48.060 | (soft music)