Multi-Agent Systems in OpenAI's Agents SDK | Full Tutorial

Chapters
0:00 OpenAI's Agents SDK
1:38 Python Setup
2:51 Orchestrator Subagent
5:57 Web Search Subagent
11:40 RAG Subagent
17:15 Code Execution Subagent
23:44 Orchestrator Agent
28:44 Evaluating our Multi-Agent Workflow
39:31 Pros and Cons of Orchestrators
00:00:00.000 |
Today, we're going to be taking a look at multi-agent workflows in OpenAI's Agents SDK. 00:00:05.780 |
Now, OpenAI's Agents SDK is the production version of their earlier open source package, Swarm. 00:00:14.120 |
And what Swarm was focused on doing was literally building agent swarms. 00:00:20.100 |
So you could imagine that the successor of OpenAI's agentic Swarm package 00:00:29.380 |
has fairly strong support for multi-agent systems, and that would be accurate. 00:00:35.900 |
Working with multiple agents in the Agents SDK is incredibly easy, works very well, and is generally quite flexible. 00:00:42.740 |
Now, within the SDK, there are two primary approaches that you might take to building a multi-agent system. 00:00:49.820 |
The first of those, which is what we'll be focusing on today, is the orchestrator sub-agent pattern. 00:00:56.420 |
And the other is using agent handoffs, which we will not be covering in this video, but I will talk about in another video. 00:01:04.920 |
So let's begin by taking a look at this orchestrator sub-agent pattern. 00:01:11.820 |
Now, everything we are going to cover is available in a few different places. 00:01:15.940 |
So we have this article on the Aurelio AI site. 00:01:19.180 |
This is a chapter in our upcoming Agents SDK course, and this covers both the orchestrator sub-agent pattern and also using handoffs. 00:01:31.440 |
So you can follow this, or alternatively, and I think this is where most of us will probably go, 00:01:38.540 |
we can go to the Aurelio Labs Agents SDK course, go to chapters, and 0-4 multi-agent. 00:01:45.260 |
In here, we have all the code that I'm going to work through. 00:01:54.560 |
So you would literally click this little button here. 00:01:59.780 |
Run this, and you are set up and ready to go. 00:02:03.540 |
The other way that you can run this is locally. 00:02:06.420 |
So you git clone the repo, and there are setup instructions for how you can do this in the repo readme. 00:02:14.640 |
And I'm actually going to be going and running all this locally, because I already have everything set up. 00:02:19.940 |
And it just looks nicer when everything is on my local code editor. 00:02:27.700 |
So the first thing that we're going to do before we jump into the orchestrator sub-agent is we need to set our API key. 00:02:41.840 |
You need to get this from the OpenAI platform, of course. 00:02:44.680 |
And you just paste your API key in the top here. 00:02:47.600 |
Or I think it's actually below the cell if you're running this in Colab. 00:02:50.980 |
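As a rough sketch, that setup cell is essentially just this (I'm assuming we put the key in the `OPENAI_API_KEY` environment variable, which is where the SDK looks for it by default):

```python
import os
from getpass import getpass

# Use the key from the environment if it's already set, otherwise prompt for it
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") or getpass("OpenAI API key: ")
```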
So orchestrator sub-agent, what does that look like? 00:02:56.220 |
So we can see here, we have the human, then the orchestrator, and then we have these sub-agents below the orchestrator. 00:03:02.200 |
The orchestrator sub-agent pattern I'm talking about here is where we have a main agent, i.e. the orchestrator. 00:03:13.700 |
And essentially everything goes through this orchestrator. 00:03:19.300 |
It is orchestrating everything that is going on when we are interacting with this agent workflow. 00:03:25.560 |
The orchestrator is what we communicate with as a human. 00:03:36.600 |
The orchestrator is going to see that and the orchestrator will decide what to do. 00:03:41.640 |
Will it refer to a sub-agent for some additional information? 00:03:52.960 |
So, for example, let's say our question is, based on NVIDIA's latest earnings, what is their P/E ratio right now? 00:04:08.940 |
In that case, we would want to get the financial data from their latest earnings report using the web search sub-agent. 00:04:17.700 |
We would get that information and then we would use that, pass it over to our code execution sub-agent, which would perform some calculations for us to calculate that ratio. 00:04:29.180 |
And then that would be passed back to the orchestrator and the orchestrator would pass that information back to us. 00:04:34.820 |
Now, alternatively, if you say, hello, how are you, the orchestrator doesn't need to do anything. 00:04:40.200 |
It doesn't need to go to any of these sub-agents. 00:04:42.160 |
So it will say, okay, I'm just going to respond to the human directly. 00:04:46.380 |
There's no need to go down and run anything else here. 00:04:50.620 |
Now, this is what the orchestrator sub-agent pattern looks like. 00:04:55.040 |
Everything is controlled by the orchestrator. 00:04:58.640 |
The sub-agents, they are essentially used as tools for that orchestrator. 00:05:04.700 |
The sub-agents do not respond to the user directly and they only do something when the orchestrator tells them to do so. 00:05:15.960 |
Now, we're going to go ahead and actually build what we can see here. 00:05:23.340 |
The first thing we're going to focus on is building these sub-agents. 00:05:28.320 |
We have the web search sub-agent, which you might have guessed is doing a web search for us. 00:05:38.460 |
We also have the internal docs sub-agent, which is almost like a RAG agent, and I will talk a little more about that soon. 00:05:44.260 |
And then we also have the code execution agent, which I'm mostly restricting to doing calculations for us in this example. 00:05:51.920 |
So let's go ahead and start looking at how we would build those in agents SDK. 00:05:59.500 |
The web search sub-agent will take a query from the orchestrator, which in most cases is going to be some form of the user's query, and use it to search the web. 00:06:09.480 |
The agent is going to collect various sources. 00:06:12.940 |
I think by default, there's something like 10 different sources it will collect. 00:06:16.680 |
It's going to collate the information from those sources and then generate a single text response and pass that back to the orchestrator. 00:06:25.700 |
Now, I considered using OpenAI's built-in web search tool for this, but to be completely honest, it is absolutely terrible. 00:06:36.840 |
So I'm not going to use it, and instead we're going to use another web search API called LinkUp. 00:06:43.440 |
Now, LinkUp does require an account, but they give a certain amount of free credits when you sign up, and you'd basically have to run a lot of searches to use up those free credits. 00:06:56.500 |
So it's more than enough for what we're doing here. 00:06:59.160 |
So let's go ahead and just go through and create a LinkUp account. 00:07:03.700 |
Click on this link here, and you should get sent through to a sign-up page, most likely. 00:07:09.680 |
Once you have signed up, you will get to your homepage here. 00:07:22.780 |
I'm going to run this cell, and I'm just going to enter my API key here. 00:07:27.440 |
Okay, so now let's just test this quickly and see what we get. 00:07:32.240 |
I'm going to search for the latest world news. 00:07:35.320 |
So that is running, and we should get a response fairly quickly there. 00:07:40.220 |
So, yeah, we can see there's a pretty big object here, but we can just parse it out into something a bit more usable here. 00:07:51.220 |
Okay, so I'm just looking at the first three results here. 00:07:57.380 |
Okay, so you see that we have a title for the source. 00:08:02.980 |
We have a link for that source, and then you also have the content from that source as well. 00:08:09.160 |
Now, what I'm doing here is the standard search. 00:08:14.360 |
There is also a deep search, which you would use if you want like really detailed results. 00:08:20.760 |
But I just want quick results here, so I'm going with standard. 00:08:25.320 |
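For reference, that search cell looks roughly like this. This is a sketch against the linkup-sdk client; the exact result field names may differ slightly from what I show here:

```python
import os
from linkup import LinkupClient  # pip install linkup-sdk

linkup = LinkupClient(api_key=os.environ["LINKUP_API_KEY"])

# "standard" is the quick search; "deep" trades speed for more detailed results
response = linkup.search(
    query="latest world news",
    depth="standard",
    output_type="searchResults",
)

# Each result carries a title, a URL, and the content scraped from that page
for result in response.results[:3]:
    print(result.name, result.url, result.content[:200], sep="\n", end="\n\n")
```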
So we have that, and well, that's basically all our tool is going to be doing, at least the web search tool. 00:08:32.080 |
So I'm going to go ahead and create a function tool using the logic that we just put together. 00:08:38.860 |
I am going to be using the async search method here, because in general, if I'm building an AI agent, I want to make sure everything is async. 00:08:50.900 |
Especially those operations where you're waiting for an API because, okay, you're waiting for a response from some API request. 00:08:59.480 |
In that time, if you're not using async, your program is just going to be sat there doing nothing. 00:09:04.820 |
If you instead are writing this asynchronously, your program can go and do other things whilst you are waiting for that API request response. 00:09:16.480 |
As for what we do here, nothing really changes; you just have to await the call and make sure you're using async search within an async function. 00:09:16.480 |
And what we're doing here is we're going through those search results, parsing them out into the format I showed you here, 00:09:31.820 |
and just generating or building a single string from those results, which we then return. 00:09:47.800 |
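So the function tool ends up looking something like this, a sketch assuming the linkup-sdk's async search method and the Agents SDK's function_tool decorator:

```python
import os
from agents import function_tool
from linkup import LinkupClient

linkup = LinkupClient(api_key=os.environ["LINKUP_API_KEY"])

@function_tool
async def search_web(query: str) -> str:
    """Search the web for the query and return a collated string of sources."""
    # Awaiting the async client frees the event loop while the request is in flight
    response = await linkup.async_search(
        query=query,
        depth="standard",
        output_type="searchResults",
    )
    # Build a single string in the title / link / content format shown earlier
    return "\n\n".join(
        f"## {result.name}\n{result.url}\n{result.content}"
        for result in response.results
    )
```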
Now that we have that, let's go and define the agent that is going to be using this tool. 00:09:56.780 |
So we have these instructions for how the agent should behave, how it should use the web search tool. 00:10:09.020 |
The only thing that I am specifying is I'm telling it, okay, once it has the required information, which we get from the tool, 00:10:17.440 |
once we have those results, summarize those with cleanly formatted links, sourcing each bit of information that it uses when it creates the summary. 00:10:26.680 |
And I also ask it to use markdown formatting, because markdown formatting is just nicer to work with. 00:10:35.860 |
And LLMs are generally good at both reading and generating markdown. 00:10:45.020 |
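Putting that together, the agent definition is something like this (the instructions are my paraphrase of what I just described, and the model choice here is just an example):

```python
from agents import Agent

web_search_agent = Agent(
    name="Web Search Agent",
    instructions=(
        "You are a web search agent. Use the search_web tool to find the "
        "information you need, then summarize it with cleanly formatted links, "
        "sourcing each piece of information you use. Respond in markdown."
    ),
    model="gpt-4.1-mini",  # example choice; the exact model isn't critical here
    tools=[search_web],
)
```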
Now, we can talk to our agent and just confirm that this web search works. 00:10:56.100 |
So I'm going to ask it, how is the weather in Tokyo? 00:10:59.820 |
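The call itself is roughly this; the Agents SDK puts the agent's final text in result.final_output:

```python
from agents import Runner
from IPython.display import Markdown, display

# In a notebook we can await directly; in a script, wrap this in asyncio.run
result = await Runner.run(web_search_agent, "How is the weather in Tokyo?")
display(Markdown(result.final_output))  # render the markdown summary and sources
```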
So I get the current weather in Tokyo is around 18 degrees Celsius or 64 degrees Fahrenheit, if you want to be difficult, with partly cloudy skies. 00:11:13.680 |
And we also have the sources down here as well. 00:11:21.420 |
So that's why we have done this display in markdown. 00:11:27.580 |
So if I show you this, this is what it actually looks like. 00:11:40.860 |
So now let's move on to the next subagent, which is our internal docs subagent. 00:11:46.440 |
Now, this is a very common use case, especially in corporate environments, but honestly, just in many, many places. 00:11:57.060 |
So when I say internal docs subagent, what this agent is intended to do is answer questions over a set of private information, right? 00:12:08.420 |
So that could be your own personal private information, or it could be information from the company that you work for. 00:12:15.100 |
It could be all of their internal documentation, it could be your team's wiki page, or something along those lines. 00:12:20.860 |
Stuff that is not on the publicly accessible web, and therefore cannot be answered by a web search agent. 00:12:27.860 |
For these types of documentation or information, we very commonly see people using RAG, which is Retrieval Augmented Generation. 00:12:37.380 |
Now, RAG is an incredibly performant way of augmenting your LLM with external information, i.e. information that your LLM does not already know from its pre-training or fine-tuning. 00:12:55.500 |
It's generally very cost-effective, it's fast, it's a really good approach. 00:13:00.740 |
However, it does require a little bit of setup, despite being relatively simple. 00:13:05.160 |
So we're not going to go through and build an entire RAG pipeline in this example, but instead, I'm just going to create almost a dummy RAG tool, which is going to return a specific document for us. 00:13:19.080 |
So, that document discusses the revenue figures for a wildly successful AI robotics company that we have set up. 00:13:29.460 |
That company is called Skynet, and we have the revenue report here that you can read. 00:13:37.640 |
And you can see here, we've included specific bits of information that only our internal docs sub-agent will be able to give us. 00:13:47.780 |
So we're specifying that this is the Q1 revenue report for 2025, that it was released on April 2nd, 2025. 00:13:58.660 |
And we have a small little executive summary, just tells us, okay, what is Skynet doing? 00:14:08.800 |
So a little table that our LLM is going to be able to read. 00:14:11.640 |
It tells us just, okay, what products do we have? 00:14:16.580 |
The revenue from each of those, and various other little bits of information, okay? 00:14:21.000 |
And then we have some revenue insights, some forward guidance, okay? 00:14:25.900 |
So it's like a nice, really simple revenue report. 00:14:30.020 |
Now, if you are running this in Colab, you should download that document and put it in your docs folder in Colab. 00:14:39.440 |
Alternatively, you can actually download it directly from the repo. 00:14:43.360 |
And what I will do is actually make sure I share a little snippet of code to do that so that you can just pull it directly from the repo. 00:14:50.660 |
Or if you're running this locally and you clone the repo, this will work as it is. 00:14:57.360 |
This is just loading in our revenue report now. 00:15:04.500 |
It's a fake RAG tool, just making that very clear. 00:15:11.420 |
So what it's going to do is when the LLM provides a query to search with, it is basically going to ignore that query because this is a fake search function. 00:15:22.320 |
And it is going to just return that one document, that financial report that I just showed you. 00:15:29.260 |
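As a sketch, the fake RAG tool is just this (the report filename is my own placeholder):

```python
from agents import function_tool

# Load the Skynet Q1 2025 revenue report; the path here is a placeholder
with open("skynet_q1_2025_revenue_report.md") as f:
    revenue_report = f.read()

@function_tool
async def search_internal_docs(query: str) -> str:
    """Search internal company documents for information relevant to the query."""
    # A real RAG tool would embed the query and retrieve matching chunks;
    # this fake version ignores the query and always returns the one report.
    return revenue_report
```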
So now what I want to do is we have our tool, our fake tool, and we're going to define our internal docs subagent. 00:15:40.440 |
Now the internal docs subagent again has very simple instructions here, nothing complicated at all. 00:15:46.900 |
I'm just saying you have access to internal company documents. 00:15:50.780 |
When the user asks you questions about the company, you will use the provided internal docs to answer the question. 00:15:56.040 |
Ensure you answer the question accurately and use markdown formatting. 00:15:59.120 |
Similar instructions as to what we use with the other subagent. 00:16:07.260 |
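So the agent definition is something like this (again, the instructions paraphrase what I just described, and the model is just an example):

```python
from agents import Agent

internal_docs_agent = Agent(
    name="Internal Docs Agent",
    instructions=(
        "You have access to internal company documents. When the user asks "
        "questions about the company, use the provided internal docs to answer "
        "the question accurately, and respond in markdown formatting."
    ),
    model="gpt-4.1-mini",  # example choice
    tools=[search_internal_docs],
)
```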
So I'm going to ask it, what was our revenue in quarter one? 00:16:14.160 |
And you can see that's given us a breakdown of each of the various units and how much revenue they provided. 00:16:22.360 |
And this is probably where that code execution subagent would be really useful. 00:16:28.260 |
Because it can actually take all these, put them together. 00:16:31.260 |
Although at the same time, I believe these numbers are not difficult to work with. 00:16:37.120 |
So our LLM could probably put those together by itself. 00:16:40.480 |
However, LLMs doing calculations is generally a bad idea. 00:16:46.140 |
They just hallucinate quite frequently, even for some relatively simple calculations. 00:16:53.680 |
So it's always better to get your LLM to write code that it can then execute to actually perform the calculation, well, any calculation. 00:17:10.020 |
So let's move on to that code execution subagent. 00:17:15.780 |
Now, in our code execution subagent, for this example, we're focusing on relatively simple calculations. 00:17:23.220 |
That is what we want the code execution agent to do here. 00:17:28.100 |
But especially with more state-of-the-art LLMs, a code execution agent could write pretty good code for a lot of different use cases. 00:17:41.120 |
I'm sure most of us are pretty familiar by now with AI code editors. 00:17:45.320 |
So I don't think I need to explain how good LLMs can be at writing code, especially for specific tasks. 00:17:54.420 |
Which is, of course, what we'd be using the LLM for here. 00:17:57.680 |
But I do like to be careful with what I'm giving an LLM when it comes to code execution, especially within a chat interface. 00:18:09.840 |
So with that in mind, this is the tool that our code execution agent will be using. 00:18:16.280 |
You see that I've also explained this to the LLM; if you provide a docstring to your function tool, that will tell the LLM what this tool should be used for and also how to use the tool. 00:18:31.020 |
And I've specifically told it here that the output must be assigned to a variable called result. 00:18:37.380 |
You could change this to output or something else. 00:18:41.840 |
But I believe that, at least for OpenAI models, they've actually been trained to assign code execution results to a variable called result. 00:18:56.280 |
You can change this to output, but what I saw when doing this is that this subagent would very frequently run this tool and it would write it as if it is writing to a variable called output, even though I prompted it not to. 00:19:11.840 |
And then see that it fails and then do it again and then get it right. 00:19:16.840 |
But why retry if we can just get it right the first time? 00:19:23.660 |
So in here, I'm just printing out the code that the LLM is going to execute, so we can see it. 00:19:32.140 |
We can remove it if we wanted to, but it's just for us to understand what is going on. 00:19:35.720 |
Then, because code execution can either work or fail, 00:19:42.880 |
I am putting this code execution within a try/except block. 00:19:48.320 |
And then within this try/except block, I'm setting the global variables for the execution scope that we're running here. 00:19:59.580 |
So what this is doing is it's basically ensuring we're not running our code with any variables that are coming from this environment or some other environment, depending on where we're running this. 00:20:14.180 |
So we're making sure there are just no variables that already exist within this execution context. 00:20:22.880 |
Then what is going to happen is this code is going to run inside this empty namespace. 00:20:30.480 |
And then the empty namespace will gather all the variables that have been created within that code execution. 00:20:35.480 |
So we can actually get the result by accessing it like this. 00:20:48.620 |
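Putting all of that together, the tool looks roughly like this (a sketch, not the exact notebook code):

```python
from agents import function_tool

@function_tool
def execute_code(code: str) -> str:
    """Execute the provided Python code and return the result.

    The final output of the code MUST be assigned to a variable named `result`.
    """
    print(f"Executing code:\n{code}")  # so we can see what the LLM wrote
    try:
        # An empty dict as the globals means no variables from our own
        # environment leak into the execution scope; exec then fills it with
        # whatever the generated code defines.
        namespace: dict = {}
        exec(code, namespace)
        return str(namespace["result"])
    except Exception as e:
        return f"Code execution failed: {e}"
```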
You will also notice here that this is just a normal synchronous function. 00:20:55.860 |
The reason for that is that there's nothing in here that requires network calls, nothing where our code is likely to be waiting. 00:21:08.520 |
With an API request, you are sending that request and then waiting on the response. 00:21:11.960 |
In this case, everything is being run right here. 00:21:17.000 |
So unless something within this code here causes a network request, there's not really any need, in my opinion, to make this async. 00:21:37.240 |
You can see here that I am using GPT-4.1 rather than 4.1 Mini. 00:21:44.220 |
To be honest, it doesn't really matter for this example. 00:21:47.000 |
But I just want to show you that, one, you can use various LLMs in different parts of your multi-agent setup, of course. 00:21:56.060 |
That's one of the benefits of multi-agent setups. 00:21:59.660 |
And two, I just want to be extra careful because we're executing code. 00:22:05.340 |
So I want, ideally, the best LLM that is reasonably priced and reasonably fast for code execution. 00:22:14.720 |
So that's why I've gone with 4.1 rather than Mini here. 00:22:27.300 |
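So the agent itself is just this (instructions are my paraphrase):

```python
from agents import Agent

code_execution_agent = Agent(
    name="Code Execution Agent",
    instructions=(
        "You are a code execution agent. Write Python code to perform the "
        "calculations needed to answer the user's question and run it with "
        "the execute_code tool, assigning the final output to a variable "
        "named `result`."
    ),
    model="gpt-4.1",  # the stronger model, since we're executing code
    tools=[execute_code],
)
```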
So we can run that and then we can test our sub-agent. 00:22:31.380 |
And I'm just going to ask it a nonsense question. 00:22:33.760 |
But it's a nonsense question that you can apply math to. 00:22:42.720 |
So it's telling me this is what we're printing out from the function. 00:22:53.440 |
Four apples, multiplied by 71.1 bananas. 00:23:02.660 |
It's just multiplying those together, and it stores that result here. 00:23:07.700 |
And then we are, of course, extracting that result out here. 00:23:13.380 |
That information or that result gets returned to our LLM. 00:23:17.360 |
And you can see that this is what we get from that. 00:23:22.620 |
And because we're using an LLM, it is telling us that we're not being very sensible here. 00:23:30.180 |
And it says, okay, the result is a mathematical product, but in real life, you can't multiply apples by bananas. 00:23:37.040 |
So nice little bit of telling us we're not being logical there as well. 00:23:43.580 |
So now we have our three subagents and we can move on to defining our orchestrator. 00:23:48.860 |
So as I mentioned a little bit earlier, the orchestrator is what is going to be controlling 00:23:53.980 |
the inputs and outputs throughout our entire workflow. 00:23:57.800 |
And the way that we can think of our subagents in this system is actually as tools. 00:24:04.040 |
And in fact, the way that we implement our orchestrator connected to all these subagents 00:24:11.560 |
is by turning those subagents into tools and then passing them into the tools parameter of 00:24:23.720 |
our orchestrator agent. I'm defining another tool here, by the way, and we'll see later why, but I 00:24:30.980 |
also want to just show you that we can use regular tools as well as agents as tools here. 00:24:39.240 |
It is just an agent in the same way that we earlier defined our subagents. 00:24:44.700 |
The main difference is one, its name is orchestrator. 00:24:51.680 |
In this case, I'm thinking, okay, this is the reasoning engine, so I want 00:24:58.440 |
this to be a good LLM that is powering the orchestrator. 00:25:01.720 |
To be completely honest, you probably don't need it to be 4.1. 00:25:07.780 |
But that is really up to you and your use case and what you need from it. 00:25:18.480 |
These are the subagents that we have defined. 00:25:20.480 |
And we use this as tool method to turn them into tools. 00:25:24.400 |
Now, when we use this as tool method, we also need to provide a tool name. 00:25:32.020 |
So this doesn't need to be anything complicated; it's just, okay, this is our tool name. 00:25:38.800 |
I'm using the function name, and it's worth noting that you cannot include whitespace. 00:25:47.880 |
So if you want spaces, you need underscores instead. 00:25:53.440 |
Then, yeah, you're just giving it a tool description. 00:25:57.160 |
So this is telling your orchestrator when should it use this agent as a tool. 00:26:01.780 |
So we have those tools and then we also have an actual tool. 00:26:06.520 |
Get current date is literally just a tool to get the current date. 00:26:12.160 |
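In code, that looks something like this; the tool names and descriptions here are illustrative, and I'm assuming the agent variable names from earlier:

```python
from datetime import datetime
from agents import function_tool

@function_tool
def get_current_date() -> str:
    """Get the current date."""
    return datetime.now().strftime("%Y-%m-%d")

# Sub-agents become tools via as_tool; tool names must not contain whitespace
web_search_tool = web_search_agent.as_tool(
    tool_name="web_search_agent",
    tool_description="Search the web for up-to-date public information.",
)
internal_docs_tool = internal_docs_agent.as_tool(
    tool_name="internal_docs_agent",
    tool_description="Answer questions using internal company documents.",
)
code_execution_tool = code_execution_agent.as_tool(
    tool_name="code_execution_agent",
    tool_description="Write and execute Python code for calculations.",
)
```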
We also have the orchestrator prompt that is just above. 00:26:17.080 |
So what I'm trying to do in this orchestrator prompt is give the orchestrator LLM context as 00:26:27.140 |
to where it is. It needs to know what type of system it is in. 00:26:32.380 |
And the reason we do that is if it knows what sort of system it is in, it will better understand 00:26:43.040 |
what it is supposed to do and, you know, what all these things that are around it are. 00:26:47.380 |
So we're just giving it context so that it can operate better. 00:26:50.400 |
We also tell it, okay, you're in the system and this is how you should operate. 00:26:56.900 |
So what we're saying is: you take the user's queries and pass them to the appropriate agent tools. 00:27:01.920 |
The agent tools will see the input you provide and use it to get all the information that you need. 00:27:08.580 |
We also want to say, like in my earlier example, where we were using the web search tool followed 00:27:17.200 |
by the calculator tool, we also want to explicitly tell the orchestrator that it can call multiple 00:27:25.920 |
agents, okay, to get all the information it needs. 00:27:28.980 |
Then at the very end here, one thing that we just want to be very clear about with the LLM 00:27:36.420 |
is that it shouldn't be drawing attention to the fact that this is a multi-agent system. 00:27:41.080 |
In some cases, maybe you would want that, but in this case, I want to build a conversational 00:27:48.300 |
interface. I want users to come in and talk, and all they really see is, okay, there's some chat interface 00:27:54.720 |
here and I'm just talking and I have no idea what is really behind the scenes. 00:27:59.040 |
Maybe there's some information like, oh, I've got this from the web, you know, there's a source, 00:28:03.960 |
or I got this from the internal documents that I have access to. 00:28:08.040 |
Maybe we want a little bit of that, but I don't really want the user to be told by our 00:28:14.320 |
orchestrator, hey, I just need to go and use the internal docs sub-agent because that's where I find this information. 00:28:24.220 |
I don't really want it to go into that much depth as to what it's doing. 00:28:30.780 |
So that is why we use this last sentence of do not mention or draw attention to the fact that this 00:28:40.080 |
is a multi-agent system in your conversation with the user. 00:28:47.580 |
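So, as a sketch, the orchestrator ends up looking something like this (the prompt is a paraphrase of what I just described):

```python
from agents import Agent

ORCHESTRATOR_PROMPT = (
    "You are the orchestrator of a multi-agent system. Take the user's queries "
    "and pass them to the appropriate agent tools. The agent tools will see the "
    "input you provide and use it to get all the information you need; you can "
    "call multiple agents to do this. Do not mention or draw attention to the "
    "fact that this is a multi-agent system in your conversation with the user."
)

orchestrator = Agent(
    name="Orchestrator",
    instructions=ORCHESTRATOR_PROMPT,
    model="gpt-4.1",  # a stronger model, since this is our reasoning engine
    tools=[web_search_tool, internal_docs_tool, code_execution_tool, get_current_date],
)
```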
And now if I run both, we can go ahead and just test our agent. 00:28:55.300 |
So I'm going to say first, how long ago from today was it when we got our last revenue report? 00:29:03.120 |
So there are a couple of things that need to happen here. 00:29:07.380 |
So the orchestrator is going to need to find out, okay, what is the current day, which it 00:29:18.780 |
can do with the get current date tool. Then once it has the current date, the orchestrator needs to find out when the last revenue report was released. 00:29:27.700 |
Now, if it is prompted well, the agent should understand, okay, although we didn't 00:29:37.380 |
explicitly say so, that this is about the company we belong to. 00:29:42.000 |
If we're talking to this internal company agent that has access to these internal company 00:29:48.580 |
documents, probably the user is asking about that specific company and not just when were 00:29:57.080 |
the last revenue reports in the entire world released. 00:30:00.220 |
So hopefully we should see that it doesn't use the web search tool to find whenever the 00:30:05.160 |
last revenue report in the entire world was released. 00:30:07.740 |
But instead, it should go into that internal docs subagent and get the information from there. 00:30:15.140 |
So let's run that and we'll see what happens. 00:30:17.480 |
Now, one thing that is kind of hard to see here. 00:30:27.660 |
And it is showing that, okay, today is May 7, 2025. 00:30:34.360 |
It is saying the last revenue report was for the quarter ending May 31st, 2025. 00:30:43.480 |
It doesn't pick up on the April 2nd, but that might just be due to my question not being specific 00:30:50.940 |
enough about when the last revenue report was released versus when the end date for that last quarter was. 00:31:02.420 |
We can see it's using the correct tools and the correct information from various places. 00:31:06.840 |
But I don't actually know that that is the case. 00:31:15.660 |
I just know this information is coming from somewhere. 00:31:20.240 |
And this is particularly important when we have more complex agents, where there's information coming from many different places. 00:31:26.340 |
And we as developers might not necessarily know what all of the correct information is. 00:31:32.180 |
So what we can do in this scenario to have more insight into what has just happened, we 00:31:39.180 |
can go to the traces dashboard in the OpenAI platform. 00:31:42.780 |
So to do that, I'm going to go to platform.openai.com. 00:31:50.240 |
Then I need to make sure I'm in the correct project. 00:31:52.740 |
Okay, so I am in the correct project. 00:31:58.120 |
And you will need to go to dashboard on the right here. 00:32:10.480 |
If you're in a company and you're accessing the company's traces, there's a fairly good 00:32:15.060 |
chance that maybe you can't see anything here. 00:32:19.520 |
The reason for that is that the company administrator or owner needs to go in here and 00:32:26.860 |
set the permissions for you to actually see their traces or logs dashboard, which they can 00:32:34.900 |
do by going over to the settings, organization settings, data controls, and making sure that 00:32:40.800 |
the logs here, which include the traces, are visible, either for selected projects, for everyone, 00:32:48.000 |
or whatever setting works and makes this visible for you. 00:32:51.640 |
So once you can see everything in your dashboard or traces dashboard specifically, you can go 00:32:59.280 |
to your most recent trace, which should ideally be the one that you just ran. 00:33:04.240 |
And we can see, okay, we have agent workflow. 00:33:07.480 |
That's good because we didn't build handoffs into our workflow. 00:33:10.300 |
We can see the number of tools that we use, which is three. 00:33:15.860 |
And we can see the execution time was 13.73 seconds, which is long, but that is, we did 00:33:22.600 |
use an extra tool or one more tool than we needed here. 00:33:25.140 |
So let's go into this and see why that happened or at least have an idea of why that happened. 00:33:33.380 |
So we can see in here, we went in, so we had the orchestrator, started here. 00:33:39.720 |
Then we went to this web search agent, and this took the majority of the time. 00:33:44.360 |
It's like nine seconds, which is pretty long. 00:33:48.480 |
And if we look at that, we can see that we had a POST to, this is OpenAI's v1/responses, the Responses API. 00:34:02.040 |
And we can see that the input here, this is coming from the orchestrator, not actually the user. 00:34:08.800 |
So the orchestrator is providing that query, date of last revenue report, and the LLM, based on this message, has gone and decided, okay, we need to use the search web tool. 00:34:08.800 |
And we're going to provide it with this query, which is last revenue report date, okay? 00:34:25.140 |
And this is, okay, we can see straight away, this has gone to the search web tool. 00:34:30.180 |
So this is like, okay, we need to prompt a little better here in order to make it clearer to this agent, or to our orchestrator agent, that this is an agent that might be used by a particular company. 00:34:42.760 |
And usually company questions or revenue questions and so on would be about the internal docs rather than the web, okay? 00:34:51.700 |
And that will also explain why it told us that the revenue date was the 31st of May, okay? 00:34:58.360 |
So, okay, we can see that was sent to the search web tool. 00:35:05.340 |
And based on that, it's saying, okay, this is the current date. 00:35:08.580 |
I know quarter one ends on the 31st of March. 00:35:12.440 |
So it calculates how long ago that was after looking at the get current date tool here, okay? 00:35:22.940 |
And this is outside the web search agent, sorry. 00:35:25.260 |
So web search agent returned to the orchestrator. 00:35:31.420 |
This output here, the date of the last revenue report, is coming from the web search agent. 00:35:36.140 |
Then the orchestrator decided, okay, I need to use the get current date tool. 00:35:41.780 |
It got that, got the output, and then it generated our final response. 00:35:48.240 |
So we need to prompt it a little better, more likely than not, at least. 00:35:52.220 |
So what we could do here is say, okay, you're the orchestrator of a multi-agent system. 00:36:00.480 |
Note that you are an assistant for the Skynet company. 00:36:11.760 |
If the user asks about company information or finances, 00:36:24.500 |
you should use our internal information rather than public information. 00:36:38.760 |
And this should be enough to guide our orchestrator in a bit of a better direction. 00:36:44.120 |
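As a sketch, the updated orchestrator would be redefined with the new sentences appended to the prompt:

```python
from agents import Agent

orchestrator = Agent(
    name="Orchestrator",
    instructions=ORCHESTRATOR_PROMPT + (
        "\n\nNote that you are an assistant for the Skynet company. If the user "
        "asks about company information or finances, you should use our internal "
        "information rather than public information."
    ),
    model="gpt-4.1",
    tools=[web_search_tool, internal_docs_tool, code_execution_tool, get_current_date],
)
```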
So let's just try again and see what happens. 00:36:46.060 |
Okay, and now we can see it is actually getting that right. 00:36:58.520 |
This is our latest run, 8.5 seconds, much faster. 00:37:03.020 |
And we can see that it went to the internal docs agent. 00:37:11.340 |
And then this is the call from the LLM back up to the orchestrator. 00:37:26.060 |
So this is what the internal docs agent or subagent provided back to the orchestrator. 00:37:34.040 |
The final response, which would be from here, is actually this. 00:37:40.240 |
Okay, exactly the same as what we have in the notebook. 00:37:47.320 |
I'm going to say what is our current revenue and what percentage of revenue comes from the T1000 units. 00:37:57.680 |
We'll see what tools or what agents, subagents, sorry, it decides to use this time. 00:38:13.540 |
Let's switch across to our traces and see what happened. 00:38:20.880 |
You see that it actually tried to use this internal docs agent twice. 00:38:20.880 |
Probably it's trying to find some information that is not within like the dummy tool. 00:38:28.480 |
So it's trying again, and then it probably realizes, oh, okay, this is useless. 00:38:28.480 |
Then it decided it wants to find the percentage of revenue. 00:38:47.800 |
So rather than calculating it itself, it's actually trying to search through our internal docs. 00:38:47.800 |
Which obviously slowed it down a little bit there. 00:38:59.800 |
And then in the end, it did not try to use the calculator. 00:38:59.800 |
It came up with this for the response, which I believe is accurate anyway. 00:39:13.800 |
It could have used the code execution sub-agent. 00:39:25.280 |
And again, this is something where we probably want to prompt it a little better. 00:39:31.020 |
So that is actually all I wanted to go through on the orchestrator sub-agent multi-agent workflow. 00:39:41.240 |
As we've seen, there is a lot you can do with this, of course. 00:39:45.400 |
What I just showed you is a relatively simple pattern. 00:39:52.380 |
There is definitely an argument to be made: do we need sub-agents for these simple tasks? 00:39:59.380 |
In this case, probably, potentially not, depending on what you're looking to do. 00:40:06.160 |
I would say maybe for code execution, you should use a separate sub-agent. 00:40:12.240 |
If you actually have an internal docs use case, it might benefit you to have that sub-agent 00:40:19.720 |
because then you can prompt that sub-agent with additional context and information about 00:40:25.520 |
how to use that internal docs tool, which can be really useful in just getting better results. 00:40:33.080 |
And the same is also true for the web search sub-agent. 00:40:36.080 |
You can prompt it and give it more information about how to get the best results from your web search tool. 00:40:42.160 |
So it really depends on what you're looking to do, how important latency is. 00:40:48.480 |
One thing to be aware of with the orchestrator sub-agent pattern is that everything is going through the orchestrator. 00:40:59.320 |
So in the scenario that you only need a single sub-agent to be used, let's say for a web search, 00:41:06.480 |
the orchestrator sub-agent pattern is not ideal because your user query goes from your user to the orchestrator. 00:41:15.180 |
The orchestrator then needs to decide to use the web search sub-agent, which is then another LLM call. 00:41:23.660 |
And then that other LLM is going to create the web search tool call, get that response. 00:41:30.780 |
That LLM is going to generate another response, send it to your orchestrator, and then the orchestrator 00:41:36.700 |
is going to generate yet another response and send that back to the person. 00:41:44.980 |
We just had four different LLM generation steps. 00:41:49.160 |
Whereas if it was the orchestrator, let's call it the main agent in this scenario, going directly 00:41:55.740 |
to a web search tool, it would be: the orchestrator generates the tool call that goes to your web search tool. 00:42:05.660 |
The web search tool returns its response to the orchestrator. 00:42:08.680 |
The orchestrator generates a response based on that information and sends it back to the user. That's just two LLM generation steps. 00:42:15.840 |
Naturally, given that LLM calls tend to make up the bulk of our waiting time or our latency, 00:42:25.860 |
the orchestrator sub-agent pattern, in these scenarios where we're just expecting a single tool call, is going to be slower. 00:42:35.320 |
However, in this scenario where you do need these sub-agents because they just handle particular 00:42:42.000 |
tasks better than you can do with a generic all-purpose agent, in that scenario, and for queries 00:42:49.480 |
that require more than a single sub-agent to be used, the orchestrator sub-agent pattern works very well. 00:42:57.500 |
The other option that you might consider is where you have an orchestrator with sub-agents, 00:43:04.320 |
but then sub-agents can respond directly to the user. 00:43:07.880 |
In that scenario, again, it becomes more difficult to use multiple sub-agents. 00:43:14.540 |
You could have the sub-agent look at the information that has been provided and decide whether to 00:43:19.320 |
respond directly to the user or the orchestrator. 00:43:21.540 |
That is completely possible, but you need to prompt it well and make sure all of that is going to work. 00:43:27.900 |
So, this pattern can be good, but you do need to be careful with the latency, and it is generally 00:43:33.880 |
better for those cases where latency is not super, super important. 00:43:39.720 |
Although, that being said, you can still make it conversational. 00:43:43.160 |
You just need to be smart about how many tokens you're using, what tools you're using, and which LLMs you're using where. 00:43:55.480 |
So, that is all I wanted to cover in this video. 00:44:01.920 |
I hope all this has been useful and interesting, but for now, I'll leave it there.