Just do it. (let your tools think for themselves) - Robert Chandler

Chapters
0:00 Introduction
1:40 Context pollution
2:18 Authentication
2:56 Agency
3:35 MCP Toolbox
6:15 Summary
00:00:03.560 |
And at Wordware, I've personally helped hundreds of teams 00:00:08.280 |
I'm here to share a few of the insights that we got, 00:00:22.380 |
And really, building highly reliable systems is in my blood. 00:00:30.360 |
that can take action in the real world on your behalf. 00:00:32.720 |
They have all the context they need about you and your team. 00:00:36.440 |
And they have the ability to actually interact 00:00:38.400 |
with the tools you use and output kind of data 00:00:42.120 |
Unfortunately, most of the time they don't really work. 00:00:45.860 |
They're often slow, expensive, and unreliable. 00:00:50.320 |
I remember an example when MCP first came out 00:00:52.540 |
and we hooked up Slack and it spent a bunch of time, 00:00:56.240 |
you know, I just want to send a Slack message to Philip, 00:01:00.820 |
Unfortunately, it then like listed all the users 00:01:05.940 |
tried listing all the channels, tried sending a message, 00:01:22.820 |
And the real problem is that these MCPs are often low level wrappers 00:01:26.280 |
around these APIs that were not designed for language models. 00:01:29.780 |
You know, you get these messy responses that have huge blobs of JSON, 00:01:33.200 |
which are great for like deterministic state machines, 00:01:35.780 |
but kind of suck for agents, where they become context pollution. 00:01:40.980 |
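The context pollution point above can be sketched in a few lines. This is a minimal illustration, not any real MCP server's code: `RAW_RESPONSE` is a made-up payload shaped like a typical user listing, and `summarize_members` is a hypothetical helper that keeps only what an agent would need to read.

```python
import json

# Hypothetical raw payload, shaped like a typical chat API user listing.
RAW_RESPONSE = {
    "ok": True,
    "members": [
        {"id": "U1", "name": "philip", "real_name": "Philip Example",
         "profile": {"avatar_hash": "abc", "status_text": "", "fields": {}},
         "is_bot": False, "deleted": False, "tz_offset": 3600},
        {"id": "U2", "name": "buildbot", "real_name": "Build Bot",
         "profile": {"avatar_hash": "def", "status_text": "", "fields": {}},
         "is_bot": True, "deleted": False, "tz_offset": 0},
    ],
    "response_metadata": {"next_cursor": ""},
}


def summarize_members(raw: dict) -> str:
    """Reduce a verbose member listing to one short line per human user."""
    lines = []
    for member in raw.get("members", []):
        # Bots and deleted accounts are noise for "message a person" tasks.
        if member.get("is_bot") or member.get("deleted"):
            continue
        lines.append(f"{member['real_name']} (id={member['id']})")
    return "\n".join(lines)


compact = summarize_members(RAW_RESPONSE)
print(compact)
print(len(json.dumps(RAW_RESPONSE)), "chars raw vs", len(compact), "chars compact")
```

The agent only ever sees the compact string, so the JSON blob never enters its context window.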
You get tools that have this tiny scope, you know. 00:01:44.020 |
Most of the MCP tools are just a wrapper around a function, 00:01:47.220 |
and functions were designed for the programmatic world, 00:01:49.320 |
where you want to compose a lot of these tasks together 00:01:55.220 |
It's really hard for an LLM to continue reasoning over multiple calls 00:01:58.420 |
while polluting its context with all the different outputs. 00:02:01.300 |
It's also a problem when you've got multi-call pagination. 00:02:04.120 |
You know, when the API responds and you need to kind of loop over the results 00:02:11.420 |
but it also means that the LLM has to reason over longer and longer chains of requests. 00:02:20.380 |
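One way to picture the pagination point: the cursor loop runs in ordinary code inside the tool, and the agent makes a single call. This is a sketch under stated assumptions; `fetch_page` and the `_PAGES` data are hypothetical stand-ins for a real paginated API.

```python
from typing import Iterator, Optional

# Hypothetical paged data standing in for a remote API.
_PAGES = {
    None: {"items": ["general", "random"], "next_cursor": "p2"},
    "p2": {"items": ["eng", "design"], "next_cursor": "p3"},
    "p3": {"items": ["support"], "next_cursor": None},
}


def fetch_page(cursor: Optional[str]) -> dict:
    """Stand-in for one paginated API request (hypothetical)."""
    return _PAGES[cursor]


def iter_all_items() -> Iterator[str]:
    """Follow cursors until exhausted, so the caller never sees a cursor."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page["next_cursor"]
        if cursor is None:
            break


def list_channels_tool() -> str:
    """Single tool call: the loop runs in code, the agent gets one short answer."""
    return ", ".join(sorted(iter_all_items()))


print(list_channels_tool())
```

Three API requests happen here, but the model reasons over one tool call and one line of output.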
You know, it's got a little bit easier with these hosted MCPs, 00:02:22.460 |
but still, a lot of the time, you need to have your own API keys. 00:02:26.380 |
You need to be like modifying, like creating bots and things. 00:02:30.300 |
I'm sure that will go away over the next few months, 00:02:33.580 |
And yeah, just in general, the agents struggle when there's many tools 00:02:40.220 |
It's really hard, you know, every tool you add adds more noise to the context window. 00:02:44.300 |
A lot of instructions, even just adding a Slack MCP adds eight different tools. 00:02:47.420 |
If you add Notion, you add another like 20 different tools. 00:02:50.300 |
And those two together, you can do a lot, but it's not like the be-all and end-all of automation. 00:02:58.220 |
Well, in my opinion, we add more agency to the tools. 00:03:02.140 |
Rather than making these tools very small, think a bit like, you know, 00:03:06.060 |
a T-Rex holding a little tiny spanner or like Inspector Gadget with like a thousand different tools. 00:03:13.180 |
Think of it a bit more like a team of Avengers where, you know, you've got specialized people for different tasks. 00:03:21.820 |
You've got Hawkeye to fire the arrow off and really do high precision tasks. 00:03:28.940 |
And, you know, obviously we all love Iron Man. 00:03:31.340 |
He's the best and he's just pretty good at a lot of things. 00:03:35.660 |
I'm not sure where this analogy is going, but I'm sure it's an entertaining one. 00:03:38.780 |
But really what we want to do is blur the line between what's a tool and what's an agent. 00:03:44.060 |
When is an agent just a tool for another agent? 00:03:47.820 |
And, you know, give tidy, simple, natural language APIs to these agents such that they get reliable, reusable, high quality outputs. 00:03:57.740 |
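As a rough sketch of what "more agency in the tool" could look like for the Slack example from earlier: one high-level tool takes a name and a message, does the lookup itself, and answers in plain language. Everything here is hypothetical (`FakeChat`, `send_message`); it is not Wordware's or Slack's API.

```python
from dataclasses import dataclass, field


@dataclass
class FakeChat:
    """In-memory stand-in for a chat API (hypothetical, for illustration)."""
    users: dict = field(default_factory=lambda: {"U1": "Philip", "U2": "Ana"})
    sent: list = field(default_factory=list)

    def list_users(self) -> dict:
        return dict(self.users)

    def post(self, user_id: str, text: str) -> None:
        self.sent.append((user_id, text))


def send_message(chat: FakeChat, recipient: str, text: str) -> str:
    """One high-level tool: resolve the name, send, and report in plain language."""
    wanted = recipient.strip().lower()
    matches = [uid for uid, name in chat.list_users().items()
               if name.lower() == wanted]
    if not matches:
        return f"No user named {recipient!r} was found; nothing was sent."
    if len(matches) > 1:
        return f"Several users are named {recipient!r}; please be more specific."
    chat.post(matches[0], text)
    return f"Sent to {recipient}."


chat = FakeChat()
print(send_message(chat, "Philip", "Lunch at 1?"))
print(send_message(chat, "Zed", "Hello"))
```

The calling agent never lists users or channels itself; it gets a one-sentence result, including a readable failure message it can act on.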
What I'm going to do is I'm going to demonstrate Wordware's new MCP toolbox. 00:04:04.780 |
You can turn your Wordware workflows into tools for your agents. 00:04:08.860 |
And so I'm just going to grab one from the landing page as an example. 00:04:14.220 |
And I'm picking this kind of competitor analysis because that's a flow that requires quite a lot of taste, quite a lot of reasoning, and also integration into both Twitter and Notion. 00:04:24.900 |
Rather than, you know, finding a Twitter MCP, I just use the kind of Twitter scrape tool that's built into Wordware. 00:04:31.660 |
And then I've described what I really want from my competitor analysis. 00:04:35.780 |
It's not just a generic whatever the LLM thinks. 00:04:38.800 |
I've kind of gone into detail about what I care about. 00:04:41.520 |
And I could add even more details about my company and try and work out, you know, where do we differ? 00:04:46.160 |
It then creates this analysis, writes the output to Notion, and then returns the URL in the output. 00:04:56.400 |
So still in the early days, but yeah, we are rolling this out beyond beta fairly soon. 00:05:05.280 |
I just added the competitor analysis after publishing this app earlier. 00:05:10.640 |
And now I can use this tool inside my Claude. 00:05:16.660 |
And what's nice about Wordware is you can add multiple tools into this toolbox. 00:05:19.340 |
So you can have a bunch of different tools that are grouped together that are all related or entirely disparate. 00:05:24.080 |
But you can switch on and off different toolboxes for different tasks. 00:05:26.640 |
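The toolbox idea (groups of tools that can be switched on and off per task) can be sketched with a small registry. This is an assumption-laden illustration, not how Wordware implements it; `ToolboxRegistry` and the tool names are hypothetical.

```python
from typing import Callable, Dict, List, Set


class ToolboxRegistry:
    """Hypothetical registry: named groups of tools that can be toggled."""

    def __init__(self) -> None:
        self._toolboxes: Dict[str, Dict[str, Callable]] = {}
        self._enabled: Set[str] = set()

    def add_tool(self, toolbox: str, name: str, fn: Callable) -> None:
        """Register a tool under a toolbox, creating the toolbox if needed."""
        self._toolboxes.setdefault(toolbox, {})[name] = fn

    def enable(self, toolbox: str) -> None:
        if toolbox not in self._toolboxes:
            raise KeyError(toolbox)
        self._enabled.add(toolbox)

    def disable(self, toolbox: str) -> None:
        self._enabled.discard(toolbox)

    def visible_tools(self) -> List[str]:
        """Only tools from enabled toolboxes are exposed to the agent."""
        return sorted(
            name for box in self._enabled for name in self._toolboxes[box]
        )


registry = ToolboxRegistry()
registry.add_tool("research", "competitor_analysis", lambda company: company)
registry.add_tool("chat", "send_message", lambda text: text)
registry.enable("research")
print(registry.visible_tools())
```

Only the enabled toolbox contributes tool descriptions, which keeps the agent's context small for the task at hand.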
But maybe let's do something like create a competitor analysis for Anthropic AI. 00:05:33.180 |
I hit this and now you can see it's going to use the Wordware tool. 00:05:39.820 |
You know, there's nothing too bad that can go wrong here. 00:05:43.600 |
And now it's going to perform this competitor analysis. 00:05:54.480 |
I can grab the link to the Notion page on the competitor analysis. 00:05:58.720 |
And we'll see a nicely formatted summary based on all the tweets from Anthropic. 00:06:03.980 |
And we can see they care a lot about how they're tweeting. 00:06:08.520 |
And it's all in my Notion page, nicely formatted. 00:06:16.700 |
We managed to build a highly reliable, highly repeatable, and highly aligned tool that allows our generic agent to be very specific and very powerful for doing that task that we wanted it to do. 00:06:28.140 |
And so we've really blurred the line between what's an agent and what's a tool and allowed our agent to offload tasks to something that's more powerful. 00:06:35.260 |
Exactly how, you know, we already do this in teams, where you have specialist people, you know, whether it's the Avengers or your team in a company. 00:06:42.360 |
You can use Wordware Toolbox to build these flows. 00:06:46.540 |
We hope you follow this pattern and give your tools time to think.