- Hi, I'm Robert. I'm the co-founder and CTO at Wordware, where I've personally helped hundreds of teams build reliable AI agents. I'm here to share a few of the insights we've gained, especially when it comes to tools. Really, agentic MCPs: giving your tools time to think.
Before I worked on LLMs and agents, I worked on self-driving cars, so building highly reliable systems is in my blood. So, here we go. The promise of agents is automated systems that can take action in the real world on your behalf. They have all the context they need about you and your team.
And they have the ability to actually interact with the tools you use and output data where you need it. Unfortunately, most of the time they don't really work. They're often slow, expensive, and unreliable. I remember an example from when MCP first came out and we hooked up Slack. All I wanted was to send a Slack message to Philip saying, hey, I'm using MCP, it's super cool.
Instead, the agent listed all the users in the Slack workspace, got confused, tried listing all the channels, tried sending a message, and finally found Philip. It ended up resorting to posting in the general channel: hey, could someone tell Philip MCP is awesome?
Which I thought was kind of amusing, but really not what I wanted as a user. It also took about five minutes. The real problem is that these MCPs are often low-level wrappers around APIs that were not designed for language models. You get messy responses with huge blobs of JSON, which are great for deterministic state machines but cause serious context pollution for agents.
You also get tools with tiny scope. Most MCP tools are just wrappers around single functions, and functions were designed for the programmatic world, where you compose tasks into sequences of function calls. It's really hard for an LLM to keep reasoning over multiple calls while its context fills up with all the different outputs.
Multi-call pagination is another problem: the API responds a page at a time, and you have to loop over the results until you find the data you're looking for. This pollutes the context window, and it also means the LLM has to reason over longer and longer chains of requests.
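The pagination problem can be sketched in a few lines of Python. Everything here is hypothetical (`fetch_page` stands in for any cursor-based API, and `find_user` is an illustrative wrapper, not a real MCP tool); the point is that the wrapper loops over pages itself, so the agent makes one call and gets one clean answer instead of reasoning over a chain of raw JSON pages:

```python
def fetch_page(cursor=0, page_size=2):
    """Fake paginated API: returns one page of users plus the next cursor,
    the way many real list endpoints do."""
    users = ["alice", "bob", "carol", "dave", "philip"]
    page = users[cursor:cursor + page_size]
    next_cursor = cursor + page_size if cursor + page_size < len(users) else None
    return {"items": page, "next_cursor": next_cursor}

def find_user(name):
    """Agentic wrapper: handles the pagination loop internally and returns
    a single tidy result, so the LLM never sees the intermediate pages."""
    cursor = 0
    while cursor is not None:
        page = fetch_page(cursor)
        if name in page["items"]:
            return {"found": True, "user": name}
        cursor = page["next_cursor"]
    return {"found": False, "user": name}
```

With the raw API, the agent would need three tool calls (and three blobs of JSON in context) to find "philip"; with the wrapper, it needs one.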
Authentication is a pain. It's gotten a little easier with hosted MCPs, but a lot of the time you still need your own API keys and you're stuck creating and configuring bots. I'm sure that will go away over the next few months, but right now it's a pain.
And in general, agents struggle when there are many tools or long sequences of tools to run. Every tool you add puts more noise and more instructions into the context window. Just adding a Slack MCP adds eight different tools; adding Notion adds another twenty or so.
And those two together let you do a lot, but they're not the be-all and end-all of automation. So how do we solve this? Well, in my opinion, we add more agency to the tools. Rather than making these tools tiny, like a T-Rex holding a little spanner, or Inspector Gadget with a thousand different gadgets.
Think of it a bit more like a team of Avengers, where you've got specialists for different tasks. You've got the Hulk to smash. You've got Hawkeye to fire arrows and handle the high-precision work. And, you know, obviously we all love Iron Man.
He's the best, and he's pretty good at a lot of things. Maybe that's the main agent, who knows? I'm not sure where this analogy is going, but I'm sure it's an entertaining one. Really, what we want to do is blur the line between what's a tool and what's an agent.
When is an agent just a tool for another agent? Give these agents tidy, simple, natural-language APIs so they produce reliable, reusable, high-quality outputs. What I'm going to do now is demonstrate Wordware's new MCP toolbox, which lets you build agentic MCPs.
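As a rough sketch of that "agent as a tool" pattern (all names here are hypothetical illustrations, not the Wordware API): the outer agent sees one coarse-grained tool with a natural-language input and a single tidy output, while the inner agent or workflow does all the messy multi-step work behind it:

```python
def competitor_analysis(request: str) -> dict:
    """Stand-in for an agentic tool. In a real system, an inner agent or
    workflow would run here (scrape tweets, reason over them, write a
    Notion page). The outer agent only ever sees the tidy result."""
    # ... inner agent/workflow would run here ...
    return {
        "status": "done",
        "title": f"Competitor analysis: {request}",
        "notion_url": "https://notion.so/hypothetical-page",  # placeholder
    }

# The outer agent's tool surface stays tiny: one natural-language string
# in, one structured result out -- instead of eight low-level Slack tools
# plus twenty Notion tools polluting the context window.
TOOLS = {
    "competitor_analysis": {
        "description": "Run a full competitor analysis and save it to Notion.",
        "fn": competitor_analysis,
    }
}
```

The design point is the interface, not the internals: the inner workflow can be arbitrarily complex, but the contract the calling agent reasons about is one sentence in, one URL out.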
You can turn your Wordware workflows into tools for your agents. I'm just going to grab one from the landing page as an example. I'm picking this competitor analysis because it's a flow that requires quite a lot of taste and reasoning, plus integrations with both Twitter and Notion.
Rather than finding a Twitter MCP, I just used the Twitter scrape tool in Wordware, and then I described what I really want from my competitor analysis. It's not just whatever generic output the LLM comes up with; it goes into detail about what I actually care about.
And I could add even more detail about my company to work out where we differ. The workflow then creates the analysis, writes the output to Notion, and returns the URL. So I can easily do this: I go to mcp.beta.wordware.ai.
It's still early days, but we're rolling this out beyond beta fairly soon. Here's a toolbox I created earlier; I added the competitor analysis after publishing the app. I can add that to Claude, and now I can use this tool inside Claude.
And what's nice about Wordware is that you can add multiple tools to a toolbox. So you can group together a bunch of tools that are related, or entirely disparate ones, and switch different toolboxes on and off for different tasks. But let's do something like: create a competitor analysis for Anthropic.
I hit enter, and now you can see it's going to use the Wordware tool. I can allow it once or allow it always. There's nothing too bad that can go wrong here, so I'm just going to let it go. And now it's going to run this competitor analysis.
Here's one I made earlier. Cool, so now that's done. I can grab the link to the Notion page with the competitor analysis and open it up, and we see a nicely formatted summary based on all the tweets from Anthropic. We can see they care a lot about how they tweet.
And so we can learn from their style. It's all in my Notion page, nicely formatted, exactly where I'd want to find it again, not just lost in the chat history. So, pretty exciting: we built a highly reliable, highly repeatable, and well-aligned tool that lets our generic agent be very specific and very powerful at the task we wanted done.
And so we've really blurred the line between what's an agent and what's a tool, and allowed our agent to offload tasks to something more specialized, exactly as we already do in human teams with specialists for different roles, whether it's the Avengers or your team at a company.
You can use the Wordware toolbox to build these flows, but you can use anything to build agentic MCPs. We hope you follow this pattern and give your tools time to think.