Hi, I'm Philip and I'm the CEO at Wordware. Today I want to talk to you about what sucks about chat-based interfaces, how documents can actually solve those issues, and how they lead to agents that do tasks for you in the background. So firstly, let's start with the problems with chat-based systems.
When I interact with Claude or OpenAI, it all seems very informal. I often end up creating workflows for myself using projects or just copy-pasting stuff. And when I'm having these long conversations that populate the context window, I realize that a lot of what's in the context window is just gibberish and garbage.
And so we get context pollution. I also don't get to iterate in a structured manner. If I'm working with artifacts, I basically end up making the context window dirty, without being able to change the one sentence that I really wanted to make precise.
I also sometimes lack another level of forced clarity. ChatGPT often asks me that one particular question in deep research, "Oh, what would you like to actually find?" But it never asks me in the right way when it's not certain about something, and therefore it doesn't force me toward clarity.
There are a couple more issues too: poor version control, limited reusability, model laziness. The more the context grows, the worse the responses I'm getting. Also, chat interfaces don't support logical grouping or nesting in any way. And we are interacting with a single abstraction layer.
We don't get to see and choose whether we want to specify every small detail of a particular task or just set it up in broad strokes. Hence documents. Documents are actually the original way of specifying more complex systems. The first product requirements doc probably dates to Noah's Ark, around three and a half thousand years ago.
Don't check me on that one, however. It's the first take on someone explaining a more complex system to somebody who doesn't know how to build it. So documents are actually the ultimate way for humans to communicate these more complex ideas. And in that way, we get the opportunity for forced clarity, which is great.
But the next problem with chat, and one of the biggest, is concurrency. We have a concurrency of one with all chat-based systems. We need to be sitting there. And we are getting inklings of what the future will look like when Manus or Deep Research are running in the background and actually doing things for us.
It feels great that there's something happening in the background. So now let's riff on this idea of background agents. Computers have been doing work for us in the background for quite a while. And, you know, we've created workflows which are handcrafted.
You can think of the Zapiers of this world. And then we just barely started to create specialized agents. That basically means that at some stage the workflow had an if-else statement that was somewhat fuzzy, and it made one or two decisions. And as we can see on this diagram, when the importance of some workflow is high and the occurrence is high, we actually end up using handcrafted workflows.
We're only now entering an era where general agents are starting to make some decisions. But whenever the importance is higher, we don't let those general agents enter our lives. So how can we remedy this? We remedy this by introducing a human in the loop.
So essentially now, with the human in the loop, the agent can do a bunch of work and we get to approve it, reject it, change its output, or even fix its logic entirely. These agents normally react to some kind of implicit or explicit user intent or trigger.
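To make the approve, reject, or change loop concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Proposal` type, the `Decision` values, and the `review` function are hypothetical names, not part of any product mentioned in the talk, and the "fix its logic entirely" option is left out because it changes the agent rather than a single result.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    """What the human reviewer chose to do with an agent's proposal."""
    APPROVE = "approve"
    REJECT = "reject"
    EDIT = "edit"


@dataclass
class Proposal:
    """Work an agent has prepared but not yet acted on (hypothetical type)."""
    agent: str
    action: str
    output: str


def review(proposal: Proposal, decision: Decision,
           edited_output: Optional[str] = None) -> Optional[str]:
    """Return the output that may be executed, or None if nothing should run."""
    if decision is Decision.APPROVE:
        return proposal.output
    if decision is Decision.EDIT:
        if edited_output is None:
            raise ValueError("an EDIT decision needs the edited output")
        return edited_output
    return None  # REJECT: the proposal is dropped


# Hypothetical proposal from a background agent.
draft = Proposal(agent="crm-updater", action="update_crm",
                 output="Met investors; stage: diligence")

print(review(draft, Decision.APPROVE))
print(review(draft, Decision.EDIT, "Met investors; stage: term sheet"))
print(review(draft, Decision.REJECT))
```

The point of the design is that the agent never acts directly: it only produces a proposal, and only what comes back from `review` is allowed to run.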
So you can think of these background agents as being activated by a sent email, a sent Slack message, or maybe your meeting. That could be an implicit trigger: you had a meeting with a named party, say investors, and that could prompt you to update your CRM. And with a bunch of these ambient agents, you end up having to create protocols for how humans and AIs communicate among themselves.
And that basically means that humans can control their agents and their outputs, but also that different agents can communicate with each other to learn about sources of data. One agent could be communicating with a more general agent about what is in your Notion.
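One way to picture triggers is as a small registry that maps event types to agents. The sketch below is only an illustration under stated assumptions: the event names (`meeting.ended`, `email.sent`), the `on` decorator, and the `dispatch` function are hypothetical, and a real system would receive these events from email, Slack, or calendar integrations.

```python
from typing import Callable, Dict, List

# A handler takes the event payload and returns a proposed piece of work.
Handler = Callable[[dict], str]

# Hypothetical registry: event type -> agents that react to it.
_handlers: Dict[str, List[Handler]] = {}


def on(event_type: str) -> Callable[[Handler], Handler]:
    """Register a background agent for a given (assumed) event type."""
    def register(handler: Handler) -> Handler:
        _handlers.setdefault(event_type, []).append(handler)
        return handler
    return register


@on("meeting.ended")
def propose_crm_update(event: dict) -> str:
    # Implicit trigger: nobody asked for this, the meeting itself prompts it.
    return f"Propose CRM update for {event['party']}"


def dispatch(event_type: str, event: dict) -> List[str]:
    """Run every agent registered for this event and collect their proposals."""
    return [handler(event) for handler in _handlers.get(event_type, [])]


print(dispatch("meeting.ended", {"party": "investors"}))
print(dispatch("email.sent", {"to": "someone"}))  # no agent registered yet
```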
And in that way we'll probably start with the prosumer first, where a bunch of agents are working in the background. But very soon that starts to be about the organization, and we'll start having organizations which have their own agents and also external agents. And in that way, we might even get agents which manage other humans, which sounds ridiculous at first, but actually, you know, that could be just an agent which creates Jira tickets for all of your engineers.
And in that way we basically create the graph of the enterprise of the future. When I think about these background agents first working for the prosumer, maybe managing your emails and just making sure that you have more time for yourself, I think this idea naturally represents a bottom-up movement into enterprises, which will be slower moving, trying to make sure that all of the agentic tools these agents need to use are verified, have the right authority and the right permissions, and don't mess things up.
So as I think about the future of the agent economy, I think we need to adopt a stochastic mindset, because we essentially need to lead with leverage over uncertainty. If something that you don't fully understand closes your clients and delivers business value 99.9% of the time, we're not going to care that we don't understand what's happening the other 0.1% of the time.
We're just going to make sure that the impact of that 0.1% is not catastrophic. I also think humans will manage a bunch of agents, and that's why taste and intent are so important. You will need to imbue your own personal brand onto agents and take responsibility for their actions.
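The 99.9% versus 0.1% argument is really a piece of arithmetic, so a small worked example may help. All the numbers below (runs per month, cost of an unchecked failure, the cap) are made up for illustration, and the function names are hypothetical; only the 0.1% failure rate comes from the talk.

```python
def expected_failures(runs: int, failure_rate: float) -> float:
    """Average number of bad runs to expect out of `runs` attempts."""
    return runs * failure_rate


def loss_per_failure(cost_if_unchecked: float, cap: float) -> float:
    """Loss from one bad run when the agent's authority is capped."""
    return min(cost_if_unchecked, cap)


runs = 10_000          # hypothetical: agent runs per month
failure_rate = 0.001   # the 0.1% of the time we don't understand

failures = expected_failures(runs, failure_rate)
uncapped = failures * loss_per_failure(50_000.0, cap=float("inf"))
capped = failures * loss_per_failure(50_000.0, cap=500.0)

print(f"expected bad runs: {failures:.0f}")
print(f"expected loss without a cap: {uncapped:,.0f}")
print(f"expected loss with a cap:    {capped:,.0f}")
```

The failure rate is the same in both cases. What changes is how much a single failure is allowed to cost, which is the part we can control without understanding the model.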
We'll also need a lot more communication protocols between humans and AI, and also between agents. MCP is the first protocol that sets this up, but I think it lacks information about what the constraints of a particular agent are, what authority it needs in order to act, whether it needs approval from a human in the loop, etc.
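As a rough sketch of the kind of information being asked for here, an agent could carry a small manifest stating what it is allowed to do and which actions need a human's sign-off. To be clear, this is not part of MCP or any existing protocol: `AgentManifest`, its fields, and `authorize` are hypothetical names used only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class AgentManifest:
    """Hypothetical declaration of an agent's authority (not an MCP type)."""
    name: str
    allowed_actions: FrozenSet[str]
    needs_approval: FrozenSet[str] = frozenset()


def authorize(manifest: AgentManifest, action: str) -> str:
    """Decide whether an action may run, must wait for a human, or is denied."""
    if action not in manifest.allowed_actions:
        return "deny"
    if action in manifest.needs_approval:
        return "ask_human"
    return "allow"


email_agent = AgentManifest(
    name="email-assistant",
    allowed_actions=frozenset({"draft_reply", "send_reply"}),
    needs_approval=frozenset({"send_reply"}),
)

print(authorize(email_agent, "draft_reply"))   # within its authority
print(authorize(email_agent, "send_reply"))    # allowed, but a human approves
print(authorize(email_agent, "delete_inbox"))  # outside its authority
```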
Right now, when I think about humans managing agents, we only see this properly in coding. And in coding, we basically see that good engineers, people who are good both at IC work and at managing a team of interns, are really, really able to take advantage of the AI revolution.
A lot of excellent IC engineers end up saying, "Oh, I don't want to use AI. It's actually not as good as people are saying." And it probably isn't for them, but their bar for code might be too high. They might want to have everything optimized just right.
And in this way, this is the first time engineers are managing a swarm of agents, and they need to be good at managing in order to actually extract leverage and benefit for their organization. So just to wrap up: with the concurrency of one of chat-based systems and the pollution you get from playing with them, chat is almost like brainstorming.
But after brainstorming, you need to sit down and create a proper document to explain what an agent should be doing. This is only needed for repeatable processes. Once you have a repeatable process that you trust and you think will be very useful, you can hook it up to a trigger.
That can be a cron job, a Gmail trigger, or an implicit trigger. That agent is then able to act in the background. Latency matters a little bit less there, and the agent only surfaces issues to you once it is struggling with something or needs your approval.
Therefore, your work is mostly around creating these assignments for the agents, making sure that your taste is imbued there and then approving the results of the work, making sure you trust it more and more and more as you keep going. In that way, we are creating swarms of agents which are working in the background.
And our main job is to swipe left and right as if it's Tinder and approve and edit the results of the work of the agents. I think from there, the prosumer market is going to adopt this much more widely and we're going to see it slowly entering the enterprise market.
And I'm super excited about enterprises creating the most incredible tools, which are going to be agentic and used by the newest state-of-the-art models, but the tools are going to be demoed. So, a very clear progression for the future. Let's see if it's true.
Thank you so much. I'm Philip, the co-founder and CEO of Wordware. And at Wordware, we actually enable these background agents to work. Come build yours.