Alice 2: Building and Scaling an AI Agent During HyperGrowth | 11x

Everyone, how's it going? My name is Sherwood. I'm one of the tech leads here at 11x, where I lead engineering for our Alice product, and today I'm joined by Keith, our head of growth, who is the product manager for this Alice project.

Now, 11x, for those of you who are unfamiliar, is a company that's building digital workers. We have two digital workers today: the first is Alice, our AI SDR, and the second is Julian, our AI voice agent. We've got more workers on the way.

I want to take everybody back to September 2024. For most people that's not long ago; for us, it's half the company's history. We had just crossed 10 million in ARR. We had just announced our Series A, and then our Series B 15 days later. With all this chaos going on, we relocated our whole team and company from London to San Francisco, to our beautiful new office with our beautiful new CTO. At the same time we also bought a rocket, because we're 11x. And during all this chaos, we chose this moment to rebuild our core product from the ground up. The reason we did that is that we truly felt at the time, and it has proved to be true, that agents were the future.

So in today's talk, I want to first tell you why we felt the need to rebuild Alice from scratch; hopefully everyone is already in agreement about agents being the future. Then I'll tell you how we did it: we built this enterprise-grade AI SDR in just three months. Then I want to talk you through one of the challenges that we experienced, which was finding the right agent architecture, and I'll wrap up with some reflections on building agents and some closing thoughts.

So let's start with the decision to rebuild. Why did we feel like we needed to rebuild our core product from scratch at such a critical moment? To answer that question, you first need to understand Alice 1. Alice 1 was our original AI SDR product. The main thing that you could do with Alice was create these custom AI-powered outreach campaigns, and there were
five steps involved in campaign creation. The first step is defining your audience: that's when you identify the people that you'd like to sell to. In the second step you describe your offer: this is the product or service that you're hoping to sell. Then in the third and fourth steps you construct your sequence and tweak the AI-generated messaging. And finally, when everything is to your liking, you move on to the last step, which is launching the campaign. That's when Alice begins sourcing leads that match your ICP, researching them, writing those customized emails, and in general just executing the sequence that you've built for every lead that enters the campaign.

Now, Alice 1 was a big success by a lot of different metrics, but we wouldn't really consider her a true digital worker, and that's for a lot of reasons. For one, there was a lot of button clicking, more than you would probably expect of a digital worker. You also probably saw there was a lot of manual input, especially on that offers page. Our lead research was also relatively basic: we weren't doing deep research or scraping the web or anything like that, and downstream that would lead to relatively uninspiring personalization in our emails. On top of that, Alice wasn't able to handle replies automatically; she wasn't able to answer customers' questions. And finally, there was no real self-learning component; she wasn't getting better over time.

Meanwhile, while we were building Alice 1, the industry was evolving around us. In March of 2023 we got GPT-4, we got the first Claude model, and we got the first agent frameworks. Later that year we got Claude 2, and we got function calling in the OpenAI API. Then in January of 2024 we got a more production-ready agent framework in the form of LangGraph. In March we got Claude 3, in May we got GPT-4o, and finally in September we got the Replit Agent, which for us was the first example of a truly mind-blowing agentic software product. And just
to double-click into the Replit Agent a little bit: this really blew our minds. It convinced us of two things. First, that agents were going to be really powerful; they could build entire apps from scratch. And second, that they're here today; they're ready for production.

So with that in mind, we developed a new vision for Alice centered on seven agentic capabilities. The first one was chat: we believe that users should mostly interact with Alice through chat, the way they would interact with a human team member. Second, users should be able to upload internal documents, their websites, and meeting recordings to a knowledge base, and in doing so they would train Alice. Third, we should use an AI agent for lead sourcing that actually considers the quality and fit of each lead, rather than a dumb filter search. Number four, we should do deep research on every lead, and that should lead to number five, which is true personalization in those emails. And then finally, we believe that Alice should be able to handle inbound messages automatically, answering questions and booking meetings. She should also be self-learning: she should incorporate the insights from all of the campaigns she's running to optimize the performance of your account.

So that was our vision, and with that in place we set about rebuilding Alice from scratch. In short, this was a pretty aggressive push for the company. It took us just three months from the first commit to migrating our last business customer. We initially staffed just two engineers on building the agent; after developing the POC, we brought in more resources. We had one project manager, our one and only Keith. And we had about 300 customers that needed to be migrated from our original platform to the new one, and that number was growing by the day; our go-to-market team was just really not slowing down.

There were a few key decisions that we made at the outset of this project. The first is that we wanted to start from scratch. We didn't want
Alice 2 to be encumbered by Alice 1 in any way, so: new repo, new infrastructure, new team. We also didn't want to reinvent the wheel. We were going to be taking on a lot of risk with some unfamiliar technologies, like the agent and the knowledge base, and we didn't want to add additional risk through technologies that we didn't understand, so we chose a very vanilla tech stack. And number three, we wanted to leverage vendors as much as possible to move really quickly; we didn't want to be building non-essential components.

This is the tech stack that we went with. I won't go into too much detail here, but I thought people would be interested to see it. And here are some of the vendors that we chose to leverage and work with. I can't go into detail on every one of these vendors, but they were all essential to our success, and I wanted to shout out everyone that has been useful.

Of course, one of the most important vendors we chose to work with was LangChain. We knew from the start that we were going to need a really good partner if we were going to pull this off, and LangChain was a very natural choice for us. They were a clear industry leader in AI dev tools and AI infrastructure. They had an agent framework ready to go, and that agent framework had cloud hosting and observability, so we knew we were going to be able to get to production, and that once our agent was in production we would understand how it was performing. We also had some familiarity from Alice 1; we were using the core SDK with Alice 1. LangChain also had TypeScript support, which is important to us as a TypeScript shop. And last but not least, the customer support from the LangChain team was just incredible. They really felt like an extension of our team. They ramped us up on LangGraph, on the LangChain ecosystem, and on agents in general, and we are so grateful to them for that help. In terms of the products that we use today, we use pretty much the entire suite.

And now I want to talk you through one of
the main challenges that we encountered while building Alice 2, which was finding the right agent architecture. You'll remember the main feature of Alice was campaign creation, so we wanted the Alice agent to guide users through campaign creation the same way that the Replit Agent would guide you through creating an app. We tried three different architectures for this: the first was ReAct, the second was a workflow, and finally we ended on a multi-agent system. So now I want to talk you through each of these: how it works in detail, and why it didn't work for our use case, until we arrived at multi-agent.

Let's start with ReAct. Well, React is a JavaScript framework for building user interfaces, but that's not what I mean here. I mean the ReAct model of an AI agent, which I think other people have talked about earlier today. This is a model that was invented by Google researchers back in 2022, and it stands for "reason and act." Basically, what these researchers observed is that if you include reasoning traces in the conversation context, the agent performs better than it otherwise would. With a ReAct agent, the execution loop is split into three parts: there's reasoning, where the agent thinks about what to do; there's action, where the agent actually takes action, for example performing a tool call; and then finally there's observation, where the agent observes the new state of the world after performing the action. I guess "ReActO" wasn't a very good name. But as I mentioned, reasoning traces lead to better performance in the agent.

This is our implementation of a ReAct agent. It consists of just one node and 10 or 20 tools. It's not very impressive looking, I know, but this simplicity is actually one of the main benefits of the ReAct architecture, in my opinion. So why do we have so many tools? Well, there are lots of different things that need to happen in campaign creation: we need to fetch leads from our database, we need to insert new DB entities and draft
emails, and all of those things become tools. The ReAct loop that I mentioned on the previous slide is implemented inside of the assistant node, and when the assistant actually performs an action, that is manifested in the form of a tool call, which is then executed by the tool node. One thing to note about the ReAct agent is that it runs to completion for every turn. So if the user says "hello" and then they say "I'd like to create a campaign," that would be two turns, and the ReAct agent runs to completion each time. That's going to become relevant later.

Here are some of the tools that we implemented and attached to our agent. Unfortunately Alice 2 predated MCP, so we didn't use an MCP server or any third-party tool registries. There are a few things I want to tell you about tools before we move on. The first is that tools are necessary to take action. Anytime you want your agent to do anything in the outside world, for example call an API or write a file, you're going to need a tool to do that. They're also necessary to access information beyond the context window. If you think about it, what your agent knows is limited to three things: the conversation context, the prompt, and the model weights. If you want it to know anything beyond that, you need to give it a tool, for example a web search tool, and that's essentially the concept behind RAG. Tools can also be used to call other agents; this is one of the easiest and simplest ways to get started with a multi-agent system. And last but not least, tools are preferable over skills. This is a framework I came up with. Essentially, if someone asks you to do something like perform a complex calculation, you can do that either through a tool, like a calculator, or maybe you have the skill of the mental arithmetic that's required to perform that calculation. In general it's better to use a tool than to use a skill, because this minimizes the number of tokens you're using in the
context to accomplish that goal.

What are the strengths of the ReAct architecture? Well, I mentioned one already: it is extremely simple. We basically never needed to revise our agent structure later on. It was also great at handling arbitrary user inputs over multiple turns. Because the graph runs to completion each time, the user can say something in step three that's related to step one without the agent getting confused; it's actually robust to that. So that was a great advantage.

But it had some issues. For example, the ReAct agent was kind of bad at tools. We had attached a lot of tools, and as you know, what can sometimes happen when you do that is that the agent struggles with which tool to call and in what order. This would sometimes lead to infinite loops, where the agent is repeatedly trying to accomplish some part of campaign creation but not succeeding. When those infinite loops went on for a while, we would get a recursion limit error, which is effectively the agent equivalent of a stack overflow. Also, the outputs that we were getting from this version of the agent were relatively mediocre. The audiences, the sequences, the emails: they just weren't that good. Our hypothesis here was that because there's just one agent, and really one set of prompts, responsible for the entire campaign creation process, it wasn't really good at any particular part of it.

So what can we do? How can we address these issues? In our case we chose to add structure, which led us to implementing a workflow. A workflow is defined by Anthropic as a system where LLMs and tools are orchestrated through predefined code paths. This screenshot and quote both come from an excellent blog post by Anthropic called "Building Effective Agents." I highly recommend checking it out; I shamelessly lifted it. Importantly, workflows are different from agents, and this is one of the things that the agent community has been
debating a lot on Twitter, for example. It's the reason why we have the term "agentic" for describing a system, as opposed to "agent": a system could be agentic but not necessarily an agent per se. Workflows are highly structured, as you probably inferred from that "predefined code paths" piece. The LLM is not choosing how to orchestrate the code; the LLMs are just being called within these predefined code paths. And last but not least, workflows are not really a new technology.
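To make the workflow-versus-agent distinction concrete, here is a minimal sketch in plain Python. It is not the LangGraph API and not 11x's code: `fake_llm`, the step prompts, the tool names, and `STEP_LIMIT` are all hypothetical stand-ins. In the workflow, ordinary code fixes the order of the steps and the model is only called inside each one. In the agent loop, a decision function (standing in for the model) picks what runs next, which is why the loop needs a step cap.

```python
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; returns canned text."""
    return f"<output for: {prompt}>"


def run_workflow(lead: str) -> list[str]:
    """Workflow: the code path is predefined; the LLM never picks the next step."""
    research = fake_llm(f"research {lead}")
    positioning = fake_llm(f"position the offer using {research}")
    email = fake_llm(f"write an email using {positioning}")
    return [research, positioning, email]


STEP_LIMIT = 5  # assumed cap, playing the role of a recursion limit


class StepLimitExceeded(RuntimeError):
    """Raised when the agent loop keeps going without finishing."""


def run_agent(
    choose_next: Callable[[list[str]], str],
    tools: dict[str, Callable[[], str]],
) -> list[str]:
    """Agent loop: `choose_next` (the model, in a real system) decides each step."""
    history: list[str] = []
    for _ in range(STEP_LIMIT):
        choice = choose_next(history)
        if choice == "done":
            return history
        history.append(tools[choice]())  # run the chosen tool, record the observation
    raise StepLimitExceeded(f"no result after {STEP_LIMIT} steps")
```

A decision function that keeps picking the same tool never returns "done", so `run_agent` ends in `StepLimitExceeded`, the same kind of failure as the recursion limit error described for the ReAct agent. `run_workflow` cannot loop this way, because the number of steps is fixed in code.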

We've had workflows for a long time in other forms, and the most famous form is probably the data engineering tool Airflow. And the clicker is... okay, there we go. This was our implementation of a workflow campaign creation agent. It's obviously a lot more complex than our ReAct agent. We now have 15 nodes.

They're split across five different stages, and these stages correspond to the different steps of campaign creation that I mentioned before. Interestingly, this graph, unlike the ReAct agent, doesn't run to completion for every turn. It only runs to completion once for the entire campaign creation process, and the way that we get user input or feedback at certain points within the graph execution is through something called node interrupts, which is a LangGraph feature.

There were a number of strengths to the workflow architecture. It basically solved all of the problems we observed with ReAct. For one, we no longer had issues with tools, because we just didn't have tools; we replaced them with specialized nodes, like a write-email node. We also got a clearly defined execution flow with a fixed number of steps, so no more infinite loops and no more recursion limit errors. On top of that, we got much better outputs. The emails and sequences that we were getting from this version of the agent were much better, and that's because we forced the agent to go through these particular steps.

But the workflow architecture did have issues. For one, it was extremely complex, and now our front-end campaign creation experience was coupled to the architecture of our agent. We would have to change that architecture, that graph structure, anytime we wanted to make changes to the campaign creation experience. So: super laborious and annoying. It also didn't support jumping around within the campaign creation flow. That's because the graph doesn't run to completion every time: when you get to step 3 and you stop, using a node interrupt to collect feedback on that step, you can really only respond to what's happening in step 3. You can't jump back to step 1. So clearly workflows were not going to be it for our use case.
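The pause-and-resume behavior described above, and the reason it blocks jumping back, can be sketched with a plain Python generator. This illustrates the idea only; it is not LangGraph's node-interrupt API, and the stage names, `campaign_workflow`, and `drive` are hypothetical. Each `yield` plays the role of an interrupt: the single run pauses, shows a draft, and resumes from that same point with the user's feedback.

```python
from typing import Generator, Optional

# Five stages, mirroring the five campaign-creation steps; the names are assumed.
STAGES = ["audience", "offer", "sequence", "messaging", "launch"]


def campaign_workflow() -> Generator[str, str, dict[str, str]]:
    """One run from start to finish, pausing after each stage for feedback."""
    feedback_by_stage: dict[str, str] = {}
    for stage in STAGES:
        # The "interrupt": hand a draft to the user and wait for their reply.
        reply = yield f"draft for {stage}"
        # Whatever the user says is attached to the stage we are paused on.
        feedback_by_stage[stage] = reply
    return feedback_by_stage


def drive(replies: list[str]) -> tuple[list[str], Optional[dict[str, str]]]:
    """Feed user replies into the paused run; return drafts shown and the final result."""
    run = campaign_workflow()
    shown = [next(run)]  # run until the first pause
    try:
        for reply in replies:
            shown.append(run.send(reply))  # resume, then pause at the next stage
    except StopIteration as finished:
        return shown, finished.value  # the workflow ran to completion
    return shown, None  # still paused mid-workflow
```

If the third reply is "actually, change the audience", this sketch files it under the `sequence` stage, because that is where the run is paused. Nothing in the structure lets the run go back to `audience`, which is the limitation the talk describes.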

What else can we try? Well, after some soul-searching, we came across a blog post by LangChain that explained how to build a customer support agent using a multi-agent architecture, and this is the blog post that gave us the insight we needed for our use case. A multi-agent system takes a hierarchical approach to building an AI agent. In this pattern there's one supervisor and there are many specialized sub-agents. The supervisor is responsible for interacting with the user and for routing tasks to sub-agents. The sub-agents then fulfill those tasks and escalate back to the supervisor when they're complete. We really devoured this blog post by LangChain. We went a little crazy in the process, but ultimately found a version of this that worked for our use case. And here's what that looks like: we have a multi-agent graph that accomplishes all of campaign creation except for audience creation, which we kept separate for different reasons. You can see here at the top is our supervisor node.
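The supervisor pattern described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code from the LangChain blog post or 11x's graph: the two sub-agents, the keyword-based `route` function (a stand-in for an LLM routing decision), and the reply format are all hypothetical.

```python
from typing import Callable, Optional

SubAgent = Callable[[str], str]


def researcher(task: str) -> str:
    """Hypothetical specialist: would run lead research in a real system."""
    return f"research notes for: {task}"


def email_writer(task: str) -> str:
    """Hypothetical specialist: would draft an email in a real system."""
    return f"email draft for: {task}"


SUB_AGENTS: dict[str, SubAgent] = {
    "researcher": researcher,
    "email_writer": email_writer,
}


def route(message: str) -> Optional[str]:
    """Stand-in for the supervisor's LLM call: pick a sub-agent, or None."""
    text = message.lower()
    if "research" in text:
        return "researcher"
    if "email" in text:
        return "email_writer"
    return None


def supervisor(message: str) -> str:
    """Handle one user turn: delegate if a specialist fits, else answer directly."""
    name = route(message)
    if name is None:
        return "supervisor: how can I help with your campaign?"
    result = SUB_AGENTS[name](message)  # the sub-agent fulfills the task
    return f"supervisor (via {name}): {result}"  # control returns to the supervisor
```

Because every turn passes back through `supervisor`, the user can ask for research, then an email, then research again in any order. Each specialist keeps its own narrow job, which in a real system would mean its own prompt.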

The supervisor sits close to the start of the graph, and then we have four specialist sub-agents. We have a researcher. We have something that generates a positioning report, which is how you should position your product or service for this particular lead. Then we have a LinkedIn message writer, and finally we have an email writer. This multi-agent architecture gave us the best of both worlds: we got the flexibility of the ReAct agent and the performance of the workflow.

Now I want to share a couple of reflections on building agents from this experience. The first is that simplicity is key. All of that structure and scaffolding can provide performance gains in the short term, but over the long term it locks you into a structure that can be counterproductive. Related to this is that a new model release can really change everything. Amjad from Replit told us this about the Replit Agent: he said it wasn't really working until Sonnet 3.5 came out, and then they dropped it in and everything was magic. And that's really true. It's also useful to think of your agent as a human co-worker or a team of co-workers. In our case we had different mental models.

We thought that the agent was a user flow within our product, or a directed graph, and those were the wrong mental models; they led us to implement the wrong architecture. You should also break big tasks down into small tasks. In our case the big task was campaign creation, but there were small tasks within that, like writing an email, and it became easier to implement the agent once we broke it down into the smaller component tasks. Tools are preferable over skills: don't try to make your agent too smart, just give it the right tools and tell it how to use them. And then last but not least, don't forget about prompt engineering. It's easy to forget that your agent is just a series of LLM calls within a while loop. If your agent isn't performing well, you should consider going back and doing some prompt engineering; that might unlock the performance you're looking for.

I wish we had time for a demo, but we don't. I do have this QR code, though, and I'll leave it up for a moment. If you're not able to get to it now, the slides will be available afterwards, and you can check out what we've built. Alice 2 went live in January, and the results so far have been pretty exciting.

She's now sourced close to 2 million leads (I think these numbers are a little out of date), and we've sent close to 3 million messages and generated about 21,000 replies. Over the last month or so, her reply rate has been around 2%, which is on par with a human SDR, and we're starting to see that climb as we implement self-learning and some other optimizations.

In terms of future plans, we're excited to integrate Alice and Julian, our voice agent, so that these two agents can engage leads across multiple channels, on both inbound and outbound. We're also really excited about self-learning. We've done some work here, but I wasn't able to talk more about it. And then finally, we're really excited about applications of new technologies like computer use, memory, and reinforcement learning.

Yeah, if any of this sounds exciting, and you're sick of building software and you want to build digital workers, message Sherwood, myself, or anyone at 11x. We need like 11 times as many people, like 11 times as fast. So this is a bit of a shill, but please, we need a lot of people to build the future. Cool, thanks everyone. Cheers, Harrison. Thank you.


Alice 2: Building and Scaling an AI Agent During HyperGrowth | 11x | LangChain Interrupt
