Alice 2: Building and Scaling an AI Agent During HyperGrowth | 11x | LangChain Interrupt

Hi everyone, how's it going? My name is Sherwood, I'm one of the tech leads here at 11x. I lead engineering for our Alice products, and today I'm joined by Keith, our head of growth, who is the product manager for this Alice project. Now 11x, for those of you who are unfamiliar, is a company that's building digital workers. We have two digital workers today: the first is Alice, she's our AI SDR, and the second is Julian, he's an AI voice agent, and we've got more workers on the way.

I want to take everybody back to September 2024. For most people that's not long ago; for us, you know, it's half the company's history. We had just crossed $10 million ARR, we had just announced our Series A, and then our Series B was announced 15 days later. With all this chaos going on, we relocated our whole team and company from London to San Francisco, to our beautiful new office with our beautiful new CTO, and, you know, at the same time we also bought a rocket, because we're 11x. And during all this chaos we chose this moment to rebuild our core product from the ground up. The reason we did that is because we truly felt at the time, and it proved to be true, that agents were the future.

So in today's talk I want to first tell you why we felt the need to rebuild Alice from scratch (hopefully everyone is probably in agreement about agents being the future). Then I'll tell you how we did it, how we built this enterprise-grade AI SDR in just three months. Then I want to talk you through one of the challenges that we experienced, which was finding the right agent architecture, and I'll wrap up with some reflections on building agents and some closing thoughts.

So let's start with the decision to rebuild.
Why did we feel like we needed to rebuild our core product from scratch at such a critical moment? Well, to understand that question you first need to understand Alice 1. Alice 1 was our original AI SDR product, and the main thing you could do with Alice was create these custom AI-powered outreach campaigns. There were five steps involved in campaign creation. The first step is defining your audience; that's when you identify the people that you'd like to sell to. In the second step you describe your offer; this is the product or service that you're hoping to sell. Then in the third and fourth steps you construct your sequence and tweak the AI-generated messaging. And finally, when everything is to your liking, you move on to the last step, which is launching the campaign. That's when Alice will begin sourcing leads that match your ICP, researching them, writing those customized emails, and in general just executing the sequence that you've built for every lead that enters the campaign.
Now Alice 1 was a big success by a lot of different metrics, but we wouldn't really consider her a true digital worker, and that's for a lot of reasons. For one, there was a lot of button clicking, more than you would probably expect of a digital worker, and there was also a lot of manual input, especially on that offers page. Our lead research was also relatively basic; we weren't doing deep research or scraping the web or anything like that, and downstream that would lead to relatively uninspiring personalization in our emails. On top of that, Alice wasn't able to handle replies automatically; she wasn't able to answer customers' questions. And finally, there was no real self-learning component: she wasn't getting better over time.
Meanwhile, while we were building Alice 1, the industry was evolving around us. In March of 2023 we got GPT-4, we got the first Claude model, and we got the first agent frameworks. Later that year we got Claude 2 and we got function calling in the OpenAI API. Then in January of 2024 we got a more production-ready agent framework in the form of LangGraph. In March we got Claude 3, in May we got GPT-4o, and finally in September we got the Replit Agent, which for us was the first example of a truly mind-blowing agentic software product. And just to double-click into the Replit Agent a little bit: it really blew our minds. It convinced us of two things. First, that agents were going to be really powerful; they could build entire apps from scratch. And second, that they're here today; they're ready for production.
So with that in mind, we developed a new vision for Alice centered on seven agentic capabilities. The first one was chat: we believe that users should mostly interact with Alice through chat, the way they would interact with a human team member. Secondly, users should be able to upload internal documents, their websites, and meeting recordings to a knowledge base, and in doing so they would train Alice. Third, we should use an AI agent for lead sourcing that actually considers the quality and fit of each lead, rather than a dumb filter search. Number four, we should do deep research on every lead, and that should lead to number five, which is true personalization in those emails. Then finally, we believe that Alice should be able to handle inbound messages automatically, answering questions and booking meetings, and also that she should be self-learning: she should incorporate the insights from all of the campaigns she's running to optimize the performance of your account. So that was our vision.
And with that in place, we set about rebuilding Alice from scratch. In short, this was a pretty aggressive push for the company. It took us just three months from the first commit to migrating our last business customer. We initially staffed just two engineers on building the agent; after developing the POC we brought in more resources. We had one project manager, our one and only Keith, and we had about 300 customers that needed to be migrated from our original platform to the new one, a number that was growing by the day; our go-to-market team was just really not slowing down.
There were a few key decisions that we made at the outset of this project. The first is that we wanted to start from scratch: we didn't want Alice 2 to be encumbered by Alice 1 in any way, so new repo, new infrastructure, new team. We also didn't want to reinvent the wheel. We were going to be taking on a lot of risk with some unfamiliar technologies, like the agent and the knowledge base, and we didn't want to add additional risk through technologies that we didn't understand, so we chose a very vanilla tech stack. And number three, we wanted to leverage vendors as much as possible to move really quickly; we didn't want to be building non-essential components. This is the tech stack that we went with; I won't go into too much detail here, but I thought people would be interested to see it. And here are some of the vendors that we chose to leverage and work with. I can't go into detail on every one of these vendors, but they were all essential to our success, and I wanted to shout out everyone that has been helpful.
everyone out that that has been useful of course one of the most important 00:06:51.800 |
vendors we chose to work with was Langchain and we knew that we were going to 00:06:55.360 |
need a really good partner from the start if we're gonna pull this off Langchain was a 00:06:58.780 |
very natural choice for us they were a clear industry leader in AI dev tools 00:07:02.800 |
and AI infrastructure they had an agent framework ready to go that agent framework 00:07:06.940 |
had cloud hosting and observability so we knew we were going to be able to get 00:07:10.060 |
product get to production and that once our agent was in production we would 00:07:13.400 |
understand how it's performing we also had some familiarity from Alice one we were 00:07:17.440 |
using the the core SDK with Alice one and then Langchain also had TypeScript 00:07:22.360 |
support which is important to us as a TypeScript shop and last but not least the 00:07:26.460 |
customer support from the Langchain team was just incredible they really felt 00:07:30.060 |
like an extension of our team they ramped us up on Lang graph and the Langchain 00:07:33.060 |
ecosystem and on agents in general and we are so grateful to them for that help in 00:07:38.840 |
terms of the products that we use today we use pretty much the entire suite and now 00:07:44.560 |
Now I want to talk you through one of the main challenges that we encountered while building Alice 2, which was finding the right agent architecture. You'll remember the main feature of Alice was campaign creation, so we wanted the Alice agent to guide users through campaign creation the same way the Replit Agent would guide you through creating an app. We tried three different architectures for this: the first was ReAct, the second was a workflow, and finally we ended up on a multi-agent system. So now I want to talk you through each of these, how it works in detail, and why it didn't work for our use case, until we arrived at multi-agent.
Let's start with ReAct. Well, React is a JavaScript framework for building user interfaces, but that's not what I mean here. I mean the ReAct model of an AI agent, which I think other people have talked about earlier today. This is a model that was introduced by Google researchers back in 2022, and it stands for Reason and Act. Basically, what these researchers observed is that if you include reasoning traces in the conversation context, the agent performs better than it otherwise would. With a ReAct agent, the execution loop is split into three parts: there's reason, where the agent thinks about what to do; there's act, where the agent actually takes an action, for example performing a tool call; and then finally there's observe, where the agent observes the new state of the world after performing the action (I guess "ReAct-O" wouldn't have been a very good name). But as I mentioned, reasoning traces lead to better performance from the agent.
This is our implementation of a ReAct agent. It consists of just one node and 10 or 20 tools. It's not very impressive looking, I know, but this simplicity is actually one of the main benefits of the ReAct architecture, in my opinion. Why do we have so many tools? Well, there are lots of different things that need to happen in campaign creation: we need to fetch leads from our database, we need to insert new DB entities and draft emails, and all of those things become a tool. The ReAct loop that I mentioned on the previous slide is implemented inside of the assistant node, and when the assistant actually performs an action, that is manifested in the form of a tool call, which is then executed by the tool node. One thing to note about the ReAct agent is that it runs to completion for every turn: if the user says hello and then says "I'd like to create a campaign," that's two turns, and the ReAct agent runs to completion each time. That's going to become relevant later. Here are some of the tools that we implemented and attached to our agent. Unfortunately, Alice 2 predated MCP, so we didn't use an MCP server or any third-party tool registries.
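As a rough illustration of what a single-node ReAct setup looks like (not our production code), here's a minimal TypeScript sketch using LangGraph's prebuilt createReactAgent helper; the fetch_leads tool, its schema, and the model choice are hypothetical stand-ins:

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { tool } from "@langchain/core/tools";
import { createReactAgent } from "@langchain/langgraph/prebuilt";
import { z } from "zod";

// Hypothetical campaign-creation tool; real tools would wrap our own APIs.
const fetchLeads = tool(
  async ({ icp, limit }) => {
    // ...query the lead database here...
    return JSON.stringify([{ name: "Ada Lovelace", title: "VP Engineering" }]);
  },
  {
    name: "fetch_leads",
    description: "Fetch leads from the database that match an ideal customer profile.",
    schema: z.object({ icp: z.string(), limit: z.number().default(10) }),
  }
);

// One assistant node plus a tool node, wired up by the prebuilt helper.
const agent = createReactAgent({
  llm: new ChatOpenAI({ model: "gpt-4o" }),
  tools: [fetchLeads /* , draftEmail, insertCampaign, ... */],
});

// Each user turn runs the ReAct loop to completion.
const result = await agent.invoke({
  messages: [{ role: "user", content: "Create a campaign targeting VPs of Engineering." }],
});
```

The prebuilt helper wires up essentially the two nodes I described: an assistant node that runs the reason-act loop and a tool node that executes whatever tool calls it emits.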
A few things I want to tell you about tools before we move on. The first is that tools are necessary to take action: any time you want your agent to do anything in the outside world, for example call an API or write a file, you're going to need a tool to do that. They're also necessary to access information beyond the context window. If you think about it, what your agent knows is limited to three things: the conversation context, the prompt, and the model weights. If you want it to know anything beyond that, you need to give it a tool, for example a web search tool, and that's essentially the concept behind RAG. Tools can also be used to call other agents; this is one of the easiest and simplest ways to get started with a multi-agent system (see the sketch below). And last but not least, tools are preferable over skills. This is a framework I came up with: if someone asked you to do something like perform a complex calculation, you could do that either through a tool, like a calculator, or maybe you have the skill of mental arithmetic required to perform that calculation. In general, it's better to use a tool than a skill, because this minimizes the number of tokens you're using in the context to accomplish that goal.
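On that "tools can call other agents" point, here's a hedged sketch of the pattern: a sub-agent wrapped behind an ordinary tool interface. The research_lead tool and the sub-agent are hypothetical; the shape is what matters, because the parent agent only ever sees a tool call and a string result.

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { tool } from "@langchain/core/tools";
import { createReactAgent } from "@langchain/langgraph/prebuilt";
import { z } from "zod";

// A hypothetical research sub-agent (its own tools are omitted for brevity).
const researchAgent = createReactAgent({
  llm: new ChatOpenAI({ model: "gpt-4o" }),
  tools: [],
});

// Expose the sub-agent to a parent agent as just another tool.
const researchLead = tool(
  async ({ leadName }) => {
    const out = await researchAgent.invoke({
      messages: [{ role: "user", content: `Research this lead: ${leadName}` }],
    });
    // Return only the final answer so the parent's context stays small.
    return out.messages[out.messages.length - 1].content as string;
  },
  {
    name: "research_lead",
    description: "Run the research sub-agent on a single lead and return its findings.",
    schema: z.object({ leadName: z.string() }),
  }
);
```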
What are the strengths of the ReAct architecture? Well, I mentioned one already: it is extremely simple. We basically never needed to revise our agent structure later on. It was also great at handling arbitrary user inputs over multiple turns. Because the graph runs to completion each time, the user can say something in step three that's related to step one without the agent getting confused; it's actually robust to that, and that was a great advantage. But it had some issues. For example, the ReAct agent was kind of bad at tools. We had attached a lot of tools, and as you know, what can sometimes happen when you do that is the agent struggles with which tool to call and in what order. This would sometimes lead to infinite loops, where the agent is repeatedly trying to accomplish some part of campaign creation but not succeeding. And when those infinite loops went on for a while, we would get a recursion limit error, which is effectively the agent equivalent of a stack overflow (see the snippet below).
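For what it's worth, LangGraph lets you cap and catch that failure mode. A sketch like the following (assuming the agent graph from the earlier example) bounds the loop for a run and handles the recursion error rather than letting it bubble up:

```typescript
import { GraphRecursionError } from "@langchain/langgraph";

// `agent` is the ReAct graph from the earlier sketch.
try {
  await agent.invoke(
    { messages: [{ role: "user", content: "Build me a campaign." }] },
    { recursionLimit: 25 } // cap the number of reason-act iterations for this run
  );
} catch (err) {
  if (err instanceof GraphRecursionError) {
    // The agent looped without finishing; surface a friendly failure instead of crashing.
    console.error("Agent hit the recursion limit before completing the task.");
  } else {
    throw err;
  }
}
```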
Also, the outputs that we were getting from this version of the agent were relatively mediocre. The audiences, the sequences, the emails: they just weren't that good. Our hypothesis was that because there's just one agent, and really one set of prompts, responsible for the entire campaign creation process, it wasn't really good at any particular part of it.
So what can we do? How can we address these issues? In our case, we chose to add structure, which led us to implementing a workflow. A workflow is defined by Anthropic as a system where LLMs and tools are orchestrated through predefined code paths. The screenshot and quote here both come from an excellent blog post by Anthropic called "Building Effective Agents"; I highly recommend checking it out. Importantly, workflows are different from agents, and this is one of the things the agent community has been debating a lot, on Twitter for example. It's the reason we sometimes describe a system as "agentic" as opposed to calling it an agent: a system could be agentic but not necessarily an agent per se. Workflows are highly structured, as you probably inferred from that "predefined code paths" piece. The LLM is not choosing how to orchestrate the code; the LLMs are just being called within these predefined code paths. And last but not least, workflows are not really a new technology. We've had them for a long time in other forms, and the most famous form is probably the data engineering tool Airflow.
This was our implementation of a workflow campaign-creation agent. It's obviously a lot more complex than our ReAct agent. We now have 15 nodes, split across five different stages, and these stages correspond to the different steps of campaign creation that I mentioned before. Interestingly, this graph, unlike the ReAct agent, doesn't run to completion for every turn. It only runs to completion once for the entire campaign creation process, and the way we get user input or feedback at certain points within the graph execution is through something called node interrupts, which is a LangGraph feature.
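To sketch the shape of this (heavily simplified, with made-up state fields and node names rather than our real 15-node graph), a predefined-path workflow that pauses for user feedback via LangGraph's interrupt mechanism looks roughly like this:

```typescript
import { StateGraph, Annotation, MemorySaver, interrupt, START, END } from "@langchain/langgraph";

// Simplified workflow state (hypothetical fields, not our real schema).
const CampaignState = Annotation.Root({
  audience: Annotation<string>(),
  offer: Annotation<string>(),
  emailDraft: Annotation<string>(),
});

// Each step is a plain node; the LLM calls live inside these node functions.
const defineAudience = async (_state: typeof CampaignState.State) => {
  // ...call an LLM to propose an audience here...
  return { audience: "VPs of Engineering at Series B startups" };
};

const reviewAudience = async (state: typeof CampaignState.State) => {
  // Pause the graph and wait for the user's feedback on this step.
  const feedback = interrupt({ proposedAudience: state.audience });
  return { audience: typeof feedback === "string" ? feedback : state.audience };
};

const writeEmail = async (state: typeof CampaignState.State) => {
  // ...call an LLM to draft an email for this audience and offer...
  return { emailDraft: `Hi, quick note about ${state.offer} for ${state.audience}...` };
};

const workflow = new StateGraph(CampaignState)
  .addNode("defineAudience", defineAudience)
  .addNode("reviewAudience", reviewAudience)
  .addNode("writeEmail", writeEmail)
  .addEdge(START, "defineAudience")
  .addEdge("defineAudience", "reviewAudience")
  .addEdge("reviewAudience", "writeEmail")
  .addEdge("writeEmail", END)
  .compile({ checkpointer: new MemorySaver() }); // interrupts require a checkpointer
```

Execution stops at the interrupt; resuming is done by invoking the graph again on the same thread with a Command carrying the resume value, which is why the compile step needs a checkpointer.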
There were a number of strengths to the workflow architecture. It basically solved all of the problems we observed with ReAct. For one, we no longer had issues with tools, because we just didn't have tools; we replaced them with specialized nodes, like a write-email node. We also had a clearly defined execution flow with a fixed number of steps, so no more infinite loops and no more recursion limit errors. On top of that, we got much better outputs: the emails and sequences we were getting from this version of the agent were much better, and that's because you force the agent to go through these particular steps. But the workflow architecture did have issues. For one, it was extremely complex, and our front-end campaign creation experience was now coupled to the architecture of our agent; we would have to change that graph structure any time we wanted to change the campaign creation experience. It also didn't support jumping around within the campaign creation flow, and that's because the graph doesn't run to completion every time. When you get to step 3 and you stop, using a node interrupt to collect feedback on that step, you can really only respond to what's happening in step 3; you can't jump back to step 1.
So clearly workflows were not going to be it for our use case. What else could we try? Well, after some soul-searching, we came across a blog post by LangChain that explained how to build a customer support agent using a multi-agent architecture, and this is the blog post that gave us the insight we needed for our use case. A multi-agent system is a hierarchical approach to building an AI agent. In this pattern there is one supervisor and there are many specialized sub-agents. The supervisor is responsible for interacting with the user and for routing tasks to sub-agents; the sub-agents then fulfill those tasks and escalate back to the supervisor when they're complete. We really devoured this blog post by LangChain. We went a little crazy in the process, but ultimately found a version of this that worked for our use case. Here's what that looks like: we have a multi-agent graph that accomplishes all of campaign creation except for audience creation. You can see at the top is our supervisor node, close to the start of the graph, and then we have four specialist sub-agents. We have a researcher; we have something that generates a positioning report, which is how you should position your product or service for this particular lead; then we have a LinkedIn message writer; and finally we have an email writer.
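As a rough sketch of the supervisor pattern (with only two stubbed specialists and made-up prompts, not our actual four sub-agents), the wiring looks something like this:

```typescript
import { StateGraph, Annotation, MessagesAnnotation, START, END } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import { AIMessage, HumanMessage } from "@langchain/core/messages";
import { z } from "zod";

const llm = new ChatOpenAI({ model: "gpt-4o" });

// Conversation state plus a routing field that the supervisor writes to.
const SupervisorState = Annotation.Root({
  ...MessagesAnnotation.spec,
  next: Annotation<string>(),
});

// The supervisor looks at the conversation and picks the next specialist, or finishes.
const supervisor = async (state: typeof SupervisorState.State) => {
  const decision = await llm
    .withStructuredOutput(z.object({ next: z.enum(["researcher", "email_writer", "FINISH"]) }))
    .invoke([
      { role: "system", content: "Route campaign-creation work to researcher, email_writer, or FINISH." },
      ...state.messages,
    ]);
  return { next: decision.next };
};

// Specialist sub-agents, stubbed here; in practice each is its own agent or chain.
const researcher = async (_state: typeof SupervisorState.State) => ({
  messages: [new AIMessage({ content: "Research notes on the lead...", name: "researcher" })],
});
const emailWriter = async (_state: typeof SupervisorState.State) => ({
  messages: [new AIMessage({ content: "Drafted a personalized email.", name: "email_writer" })],
});

const graph = new StateGraph(SupervisorState)
  .addNode("supervisor", supervisor)
  .addNode("researcher", researcher)
  .addNode("email_writer", emailWriter)
  .addEdge(START, "supervisor")
  // The supervisor routes; each specialist reports back to the supervisor when done.
  .addConditionalEdges("supervisor", (s) => (s.next === "FINISH" ? END : s.next))
  .addEdge("researcher", "supervisor")
  .addEdge("email_writer", "supervisor")
  .compile();

// Usage: await graph.invoke({ messages: [new HumanMessage("Create a campaign for fintech CTOs.")] });
```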
And this multi-agent architecture gave us the best of both worlds: we got the flexibility of the ReAct agent and the performance of the workflow.

Now I want to share a couple of reflections on building agents from this experience. The first is that simplicity is key. All of that structure and scaffolding can provide performance gains in the short term, but over the long term it locks you into a structure that can be counterproductive. Related to this: a new model release can really change everything. Amjad from Replit told us this about the Replit Agent; he said it wasn't really working until Sonnet 3.5 came out, and then they dropped it in and everything was magic, and that's really true. It's also useful to think of your agent as a human co-worker or a team of co-workers. In our case we had different mental models: we thought of the agent as a user flow within our product, or as a directed graph, and those were the wrong mental models, and they led us to implement the wrong architecture. You should also break big tasks down into small tasks. In our case the big task was campaign creation, but there were small tasks, like writing an email, within that, and it became easier to implement the agent once we broke it down into those smaller component tasks. Tools are preferable over skills: don't try to make your agent too smart; just give it the right tools and tell it how to use them. And last but not least, don't forget about prompt engineering. It's easy to forget that your agent is just a series of LLM calls within a while loop (see the sketch below). If your agent isn't performing well, consider going back and doing some prompt engineering; that might unlock the performance you're looking for.
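Just to underline that "LLM calls in a while loop" point, here's what the skeleton of an agent really is, stripped of frameworks; the tool execution is stubbed and the details are illustrative only:

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { AIMessageChunk, BaseMessage, HumanMessage, ToolMessage } from "@langchain/core/messages";

// A bare-bones agent: an LLM call inside a while loop, executing tool calls
// until the model stops asking for them.
const model = new ChatOpenAI({ model: "gpt-4o" }).bindTools([/* your tools here */]);

async function runAgent(messages: BaseMessage[]): Promise<BaseMessage[]> {
  while (true) {
    const response = (await model.invoke(messages)) as AIMessageChunk;
    messages.push(response);
    if (!response.tool_calls?.length) return messages; // final answer, no more work
    for (const call of response.tool_calls) {
      // Look up and execute the matching tool here; stubbed for the sketch.
      const output = `(result of ${call.name})`;
      messages.push(new ToolMessage({ content: output, tool_call_id: call.id ?? "" }));
    }
  }
}

// Usage: await runAgent([new HumanMessage("Draft an outreach email.")]);
```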
I wish we had time for a demo, but I don't, so I do have this QR code. I'll leave it up for a moment; if you're not able to get to it now, the slides will be available afterwards. Alice 2 went live in January, and the results have been pretty exciting. She's now sourced close to 2 million leads (I think these numbers are a little out of date), we've sent close to 3 million messages, and we've generated about 21,000 replies. Over the last month or so, her reply rate is around 2%, which is on par with a human SDR, and we're starting to see that climb as we implement self-learning and some other optimizations. In terms of future plans, we're excited to integrate Alice and Julian, our voice agent, so that these two agents can engage leads across multiple channels, on both inbound and outbound. We're also really excited about self-learning; we've done some work here, but I wasn't able to talk more about it. And finally, we're really excited about applications of new technologies like computer use, memory, and reinforcement learning.

Yeah, if any of this sounds exciting and you're sick of building software and want to build digital workers, message myself or, you know, anyone at 11x. We need like 11 times as many people, like 11 times as fast. So this is a bit of a pitch, but please, we need a lot of people to, you know, build the future.