LangChain Interrupt 2025: Building Our Digital Workforce with LangGraph – Assaf Elovic

00:00:00.080 | In the Monday ecosystem, on any given task. 00:00:05.240 |
And what I'm going to show you today is very powerful lessons learned on our journey. 00:00:14.560 |
And it was said earlier today by the Harvey team and I think others that to build very successful 00:00:22.680 |
agents, you have to focus on product and user experience. 00:00:28.480 |
And we have a saying on Monday that "The biggest barrier to adoption is trust." 00:00:36.440 |
And I want to show you a few examples of things that we've learned. 00:00:41.600 |
So, when we think about autonomy, we think, you know, we're all engineers and we love to 00:00:49.400 |
think about autonomous agents and, you know, agents doing everything around the clock. But 00:00:56.960 |
we know how our users are actually using agents and what they think. 00:00:58.920 |
Imagine that every company, every user has a different risk tolerance around their data. 00:01:03.420 |
And when you build AI agents, you should give users control. 00:01:08.420 |
And what we've learned by applying this is that we've actually increased adoption, 00:01:14.420 |
by giving users the control to decide how autonomous they want their agents to be. 00:01:20.880 |
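The idea of user-controlled autonomy could be sketched as a simple policy gate. Everything here is hypothetical (the level names, the action names, the approval rule); it is not monday.com's actual API, just an illustration of letting each user pick how much an agent may do unattended.

```python
# Hypothetical sketch: gating agent actions behind a user-chosen autonomy level.
# AutonomyLevel and the action names are illustrative, not monday.com's real API.
from enum import Enum

class AutonomyLevel(Enum):
    SUGGEST_ONLY = 1      # agent proposes, human executes
    APPROVE_EACH = 2      # agent executes, but risky actions need sign-off
    FULLY_AUTONOMOUS = 3  # agent executes without asking

def requires_approval(level: AutonomyLevel, action: str) -> bool:
    """Decide whether a given action needs a human in the loop."""
    if level is AutonomyLevel.FULLY_AUTONOMOUS:
        return False
    if level is AutonomyLevel.SUGGEST_ONLY:
        return True
    # APPROVE_EACH: only destructive actions need sign-off
    return action in {"delete_item", "modify_board", "create_board"}
```

The point of the gate is that the risk threshold lives in user-visible settings, not buried in a prompt.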
Now, if you're building a startup from scratch, that's something else. 00:01:25.880 |
But at a huge company like Monday, one thing that we've learned is: don't rebuild everything anew. 00:01:32.380 |
Try to think how you can create these experiences within your existing products. 00:01:36.880 |
So, when you think about how agents can work at Monday, we already have people working at Monday. 00:01:51.800 |
Just think about how you can assign regular workers or agents to actual tasks. 00:01:56.800 |
And by doing that, our users have no new habits that they have to learn. 00:02:05.260 |
Another super important thing that we've learned. 00:02:08.260 |
So, originally, when we released these agents, imagine that you can ask in the chat and say 00:02:15.260 |
things like, create this board, create this project, modify this item. 00:02:22.260 |
So, for our users, Monday boards are production data. 00:02:28.220 |
I think a very good example I'd like to give is Cursor AI, which is an amazing product. 00:02:34.260 |
We all vibe code, as was mentioned earlier today. 00:02:38.720 |
But imagine if Cursor, instead of you as developers seeing the code, pushed it straight to production. 00:02:46.100 |
I assume that most of you, maybe all of you, would not have used it. 00:02:50.500 |
And that is just how important user experience is, because technology-wise, the capability is the same. 00:02:58.460 |
And what we saw is users onboarding and testing the agents out. 00:03:05.460 |
But once the time came to actually push content to the board, that's where they churned. 00:03:14.420 |
And adding a preview increased adoption insanely. 00:03:19.420 |
Because users now have the confidence of a guardrail: 00:03:26.420 |
they know what the output is going to be before they save it. 00:03:30.420 |
So, when you think about building AI experiences, think about previews. 00:03:36.380 |
Think about how users can have control and understanding before AI writes to production. 00:03:51.340 |
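The preview pattern described here could be sketched as stage-then-commit: the agent only ever writes to a draft, and production is touched solely on explicit approval. The `BoardDraft` class and change strings are made up for illustration.

```python
# Illustrative preview-then-commit pattern: the agent stages changes, and nothing
# touches production data until the user explicitly approves the preview.
from dataclasses import dataclass, field

@dataclass
class BoardDraft:
    """Staged changes the user can inspect before they are applied."""
    changes: list = field(default_factory=list)

    def propose(self, change: str):
        self.changes.append(change)

    def preview(self) -> str:
        return "\n".join(f"+ {c}" for c in self.changes)

def commit(draft: BoardDraft, approved: bool) -> list:
    """Apply the draft only on explicit approval; otherwise discard it."""
    return list(draft.changes) if approved else []

draft = BoardDraft()
draft.propose("create item 'Q3 launch'")
draft.propose("set status 'Working on it'")
print(draft.preview())  # the user reviews this before anything is saved
```

The key design choice is that rejection is free: discarding a draft costs nothing, which is what builds the trust to try the agent at all.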
Think about explainability as a way for your users to learn how to improve their experience. 00:04:09.300 |
Because when they have an understanding of why the outputs happen, they have the ability to improve them. 00:04:17.460 |
So, these are super important components to build into your products to earn user trust. 00:04:31.260 |
So, we actually built our entire tech system of our agents on LangGraph and LangSmith. 00:04:38.260 |
And we've tested out various frameworks and we found LangGraph to be the number one by far. 00:04:47.220 |
So, what's great about LangGraph is that it's not opinionated, but it still gives you everything out of the box. 00:04:56.220 |
Like interrupts and checkpoints, persistent memory, human in the loop. 00:05:04.220 |
Those are critical components that we don't want to build ourselves, and we get them built in. 00:05:07.220 |
On the other hand, we have super great options to customize it just for what we need. 00:05:19.140 |
And, with its native integrations, we now process millions of requests per month using LangGraph 00:05:27.140 |
So, let's take a look at how this is handled. 00:05:32.140 |
So, we have LangGraph as the center of everything we're building. 00:05:38.100 |
And, around our LangGraph engine, which also uses LangSmith for monitoring, 00:05:44.060 |
we also have what we have built as what we call AI blocks, which is basically internal AI actions 00:05:52.060 |
We've actually built our own evaluation framework, because we believe that evaluation is one of the most important aspects of shipping AI. 00:06:00.020 |
And that framework enables a lot of the evaluation you can see here. 00:06:06.020 |
And then, we also have our AI gateway, which is our way of controlling what kinds of inputs and outputs are allowed in the system. 00:06:14.020 |
Now, let's take an example of our first digital worker that we released, which is the Monday Expert. 00:06:21.980 |
So, basically, what you see here is a conversational agent using the supervisor methodology; 00:06:30.980 |
the system involves four different agents. 00:06:35.980 |
We have a supervisor with a data retrieval agent, which is in charge of retrieving all data across Monday. 00:06:42.980 |
For example, knowledge base, board data, we also use web search. 00:06:47.940 |
Then, we have our board actions agent that does actual actions on Monday. 00:06:54.940 |
And, lastly, we have the answer composer that, based on the user, the past conversations, tone of voice, 00:07:01.940 |
and all kind of other parameters that are defined by the Monday user, actually composes the final answer. 00:07:09.940 |
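The supervisor setup just described (a router dispatching to a data-retrieval agent, a board-actions agent, and an answer composer) could be sketched as below. The keyword routing is purely illustrative; in the actual system an LLM supervisor makes this decision.

```python
# Hedged sketch of the supervisor pattern described above: a router picks one
# of three specialized workers per turn. Keyword routing stands in for the LLM
# decision; the worker names mirror the talk, not a real API.
def supervisor(message: str) -> str:
    """Route a user message to the worker agent best suited to handle it."""
    text = message.lower()
    if any(w in text for w in ("create", "update", "delete", "modify")):
        return "board_actions"      # performs actual actions on boards
    if any(w in text for w in ("find", "search", "what", "show")):
        return "data_retrieval"     # knowledge base, board data, web search
    return "answer_composer"        # composes the final, toned answer
```

In the real flow the composer also runs last on every turn to shape the reply; here it doubles as the default route for simplicity.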
And, we've even added a really awesome tool that we've learned, which is called Undo. 00:07:14.900 |
So, basically, we gave the supervisor the ability to dynamically decide what to undo within the actions based on the user feedback. 00:07:23.900 |
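One way to picture the Undo capability: each executed action records its inverse, and the supervisor later replays inverses for whichever actions the user regrets. The `UndoLog` class and action names below are hypothetical, a sketch of the idea rather than monday.com's implementation.

```python
# Illustrative undo log: every executed action records an inverse function,
# and selected actions can be rolled back, newest first. Names are hypothetical.
class UndoLog:
    def __init__(self):
        self._inverses = []  # list of (action_id, inverse_fn), in execution order

    def record(self, action_id, inverse_fn):
        self._inverses.append((action_id, inverse_fn))

    def undo(self, action_ids, state):
        """Undo only the selected actions, applying inverses newest-first."""
        for action_id, inverse in reversed(self._inverses):
            if action_id in action_ids:
                state = inverse(state)
        return state
```

The dynamic part is the selection: the supervisor decides *which* `action_ids` to undo from the user's feedback, rather than blindly reverting everything.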
Which, by the way, proved to be one of the coolest features we've built. 00:07:28.900 |
And, I want to share a bit of our lessons learned as we built this agent. 00:07:39.860 |
So, when you build a conversational agent, assume that 99% of user interactions are ones you haven't explicitly planned for. 00:07:51.860 |
And, that's statistically guaranteed when you think about the infinite amount of things users can ask. 00:08:00.820 |
And, for this, we learned to start by planning for the interactions that we don't know how to handle. 00:08:11.780 |
So, for example, what we did was: if we detect that the user is asking for some action that we don't know how to handle, 00:08:19.780 |
we would search our knowledge base and give them an answer for how they can do it themselves. 00:08:23.700 |
This is an example of one way of resolving this. 00:08:28.740 |
We've talked so much today about evals; the way I think about it, your evals are your IP. 00:08:35.700 |
Because models change, technology is going to change so much over the next few years. 00:08:40.700 |
But, if you have a very strong evaluation suite, that is your IP. 00:08:45.700 |
That will allow you to move much faster than your competitors. 00:08:48.700 |
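The "evals as IP" argument implies a fixed regression suite you re-score on every model or prompt change. A minimal harness might look like this; the cases and graders are toy placeholders, and real suites would use LLM or human grading.

```python
# Minimal regression-eval harness of the kind argued for above: a fixed set of
# (input, grader) cases scored on every model or prompt change. Cases here are
# toy placeholders; real graders are richer (LLM-as-judge, exact match, etc.).
def run_evals(agent, cases):
    """Return the pass rate of `agent` over the eval cases."""
    passed = sum(1 for question, grader in cases if grader(agent(question)))
    return passed / len(cases)

cases = [
    ("2+2", lambda out: "4" in out),
    ("capital of France", lambda out: "paris" in out.lower()),
]

# Example: score a stand-in "agent" (a lookup table) as the baseline; when a
# new model lands, score it on the same cases and compare before shipping.
baseline = run_evals(lambda q: {"2+2": "4", "capital of France": "Paris"}[q], cases)
```

Because the cases are versioned alongside the product, swapping models becomes a measured decision rather than a leap of faith.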
Human in the loop is critical; we talked about this a lot at the beginning. 00:08:54.700 |
So, for those who have really shipped AI to production, I think you've seen that it's one thing to bring AI to about 80%, 00:09:06.660 |
but then it takes another year to get to 99%. 00:09:10.660 |
And, this is a very important lesson, because we felt really confident when we were working locally; once we shipped to production, 00:09:19.660 |
we realized how far we were from the actual product. 00:09:22.660 |
I see some of the audience resonate with me on that one. 00:09:26.620 |
All guardrails, we highly recommend that you build outside the LLM. 00:09:36.620 |
We've seen things like LLM-as-a-judge. 00:09:40.620 |
By the way, I think Cursor is such a great example of how to build a good product experience, 00:09:45.620 |
because when you vibe code with it, after 25 rounds it stops. 00:09:52.580 |
This is an external guardrail they put in, no matter whether the run is actually successful. 00:09:58.580 |
Just think about how you can put those guardrails outside the LLM. 00:10:03.580 |
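The iteration-cap guardrail is the simplest example of enforcement outside the model: the loop, not the LLM, decides when to stop. A sketch, with the 25-round cap mirroring the Cursor example from the talk:

```python
# External guardrail in the spirit of the Cursor example: the harness, not the
# model, enforces the stop condition. MAX_ROUNDS = 25 mirrors the talk.
MAX_ROUNDS = 25

def run_agent_loop(step, max_rounds=MAX_ROUNDS):
    """Run `step` until it reports done, or hard-stop after max_rounds."""
    for round_no in range(1, max_rounds + 1):
        if step(round_no):  # step returns True when the task is finished
            return round_no, "done"
    return max_rounds, "stopped_by_guardrail"
```

Because the cap lives in plain code, it holds even when the model misjudges its own progress, which is exactly the failure mode in-prompt guardrails miss.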
And then lastly, and this is a very interesting one, is that it might be obvious that it's smart to break your agent into sub-agents, right? 00:10:15.540 |
Obviously, when you have specialized agents, they work better. 00:10:19.500 |
But what we've learned is that there is a very important balance, because when you have too many agents, what happens is what we like to call compound hallucination. 00:10:29.500 |
So basically, it's a math problem, right? 00:10:35.460 |
90% accuracy times 90% for the second agent, times the third, times the fourth. 00:10:41.460 |
Even if they're all at 90%, we're now at around 70%. 00:10:44.460 |
It's simple math; it's proven, right? 00:10:49.460 |
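The compounding arithmetic is worth making concrete: if each hop is roughly independent, per-agent accuracies multiply, so end-to-end accuracy falls fast as agents are chained.

```python
# The compounding-error arithmetic from above: independent 90%-accurate steps
# multiply, so end-to-end accuracy drops as agents are chained.
def chained_accuracy(per_agent: float, n_agents: int) -> float:
    return per_agent ** n_agents

print(round(chained_accuracy(0.9, 3), 3))  # 0.729 — roughly the ~70% mentioned
print(round(chained_accuracy(0.9, 4), 3))  # 0.656
```

The independence assumption is pessimistic in some pipelines and optimistic in others, but the direction of the effect is what matters for choosing how many agents to chain.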
So, I think there's a very delicate balance in how many agents you want in your multi-agent system, versus having too many or too few. 00:10:59.420 |
And it's something that I think there's no, like, rule of thumb. 00:11:02.420 |
It's something you have to iterate on based on your business. 00:11:09.420 |
And we believe that the future of work, and what we're working on at Monday, is all about orchestration. 00:11:21.380 |
So, this is a real use case that we try to work on internally. 00:11:25.380 |
We just had our earnings report, just a few days ago. 00:11:29.380 |
And, for those of you working in large public companies, you're probably, or if you're involved in these reports, it's a tedious process. 00:11:40.380 |
There's so much data, narrative, and information to bring together across a company, with so many people involved. 00:11:47.380 |
So, we said, "What if we automate this? What if we had a way to automate an entire workflow that would automatically create everything we need for earnings?" 00:11:58.340 |
But, there's one problem with this, and the problem is that it will only run once a quarter. 00:12:05.340 |
We'd invest an entire month building an amazing workflow, then we run it once, and by the next time we run it, AI will have changed dramatically. 00:12:17.300 |
New models are going to come out, everything is going to change in the workflow, and then we have to rebuild everything. 00:12:22.300 |
So, that got us thinking about how we can solve this. 00:12:25.300 |
So, I want you to imagine: what if there was a finite set of agents that could do an infinite number of tasks? 00:12:35.260 |
Now, the irony is that this is not some big trick, this is exactly how we work as humans, right? 00:12:41.220 |
When you think about us, we each have our specialized skills, some are engineers, some are data analysts, 00:12:49.220 |
and then, every time there is a task at work, some of us do A and some of us do B. 00:12:55.180 |
So, there's no reason why we shouldn't work the same way with agents and AI. 00:13:01.180 |
So, when we think about the future, we think about what we see here. 00:13:08.180 |
Imagine that, for the same earnings task I showed you earlier, we had a dynamic way to orchestrate a very dynamic workflow 00:13:18.180 |
with dynamic edges and dynamic rules, choosing very specific agents that are perfect for the task at hand. 00:13:33.140 |
So, this is super exciting and one of the things that we are working on with LangChain, 00:13:38.140 |
and we really want to see this come to life in the future. 00:13:43.140 |
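The "finite agents, infinite tasks" vision could be sketched as a registry of specialized agents plus a planner that assembles a task-specific pipeline at runtime. This is speculative: the registry keys, the context shape, and the static plan below are invented; in the envisioned system an LLM would produce the plan and the edges dynamically.

```python
# Speculative sketch of the "finite agents, infinite tasks" idea: a registry of
# specialized agents, with a task-specific pipeline assembled at runtime.
# Agent names and the context dict are invented; an LLM planner would choose
# the plan in the envisioned system, not a hardcoded list.
AGENT_REGISTRY = {
    "collect_data": lambda ctx: {**ctx, "data": "metrics gathered"},
    "draft_narrative": lambda ctx: {**ctx, "draft": f"Report on {ctx['data']}"},
    "review": lambda ctx: {**ctx, "approved": True},
}

def orchestrate(plan, context):
    """Run the dynamically chosen agents in order, threading context through."""
    for agent_name in plan:
        context = AGENT_REGISTRY[agent_name](context)
    return context

# The earnings-report task becomes one plan over the shared agent pool;
# a different task would reuse the same agents in a different order.
result = orchestrate(["collect_data", "draft_narrative", "review"], {})
```

The appeal is that when models improve, you upgrade agents in one registry instead of rebuilding every bespoke workflow.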
So, lastly, we're actually opening our marketplace of agents to all of you, 00:13:49.140 |
and we'd love to see you join the waitlist and join us in building and trying to tackle this one billion tasks 00:13:57.140 |
So, thank you very much, everyone. It's a pleasure. Thank you.