LangChain Interrupt 2025: Agents at Scale with LangGraph – David Tag

It's really easy for teams to make production-ready Python services, and I think we've been pretty successful.
Over 20 teams use this framework, and I think that number will only grow from this point.
Over 30 services have been created to serve generative AI experiences on Python.
And finally, at an entire architecture level, at a system level, we invested in a new distributed
architecture to specifically support agent-to-agent modes of communication.
So let's start from the ground up, and I'll talk about why we chose Python.
So at a 10,000-foot view, these are the languages that we use at LinkedIn.
Up until, I would say, late 2022, Python played a fairly limited role here.
Python, we used mostly just for internal tooling, internal productivity tools, but also big data
applications, kind of Spark-style offline jobs.
But really, Java was used to build the vast, overwhelming majority of our business applications.
And so, come late 2022, which was really LinkedIn's first foray into generative AI, we said,
hey, we're already using Java for non-generative-AI use cases, so let's use Java for generative AI too.
Some of you might be wincing, and I can already anticipate that.
So at that time, we built some really basic apps.
They were just simple prompts with basic prompt engineering, little to no RAG, and no conversational memory.
And this was okay, and you probably predicted this part, but it was okay until it really wasn't.
What we saw was that a lot of teams wanted to experiment with Python.
They wanted to use Python for prompt engineering and for evaluations, but because of our stack,
they were forced to use Java to build their services.
So not only was that a problem, the more fundamental problem that we faced was: how do you experiment
with the open-source libraries and the open-source community if your stack is fundamentally non-Python?
As soon as there is that kind of language gap, it's really difficult for teams, honestly,
to innovate and iterate on the various techniques.
It seemed like, you know, every month, every week, there would be a new model being released,
a new library, a new prompt engineering technique, a new protocol.
And I'm sure all of you are acutely aware of this problem: it's just, frankly,
pretty hard to keep track of all the developments that are happening in the industry.
It seems like there's something new being developed all the time.
So we took a step back and made a couple of key observations.
First, you know, there's undeniable interest across the company, from teams in all different
verticals, in using generative AI.
Second, we noticed that, for generative AI specifically, this Java-plus-Python setup just wasn't working.
And finally, we realized we have to use Python no matter what,
because at the end of the day, you know, we have to be on top of the latest industry trends
and make sure we're bringing those benefits into our ecosystem.
So we said: let's use Python for the business logic, the prompt engineering, the evals, and
everything else that you would need to build a functional production application.
And we even took it the next step further, which is: let's build a framework
to make this easy for people, and specifically make a framework so that teams don't have to guess at best practices.
Instead, they can use this framework, which takes some of the guesswork out for them.
So what was the service framework that we ended up building?
At a high level, we use Python gRPC services, and we use LangChain and LangGraph to model the business logic.
A lot has been said about gRPC, so I won't go too much into detail on why we chose it.
But, at a high level, I've listed just some of the features here that we really appreciated,
namely its built-in streaming support, which is really awesome.
Binary serialization is a big performance boost, as well as the native cross-language support,
which, if you recall from a couple of slides earlier, matters because we have a ton of languages.
So cross-language support was critical.
So, here I built a sample ReAct agent using our application framework.
As you see here, there are standard utilities for tool calling, highlighted in yellow.
Standard utilities for large language model inference using our internal inference stack.
Standard utilities for conversational memory and checkpointing.
And the key thing that I'd like to note here is that we use LangChain and LangGraph to tie everything together.
And really, LangChain and LangGraph form the core of the framework that we have.
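To make that concrete, here is a minimal sketch of the kind of agent such a framework standardizes. The LangGraph prebuilt ReAct agent and checkpointer APIs are real; the `search_members` tool, the deployment name, and the wiring are illustrative assumptions, not LinkedIn's actual framework code.

```python
# Minimal sketch of a ReAct-style agent with standard utilities for tool
# calling, model inference, and checkpointed conversational memory.
from langchain_core.tools import tool
from langchain_openai import AzureChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent

@tool
def search_members(query: str) -> str:
    """Search member profiles for the given query (hypothetical tool)."""
    return f"Top profiles matching: {query}"

# Credentials are read from the standard Azure OpenAI environment variables.
model = AzureChatOpenAI(azure_deployment="gpt-4o", api_version="2024-06-01")

# The checkpointer persists conversation state between invocations.
agent = create_react_agent(model, tools=[search_members], checkpointer=MemorySaver())

result = agent.invoke(
    {"messages": [("user", "Find engineers who have worked with LangGraph")]},
    config={"configurable": {"thread_id": "conversation-1"}},
)
print(result["messages"][-1].content)
```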
And this leads to the question of the hour: why did we end up choosing LangChain
and LangGraph over the many different alternatives to build our LLM applications?
The first thing I'd like to say is that it just plainly is really easy to use, and I'm sure
everyone in this room would agree. What we saw was that even our Java engineers were able to pick it up quickly.
Like, if you look at the syntax, you can pretty easily identify what's happening.
And furthermore, through the various community integrations, like the community FAISS implementation
or the prebuilt ReAct agent from the official LangGraph repo, we were able to cut the time to build
apps from weeks down to days. If you think about the number of teams that work on generative
AI across the company, that's weeks, even months, of time that you save the company.
And the second thing that we really appreciated, and I think we've heard similar things from
the speakers today, is that the LangChain and LangGraph packages have really well-designed interfaces.
And so we were able to build our internal infrastructure against these interfaces.
For example, if we look at this chat model interface here: LinkedIn uses Azure OpenAI, but also
our own internally hosted models. So teams, if they want to switch between model providers, can do
so with a couple lines of code.
Essentially, what I'm describing allows us to build one of these boxes.
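As a sketch of that swap: both providers implement LangChain's `BaseChatModel` interface, so application code is provider-agnostic. The internally hosted model is a hypothetical assumption here, faked with LangChain's built-in test model.

```python
from langchain_core.language_models import BaseChatModel, FakeListChatModel
from langchain_openai import AzureChatOpenAI

def build_model(provider: str) -> BaseChatModel:
    # Both branches return a BaseChatModel, so callers never change.
    if provider == "azure":
        return AzureChatOpenAI(azure_deployment="gpt-4o", api_version="2024-06-01")
    # Stand-in for a client wrapping an internally hosted model (assumption).
    return FakeListChatModel(responses=["(internal model reply)"])

model = build_model("internal")  # switching providers is a one-line change
print(model.invoke("Summarize this job posting in one line.").content)
```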
But then, you might ask, how do we actually tie all these agents together?
And so, the two problems that we identified, specifically with agentic setups, were these.
One, agents can take a long time to execute. This leads into the whole ambient agent idea:
how can we model long-running, asynchronous executions?
And second, agents can execute in parallel.
And furthermore, the outputs of one agent might depend on the outputs of another agent.
So how can we make sure that things are done in the right order?
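One way to picture the ordering problem, as a sketch: in LangGraph you can encode those dependencies as graph edges, so independent agents fan out in parallel and a downstream agent only runs once both inputs exist. The node names and state keys below are hypothetical.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict, total=False):
    request: str
    candidates: str
    profiles: str
    drafts: str

def intake(state: State) -> State:
    return {"request": "senior engineer search"}

def search(state: State) -> State:      # runs in parallel with enrich
    return {"candidates": f"candidates for {state['request']}"}

def enrich(state: State) -> State:      # runs in parallel with search
    return {"profiles": f"profiles for {state['request']}"}

def outreach(state: State) -> State:    # depends on both parallel branches
    return {"drafts": f"drafts from {state['candidates']} + {state['profiles']}"}

g = StateGraph(State)
for name, fn in [("intake", intake), ("search", search),
                 ("enrich", enrich), ("outreach", outreach)]:
    g.add_node(name, fn)
g.add_edge(START, "intake")
g.add_edge("intake", "search")                 # fan out
g.add_edge("intake", "enrich")
g.add_edge(["search", "enrich"], "outreach")   # join: wait for both
g.add_edge("outreach", END)

print(g.compile().invoke({}))
```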
That leads me to the next section of my talk,
which is the new infrastructure that we built, and I'll call it our agent platform.
So the first part of our solution was to model long-running asynchronous flows.
We modeled this precisely as a messaging problem.
And the reason why we chose messaging is because LinkedIn already has a really robust messaging
service that serves countless members every day.
And so, we extended this service to also cover agent-oriented communication, mainly agent-to-agent messaging.
We even built a nice nearline flow, so that messages that fail
are automatically picked up and retried by the same system.
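Here's a toy sketch of that nearline retry pattern; the queue, the transport call, and the backoff policy are all hypothetical stand-ins for LinkedIn's actual messaging infrastructure.

```python
import queue
import time

retry_queue: "queue.Queue[dict]" = queue.Queue()

def send_to_agent(message: dict) -> None:
    # Hypothetical transport; fails transiently on the first attempt
    # so this example exercises the retry path.
    if message.get("attempts", 0) < 1:
        raise ConnectionError("transient delivery failure")
    print("delivered:", message["body"])

def deliver(message: dict) -> None:
    try:
        send_to_agent(message)
    except ConnectionError:
        message["attempts"] = message.get("attempts", 0) + 1
        retry_queue.put(message)  # handed off to the nearline flow below

def nearline_retry_loop(max_attempts: int = 3) -> None:
    # Picks up failed messages and retries them with a small backoff.
    while not retry_queue.empty():
        message = retry_queue.get()
        if message["attempts"] < max_attempts:
            time.sleep(0.1)
            deliver(message)

deliver({"body": "agent-to-agent task handoff"})
nearline_retry_loop()
```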
So then, you know, that covers how agents can talk to each other.
They can send messages to one another, but this doesn't cover how agents can actually remember anything.
So, the second part of the solution that we developed was to explicitly build memory catered to agents.
And with agentic memory, and I think you've seen things like this from the folks at the booths
outside as well, there are different forms of memory that the agent can draw on.
For us, memory is both scoped and layered.
So: working memory, long-term memory, collective memory, and these each provide different functions
that the agent can utilize to do the things that agents do.
So, for example, for a new interaction, you'll probably just fill out working memory.
But over time, as the agent has more interactions with a particular user,
more and more of that will be populated into long-term memory.
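As an illustrative sketch of scoped-and-layered memory, under stated assumptions: the class, the scoping keys, and the consolidation rule below are mine, not LinkedIn's actual design.

```python
from collections import defaultdict

class AgentMemory:
    """Toy model of layered memory: working, long-term, and collective."""

    def __init__(self):
        self.working = defaultdict(list)    # scoped per (user, conversation)
        self.long_term = defaultdict(list)  # scoped per user, across threads
        self.collective = []                # shared across users and agents

    def remember(self, user_id: str, thread_id: str, fact: str) -> None:
        self.working[(user_id, thread_id)].append(fact)

    def consolidate(self, user_id: str) -> None:
        # Over many interactions, promote working memory into long-term memory.
        for (uid, _), facts in list(self.working.items()):
            if uid == user_id:
                self.long_term[uid].extend(facts)

memory = AgentMemory()
memory.remember("u1", "t1", "prefers remote candidates")
memory.consolidate("u1")
print(memory.long_term["u1"])
```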
So, now we've covered the arrows and the boxes. But what can agents actually do?
Well, this kind of leads to what I was talking about before, where we've built this notion of skills,
which is like function calling in a lot of senses, but it deviates from that notion in a couple of different ways.
The first difference is the what: function calling is usually local to a single service, whereas skills can be
RPC calls, database queries, prompts, and, more importantly, other agents.
And the second thing that I'd like to emphasize here is the how.
Specifically, we let agents invoke skills synchronously, but also asynchronously,
which accommodates the long-running executions that we talked about.
The asynchronous part is absolutely critical.
And I'd like to think that we did a good job with the design here, because overall,
the setup is pretty similar to MCP, well before MCP was actually around.
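A small sketch of the sync-versus-async distinction: the `invoke_*` helpers below are hypothetical, with asyncio standing in for the platform's message-based handoff.

```python
import asyncio

def invoke_skill_sync(name: str, args: dict) -> dict:
    # Blocking call: fine for fast skills like an RPC or a database query.
    return {"skill": name, "result": f"ran with {args}"}

async def invoke_skill_async(name: str, args: dict) -> dict:
    # Non-blocking: the request is handed off and awaited, so the calling
    # agent stays free while a long-running skill (e.g., another agent) works.
    await asyncio.sleep(0.1)  # stands in for a long-running execution
    return {"skill": name, "result": f"ran with {args}"}

async def main() -> None:
    quick = invoke_skill_sync("lookup_profile", {"member_id": 42})
    slow = await invoke_skill_async("source_candidates", {"role": "engineer"})
    print(quick)
    print(slow)

asyncio.run(main())
```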
And finally, we took this skill concept and made it centralized.
We implemented a registry, so that teams' services could expose skills and register them in a central place.
That way, if an agent wants to access skills owned by another team,
it can discover and invoke those skills via this central registry.
Let me walk you all through a sample flow.
So, in this case, the supervisor agent tells the sourcing agent that it needs help searching for a senior-level engineer.
The sourcing agent will contact the skill registry.
The skill registry will respond with the skill that it thinks is a good fit for the task.
And finally, the sourcing agent will execute this skill.
And like I said before, this builds on the core components that we've built, too.
So, again, the emphasis is on making it really easy to develop agents.
And lastly, I could spend an entire hour on this topic, but we also built custom observability,
because agentic flows of execution require particular observability.
To wrap up here, I'd like to cover two things.
First: I think a key lesson that we've learned these past couple of years is to really invest in developer productivity.
If you want to make sure that you're following best practices,
but are also able to adapt to different conditions or changes in the product,
you need to make sure that it's really easy for developers to build.
And you can do this by standardizing patterns, making it as easy as possible for any developer to contribute.
The second thing I'd like to emphasize is that, at the end of the day, we're still building production software.
You should still consider the usual things: availability, reliability, observability.
And really, to do this, you need robust evaluations,
so you can stand behind what you're going to ship and support.