
LangChain Interrupt 2025 Agents at Scale with LangGraph – David Tag


Whisper Transcript

00:00:00.480 | It's really easy for teams to make production-ready Python services, and I think we've been pretty
00:00:06.400 | successful so far.
00:00:08.200 | Over 20 teams use this framework, and that number is only growing at this point.
00:00:14.200 | Over 30 services have been created to support generative AI experiences, all in Python.
00:00:19.200 | And finally, at an entire architecture level, at a system level, we invested in new distributed
00:00:26.600 | architecture specifically to support agentic modes of communication.
00:00:31.000 | So let's start from the ground up, and I'll talk about why we chose Python.
00:00:36.000 | So at a 10,000-foot view, these are the languages that we use at LinkedIn.
00:00:41.200 | Up until, I would say, late 2022, Python played a supporting role here.
00:00:47.600 | We used Python mostly just for internal tooling, internal productivity tools, but also big data
00:00:53.600 | applications, kind of offline batch jobs.
00:00:57.000 | But really, Java was used to build the vast, overwhelming majority of our business applications.
00:01:03.000 | And so, come late 2022, which was really LinkedIn's first foray into generative AI, we saw
00:01:10.000 | that, hey, we're already using Java for non-generative-AI use cases.
00:01:16.000 | Let's just use Java for generative AI also.
00:01:16.000 | Some of you might be wincing, and I can already anticipate that.
00:01:21.400 | So at that time, we built some really basic apps.
00:01:24.400 | They were just simple prompts with basic prompt engineering, little to no RAG, no conversational
00:01:30.400 | memory to speak of.
00:01:32.400 | And this was okay, and you probably predicted this part, but it was okay until it really wasn't.
00:01:39.400 | What we saw was that a lot of teams wanted to experiment with Python.
00:01:45.800 | They wanted to use Python for prompt engineering and for evaluations, but because of our stack,
00:01:50.800 | they were forced to use Java to build their services.
00:01:54.800 | So not only was that a problem, the more fundamental problem that we faced was how do you experiment
00:02:02.200 | with the open source libraries, the open source community, if your stack is fundamentally non-Python.
00:02:08.200 | And as soon as there is that kind of language gap, it's really difficult for teams, honestly,
00:02:14.200 | to innovate and iterate on the various techniques.
00:02:18.200 | It seemed like, you know, every month, every week, there would be a new model being released,
00:02:23.600 | a new library, a new prompt engineering technique, a new protocol.
00:02:27.600 | And I'm sure all of you are acutely aware of this problem, that it's just, frankly,
00:02:32.600 | pretty hard to keep track of all the developments that are happening in the industry.
00:02:36.600 | It seems like there's something new being developed all the time.
00:02:39.600 | So we took a step back and made a couple of key observations.
00:02:44.000 | First, you know, there's undeniable interest across the company from teams across different
00:02:50.000 | verticals that want to use generative AI.
00:02:53.000 | Second, we noticed that, for generative AI specifically, this Java-Python setup
00:02:58.000 | really wasn't working out.
00:03:00.000 | And finally, we realized we have to use Python no matter what,
00:03:06.000 | because at the end of the day, you know, we have to be on top of the latest industry trends
00:03:10.400 | and make sure that we're bringing those benefits into our ecosystem.
00:03:16.400 | And so, we said, hey, let's make a bold bet.
00:03:19.400 | Let's use Python for the business logic, the prompt engineering, the evals,
00:03:23.000 | everything else that you would need to build a functional production application.
00:03:27.000 | And then we took it one step further, which is, let's build a framework
00:03:31.800 | to make it easy for people, and specifically make a framework so that teams don't have to guess
00:03:38.200 | about what's the right way to do things.
00:03:40.200 | Instead, they can use this framework, which takes some of the guesswork out for them.
00:03:44.200 | So, let me talk about exactly that.
00:03:47.200 | What was the service framework that we ended up building?
00:03:50.200 | At a high level, we use Python and gRPC, and we use LangChain and LangGraph to model the business logic.
00:03:57.200 | There have been whole talks about gRPC, so I won't go too much into detail on why we chose gRPC.
00:04:02.600 | But, at a high level, I listed just some of the features here that we really appreciated,
00:04:07.200 | namely, its built-in streaming support, which is really awesome.
00:04:10.600 | Binary serialization is a big performance boost, as well as the native cross-language support,
00:04:15.600 | which, if you recall from a couple of slides earlier, matters because we use a ton of languages.
00:04:21.000 | So cross-language support was critical.
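To make the streaming point concrete, here is a minimal sketch of a server-streaming gRPC endpoint in Python. The service name, method name, and token source are hypothetical; a real service would use protoc-generated stubs rather than the generic handler used here for brevity.

```python
# Sketch of gRPC server-side streaming in Python. "agent.AgentService" and
# "StreamResponse" are hypothetical names; real services would use
# protoc-generated stubs instead of the generic handler shown here.
from concurrent import futures

import grpc

def stream_response(request: bytes, context: grpc.ServicerContext):
    # Stream the response back token by token instead of as one big payload.
    for token in [b"Hello", b", ", b"world"]:
        yield token

handler = grpc.method_handlers_generic_handler(
    "agent.AgentService",  # hypothetical service name
    {"StreamResponse": grpc.unary_stream_rpc_method_handler(stream_response)},
)

server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
server.add_generic_rpc_handlers((handler,))
server.add_insecure_port("[::]:50051")
server.start()
server.wait_for_termination()
```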
00:04:24.000 | So, here I built a sample reactive agent using our application framework.
00:04:30.000 | So, as you see here, there's standard utilities for tool calling, highlighted in kind of a yellow
00:04:35.500 | there.
00:04:36.500 | Standard utilities for large language model inference using our internal inference stack.
00:04:41.900 | Standard utilities for conversational memory and checkpointing.
00:04:45.900 | And the key thing that I'd like to note here is that we use LangChain and LangGraph to tie
00:04:49.900 | all of this stuff together.
00:04:51.900 | And really, LangChain and LangGraph form the core of the framework that we have.
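As a rough sketch of what such an agent looks like with the public LangGraph prebuilt (LinkedIn's internal utilities aren't public, so the model, the tool, and the in-memory checkpointer below are stand-ins):

```python
# Sketch of a reactive agent with tool calling and checkpointed conversational
# memory, using LangGraph's public prebuilt. ChatOpenAI and the lookup tool
# stand in for LinkedIn's internal inference stack and utilities.
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent

@tool
def lookup_member(name: str) -> str:
    """Look up a member profile (stubbed for illustration)."""
    return f"{name}: staff engineer, 10 years of experience."

agent = create_react_agent(
    ChatOpenAI(model="gpt-4o-mini"),   # stand-in model provider
    [lookup_member],                   # tool-calling utilities
    checkpointer=MemorySaver(),        # conversational memory / checkpointing
)

# The thread_id scopes the checkpointed conversation across invocations.
config = {"configurable": {"thread_id": "demo-thread"}}
result = agent.invoke({"messages": [("user", "Who is Jane Doe?")]}, config)
print(result["messages"][-1].content)
```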
00:04:56.900 | And this leads to the question of the hour: why did we end up choosing LangChain
00:05:01.300 | and LangGraph over the many different alternatives to build our large language model applications?
00:05:08.300 | The first thing I'd like to say is that it just plainly is really easy to use, and I'm sure
00:05:12.300 | everyone in this room would agree. What we saw was that even Java engineers were able
00:05:17.300 | to pick it up really easily.
00:05:18.300 | Like, if you look at the syntax, you can pretty easily identify what's happening.
00:05:22.300 | And furthermore, through various community integrations, like the community FAISS implementation
00:05:31.600 | or the React agent prebuilt from the LangGraph official repo, we were able to build functional
00:05:36.700 | apps in days instead of weeks, which, if you think about the number of teams that work on generative
00:05:41.700 | AI across the company, is, you know, weeks, months of time that you save across
00:05:45.700 | the company.
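For instance, the community FAISS integration mentioned here wires up in a few lines; this is a sketch, and the documents and query are made up for illustration:

```python
# Sketch of the community FAISS vector store integration; the documents and
# query here are invented for illustration.
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

store = FAISS.from_texts(
    ["LangGraph models agents as graphs.",
     "gRPC has built-in support for streaming responses."],
    OpenAIEmbeddings(),
)
docs = store.similarity_search("how do agents stream responses?", k=1)
print(docs[0].page_content)
```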
00:05:46.700 | And the second thing that we really appreciated, and I think we've heard similar things from
00:05:51.700 | the speakers today, is that the LangChain and LangGraph packages have really
00:05:57.100 | sensible interfaces.
00:05:58.100 | And so we're able to build our internal infrastructure using these interfaces.
00:06:03.000 | For example, if we look at this chat model interface here, LinkedIn uses Azure OpenAI, but also
00:06:09.500 | uses on-premise language models as well.
00:06:11.500 | So teams, if they want to switch between model providers, they can do so with a couple lines
00:06:16.000 | of code, and that's that.
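A sketch of what that swap looks like against LangChain's chat model interface; the deployment name and on-prem endpoint URL below are hypothetical:

```python
# Because application code is written against LangChain's chat model
# interface, swapping providers is a couple of lines. The deployment name
# and on-prem endpoint URL are hypothetical.
from langchain_openai import AzureChatOpenAI, ChatOpenAI

llm = AzureChatOpenAI(azure_deployment="gpt-4o", api_version="2024-06-01")

# ...or point the same code at an on-premise OpenAI-compatible server:
llm = ChatOpenAI(model="my-onprem-model", base_url="http://llm.internal:8000/v1")

print(llm.invoke("Summarize this profile in one sentence.").content)
```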
00:06:18.000 | So let's zoom out on this diagram here.
00:06:21.400 | Essentially, what I'm describing allows us to build one of these boxes.
00:06:24.400 | But then, you might ask, how do we actually tie all these agents together?
00:06:28.400 | And so, the two problems that we identified, specifically with agentic setups, were that, one, agents can
00:06:36.800 | take a lot of time to process data.
00:06:38.800 | This leads into the whole ambient agent idea: how can we model long-running asynchronous
00:06:43.800 | flows within our infrastructure stack?
00:06:45.800 | And second, agents can execute in parallel.
00:06:48.800 | And furthermore, the outputs of one agent might depend on the outputs of another agent.
00:06:54.200 | So how can we make sure that things are done in the right order?
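Within a single service, LangGraph already expresses this dependency ordering: fan two nodes out in parallel and fan them back in so a downstream node only runs once both finish. A toy sketch, with node names and payloads made up:

```python
# Toy LangGraph sketch of the ordering problem: two agents run in parallel,
# and a third only runs once both of their outputs are available.
from typing import TypedDict

from langgraph.graph import END, START, StateGraph

class State(TypedDict, total=False):
    profile: str
    jobs: str
    summary: str

def profile_agent(state: State):
    return {"profile": "candidate profile data"}

def jobs_agent(state: State):
    return {"jobs": "matching job postings"}

def summary_agent(state: State):
    # Depends on the outputs of both upstream agents.
    return {"summary": f"{state['profile']} + {state['jobs']}"}

g = StateGraph(State)
g.add_node("profile_agent", profile_agent)
g.add_node("jobs_agent", jobs_agent)
g.add_node("summary_agent", summary_agent)
g.add_edge(START, "profile_agent")            # fan out: run in parallel
g.add_edge(START, "jobs_agent")
g.add_edge("profile_agent", "summary_agent")  # fan in: wait for both
g.add_edge("jobs_agent", "summary_agent")
g.add_edge("summary_agent", END)

print(g.compile().invoke({}))
```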
00:06:57.200 | This leads me to the next section of my talk,
00:07:01.200 | which is the new infrastructure that we built, and I'll call it our agent platform.
00:07:07.200 | So the first part of our solution was to model long-running asynchronous flows.
00:07:12.600 | We modeled this precisely as a messaging problem.
00:07:15.600 | And the reason why we chose messaging is because LinkedIn already has a really robust messaging
00:07:20.600 | service that serves countless members every day.
00:07:23.600 | And so, we extended this to also include agent-focused communication, mainly agent-to-agent messaging,
00:07:30.000 | but also user-to-agent messaging as well.
00:07:32.000 | We even built a nearline flow, so that messages that fail
00:07:38.000 | are automatically picked up and retried by the system.
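Purely as an illustration of the shape of that design (this is not LinkedIn's messaging service), a failed delivery can be parked on a retry queue that a nearline job drains:

```python
# Illustrative sketch only (not LinkedIn's messaging service): agent-to-agent
# messages whose delivery fails are parked on a retry queue and picked up
# later by a nearline job.
import dataclasses
import queue

@dataclasses.dataclass
class Message:
    sender: str     # e.g. "agent:supervisor" or "user:12345"
    recipient: str  # e.g. "agent:sourcing"
    body: str
    attempts: int = 0

retry_queue: "queue.Queue[Message]" = queue.Queue()

def send(msg: Message, transport) -> None:
    try:
        transport(msg)                 # online delivery attempt
    except ConnectionError:
        msg.attempts += 1
        if msg.attempts <= 3:
            retry_queue.put(msg)       # nearline job retries it later

def nearline_retry(transport) -> None:
    # Drain the retry queue; messages that fail again are re-enqueued by send().
    while not retry_queue.empty():
        send(retry_queue.get(), transport)
```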
00:07:42.000 | So then, you know, that covers how agents can talk to each other.
00:07:47.000 | They can send messages to one another, but this doesn't cover how agents can actually make
00:07:52.400 | sure that tasks are done in the right order.
00:07:54.800 | So, the second part of this solution that we developed was to explicitly build memory catered
00:08:00.200 | for agents.
00:08:02.200 | And so, with agentic memory, I think you've probably seen things like this from folks
00:08:08.600 | outside this room as well: there are kind of different forms of memory that the agent
00:08:13.400 | should be able to access.
00:08:15.400 | For us, memory is both scoped and layered.
00:08:17.400 | So, working memory, long-term memory, collective memory, and these each provide different functions
00:08:22.200 | that the agent can utilize to do its work.
00:08:25.600 | So, for example, for a new interaction, you'll probably just fill out working memory.
00:08:30.000 | But over time, as the agent has more interactions with a particular user or something like that,
00:08:36.600 | more of that gets populated into long-term memory.
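One way to picture that layering; the class, scopes, and promotion rule here are assumptions based on the talk, not LinkedIn's actual implementation:

```python
# Sketch of scoped, layered agent memory; the layering and promotion rule are
# assumptions based on the talk, not LinkedIn's actual implementation.
class AgentMemory:
    def __init__(self):
        self.working = {}     # scoped to the current interaction
        self.long_term = {}   # persists across interactions with a user
        self.collective = {}  # shared across agents

    def recall(self, key: str):
        # Check the narrowest scope first, then fall back to broader layers.
        for layer in (self.working, self.long_term, self.collective):
            if key in layer:
                return layer[key]
        return None

    def promote(self, key: str) -> None:
        # Over repeated interactions, working-memory facts graduate
        # into long-term memory.
        if key in self.working:
            self.long_term[key] = self.working[key]
```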
00:08:39.600 | So, now we've covered the arrows and the boxes.
00:08:45.000 | So, how do agents actually do things?
00:08:47.000 | Well, this kind of leads to what I was talking about before, where we've built this notion of skills,
00:08:52.000 | which is like function calling in a lot of senses, but deviates from that notion in a couple of different ways.
00:08:59.000 | So, the first thing is that function calling is usually local to your service.
00:09:02.400 | But here, skills can be really anything:
00:09:04.400 | RPC calls, database queries, prompts, and more importantly, other agents.
00:09:10.400 | And the second thing that I'd like to emphasize here is the how.
00:09:13.400 | Specifically, we let agents invoke skills synchronously, but also asynchronously,
00:09:19.800 | which, given the long-running flows that we talked about,
00:09:22.800 | means the asynchronous part is absolutely critical.
00:09:25.800 | And I'd like to think that we did a good job with the design here, because overall,
00:09:29.800 | the setup is pretty similar to MCP, well before MCP actually came out.
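A minimal sketch of that dual invocation surface; the Skill class is hypothetical, patterned on the description above:

```python
# Hypothetical Skill wrapper sketching the sync/async invocation surface
# described above; real skills would wrap RPCs, queries, prompts, or agents.
import asyncio
from typing import Any, Callable

class Skill:
    def __init__(self, name: str, fn: Callable[..., Any]):
        self.name, self.fn = name, fn

    def invoke(self, *args: Any) -> Any:
        # Synchronous call, for quick skills.
        return self.fn(*args)

    async def ainvoke(self, *args: Any) -> Any:
        # Asynchronous call, for long-running skills.
        return await asyncio.to_thread(self.fn, *args)

search = Skill("search_candidates", lambda q: f"candidates for {q!r}")
print(search.invoke("staff engineer"))
print(asyncio.run(search.ainvoke("staff engineer")))
```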
00:09:33.800 | And finally, we took this skill concept and made it centralized.
00:09:39.800 | We implemented a registry, so that teams' services could expose skills and register them in a central place,
00:09:47.800 | so that if an agent wants to access skills owned by another team,
00:09:51.800 | it can discover and invoke these skills via this central registry.
00:09:55.800 | Let me walk you all through a sample flow.
00:09:59.200 | So, in this case, the supervisor agent tells the sourcing agent that it needs help searching for a staff-level engineer.
00:10:05.600 | The sourcing agent will contact the skill registry.
00:10:08.800 | The skill registry will respond with the skill that it thinks is a good fit for the task.
00:10:13.800 | And finally, the sourcing agent will execute this skill.
00:10:16.800 | And like I said before, this links to the core components that we've built, too.
00:10:21.800 | So, again, the emphasis remains on making it really easy to develop agents.
00:10:25.800 | And lastly, I could spend an entire hour on this topic, but we also built custom observability,
00:10:30.800 | because agentic modes of execution require particular observability.
00:10:36.800 | To wrap up here, I'd like to cover two things.
00:10:40.800 | First, I think a key lesson that we've discovered these past couple of years is to really invest in developer productivity.
00:10:47.800 | This space is moving really fast.
00:10:49.800 | And if you want to make sure that you're kind of following best practices,
00:10:53.800 | but also able to adapt to different conditions or changes in the product,
00:10:57.800 | you need to make sure that it's really easy for developers to build.
00:11:00.800 | And so you can do this by standardizing patterns, making sure it's as easy for any developer to contribute as possible.
00:11:07.800 | The second thing I'd like to emphasize is, at the end of the day, we're still building production software.
00:11:11.800 | You should still consider the usual things, availability, reliability, but also, again,
00:11:16.800 | observability is paramount.
00:11:18.800 | You can't fix what you can't observe.
00:11:20.800 | And really, to do this, you need robust evaluations,
00:11:23.800 | and things that people have talked about already.
00:11:25.800 | You should know what you're going to support with them.
00:11:28.800 | So that's all.
00:11:29.800 | I'll end it there. Thank you.
00:11:30.800 | Thank you, David.