LangChain Interrupt 2025 Uber Agentic Developer Products with LangGraph – Sourabh Shirhatti

00:00:00.000 |
The most important thing for us is always learning; that's why we're here, to see what everyone else is up to and see what else we can adopt. 00:00:12.000 |
And another pillar of our strategy is that we don't want to keep rebuilding the same things; we call them cross-cutting primitives. 00:00:19.000 |
There's a lot of common functionality that underlies all of these solutions, and having the right abstractions in place, the right frameworks and tooling, helps us build more solutions and build them faster. 00:00:36.000 |
And lastly, what I was going to say is, probably the cornerstone of this strategy is what we call intentional platformization. 00:00:42.000 |
We've taken a bet on a few product areas; we want to build them, and we want to build them as fast as possible, but we do stop and get deliberate about, hey, what here is reusable? 00:00:52.000 |
What can be spun out into something that provides value for the next problem we want to solve? 00:00:58.000 |
And so, LangFX is the integration framework we built on top of LangGraph and LangChain that makes them work better with Uber's systems. 00:01:10.000 |
We had the first couple of products emerge, and they all wanted this. 00:01:14.000 |
The teams behind them expanded on it; they wanted to build new agentic systems, and LangGraph was the right thing to do it with. 00:01:21.000 |
Because we saw this proliferating through the organization, we made it available, and we built a framework around it. 00:01:28.000 |
So, you know, I think that's enough of the overview; let's dive into the products. 00:01:35.000 |
So, the first product showcase is Validator. 00:01:38.000 |
Now, what it is, is an IDE-integrated experience that flags best-practice violations and security issues for engineers, in code, automatically. 00:01:47.000 |
So, it is effectively a LangGraph agent that plugs into the IDE. 00:01:53.000 |
And, you know, let's take a look at how it works. 00:01:55.000 |
So, we have a demo here that shows a user opening a file in the IDE. 00:02:01.000 |
And what happens is, they're notified of a violation, in this case. 00:02:05.000 |
So, they have a little squiggle that they can mouse over. 00:02:08.000 |
And they get a nice tooltip saying, "Hey, in this case, you're using the incorrect method to create a temporary test file. 00:02:16.000 |
You know, this will leak without cleanup, and you want to have it automatically cleaned up for you." 00:02:25.000 |
They can apply a pre-computed fix that was generated for them in the background. 00:02:30.000 |
Or, if they choose so, they can ship the fix off to their IDE's agentic system. 00:02:34.000 |
So, that's what we're showing on the next slide, actually. 00:02:37.000 |
The fix prompt gets shipped out, and we loop back with the fix from the IDE. 00:02:41.000 |
So, the issue is no longer present, and the user can see that the issue is resolved. 00:02:49.000 |
Here are some of the key learnings we found while building this. 00:02:53.000 |
The main thing is that the agent abstraction allows us to compose multiple sub-agents under a central validator, for example. 00:03:01.000 |
So, we have, you know, a best-practices sub-agent under Validator that calls into a list of curated practices and gets those pieces of feedback. 00:03:13.000 |
But there's also an interesting fit where, for example, we want to discover lint issues deterministically. 00:03:19.000 |
There's nothing stopping us from just running a lint tool and then passing its results on to the rest of the graph. 00:03:28.000 |
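A minimal sketch of what that composition could look like in LangGraph. The state shape, node names, and stubbed helpers here are illustrative assumptions, not Uber's actual code:

```python
# Hypothetical sketch: a central validator composing a deterministic lint
# node with an LLM-backed best-practices sub-agent (all names illustrative).
import operator
from typing import Annotated, TypedDict

from langgraph.graph import StateGraph, START, END

class ValidatorState(TypedDict):
    source: str                                   # file under review
    findings: Annotated[list[str], operator.add]  # merged across sub-agents

def run_linter(state: ValidatorState) -> dict:
    # Deterministic node: a real implementation would shell out to a linter.
    issues = ["temp file is never cleaned up"] if "TempFile" in state["source"] else []
    return {"findings": [f"lint: {i}" for i in issues]}

def check_best_practices(state: ValidatorState) -> dict:
    # LLM sub-agent: a real implementation would call a model with the
    # curated best-practice list; stubbed here for brevity.
    return {"findings": ["prefer an API that cleans up temp files automatically"]}

builder = StateGraph(ValidatorState)
builder.add_node("linter", run_linter)
builder.add_node("best_practices", check_best_practices)
builder.add_node("report", lambda state: {})  # deliver findings to the IDE
builder.add_edge(START, "linter")             # both branches run in parallel
builder.add_edge(START, "best_practices")
builder.add_edge("linter", "report")
builder.add_edge("best_practices", "report")
builder.add_edge("report", END)
validator = builder.compile()

print(validator.invoke({"source": "f, _ := ioutil.TempFile(...)", "findings": []}))
```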
And, in terms of impact, you know, we've seen thousands of fix interactions to date from engineers who fix problems in their code before those problems come back to find them later. 00:03:39.000 |
And, I think, you know, we've built a compelling experience today. 00:03:43.000 |
Like, we met developers where they are, in the IDE. 00:03:48.000 |
We combined, you know, agentic capabilities with deterministic tooling. 00:03:56.000 |
We're able to evaluate issues against a set of curated best practices and flag violations. 00:04:02.000 |
And, importantly, deliver this back to the user in the most expressive way. 00:04:17.000 |
Next, let's help engineers by writing their tests from the get-go. 00:04:20.000 |
Now, you know, the second thing we're showing you up here is called AutoCover. 00:04:24.000 |
And it is a tool to help engineers build, or generate, rather, 00:04:27.000 |
building, passing, coverage-raising, business-case-testing, 00:04:32.000 |
implementation-validating tests. 00:04:35.000 |
So, like, really high-quality tests is what we're going for here. 00:04:37.000 |
And, the intent is to save the engineer time. 00:04:40.000 |
And, you want to get there quickly and move on to the next business feature that you want to ship. 00:04:44.000 |
So, the way we go about doing this is, we actually composed a bunch of domain expert agents. 00:04:50.000 |
We actually threw Validator in there as well; more on that later. 00:04:58.000 |
We have a screenshot of, you know, a source file, as an example. 00:05:02.000 |
And, the user can, you know, invoke it in a lot of other ways. 00:05:06.000 |
If they want coverage for the whole file and want to bulk-generate, they can do a right-click, 00:05:09.000 |
as shown in the screenshot, and just invoke it. 00:05:14.000 |
What happens next is that a whole bunch of stuff kicks off in the background. 00:05:17.000 |
So, we start with adding a new test target to the build system. 00:05:22.000 |
We run an initial coverage check to get a sort of target space for us to operate on. 00:05:27.000 |
All while that is being done, we also analyze the surrounding source to pull the business cases out. 00:05:36.000 |
And what the user sees, really, is just that they get switched to an empty test file. 00:05:41.000 |
And then, because we did all that stuff in the background, we're starting to already generate tests. 00:05:46.000 |
And what the user will see is a stream of tests coming in. 00:05:52.000 |
There will be tests coming in at a fast speed. 00:05:59.000 |
Some tests might get removed because they're redundant. 00:06:02.000 |
You might see benchmark or, like, concurrency tests come in later. 00:06:06.000 |
And so, you know, the user is sort of watching this experience. 00:06:10.000 |
And then, at the end, they're left with a nice set of validated, passing tests. 00:06:20.000 |
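As a rough illustration of that experience, here is what the consuming side of a streaming graph can look like; `autocover` is assumed to be a compiled LangGraph graph with a node named "generate", and the input shape and editor hook are invented for the example:

```python
# Hypothetical sketch: stream node outputs as they land so tests appear
# in the editor incrementally rather than after the whole run finishes.
def render_tests_in_editor(tests: list[str]) -> None:
    for t in tests:
        print("new test:", t)  # stub for the real editor integration

for update in autocover.stream(
    {"source_file": "service.go"},  # illustrative input shape
    stream_mode="updates",          # yield each node's output when it finishes
):
    for node_name, output in update.items():
        if node_name == "generate":
            render_tests_in_editor(output["tests"])
```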
Let's dive a bit deeper into the graph here to see how it actually functions. 00:06:25.000 |
Now, on the bottom right, you can actually see Validator there, which is the same agent that we showed you earlier. 00:06:31.000 |
So, you can already see some of the composability learnings that we found useful. 00:06:41.000 |
We look at the sort of heuristics that an engineer would use while writing tests. 00:06:46.000 |
And, so for example, you want to prepare a test environment. 00:06:49.000 |
You want to think about which business cases to test, 00:06:56.000 |
whether that be extending existing tests or just writing new tests altogether. 00:07:01.000 |
And then you want to run your builds, your tests. 00:07:03.000 |
And then if, you know, those are passing, you want to run a coverage check to see what's still missing. 00:07:09.000 |
And so, we go on to, you know, complete the graph this way. 00:07:13.000 |
And then, because we no longer have anyone involved, we can actually supercharge the graph. 00:07:17.000 |
Sort of juice it up so that we can do a hundred iterations of code generation at the same time. 00:07:22.000 |
And another hundred executions at the same time. 00:07:24.000 |
We've seen, you know, for a sufficiently large source file, you can do that. 00:07:28.000 |
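A minimal sketch of that kind of fan-out using LangGraph's Send API; the state shape and the stubbed generation step are illustrative assumptions, not Uber's actual graph:

```python
# Hypothetical sketch: with no human in the loop, one planning node can
# dispatch many generate-and-run iterations that execute concurrently.
import operator
from typing import Annotated, TypedDict

from langgraph.graph import StateGraph, START, END
from langgraph.types import Send

class CoverState(TypedDict):
    cases: list[str]                           # business cases to cover
    tests: Annotated[list[str], operator.add]  # results merged as they land

def plan(state: CoverState) -> dict:
    return {}  # a real planner would derive the cases from source analysis

def fan_out(state: CoverState):
    # One Send per case; LangGraph runs these branches in parallel.
    return [Send("generate_and_run", {"case": c}) for c in state["cases"]]

def generate_and_run(payload: dict) -> dict:
    # Stub: a real node would generate a test, build it, and execute it.
    return {"tests": [f"Test_{payload['case']}"]}

builder = StateGraph(CoverState)
builder.add_node("plan", plan)
builder.add_node("generate_and_run", generate_and_run)
builder.add_edge(START, "plan")
builder.add_conditional_edges("plan", fan_out, ["generate_and_run"])
builder.add_edge("generate_and_run", END)
autocover_fanout = builder.compile()

print(autocover_fanout.invoke({"cases": ["happy_path", "timeout", "retry"], "tests": []}))
```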
And that's sort of where our key learning comes in. 00:07:30.000 |
We found that having these super capable domain expert agents gives us unparalleled performance. 00:07:37.000 |
Sort of exceptional performance compared to other agentic coding tools. 00:07:40.000 |
We benchmarked against the industry agentic coding tools that are available for test generation. 00:07:45.000 |
And we get about two to three times more coverage in about half the time compared to them, 00:07:51.000 |
because of the speed-ups that we built in when creating this graph, 00:07:55.000 |
and sort of the custom, bespoke knowledge that we built into our agents. 00:07:59.000 |
And in terms of impact, this tool has helped raise coverage across the whole developer platform. 00:08:08.000 |
So that maps to about 21,000 developer hours saved, which we're super happy about. 00:08:11.000 |
And we're seeing continued use, with thousands of tests generated monthly. 00:08:18.000 |
Sourabh, take us through some more products. 00:08:21.000 |
Yeah, so we didn't want to stop there, because, I mean, we built these primitives, right? 00:08:24.000 |
We're going to give you a sneak peek of what else we've been able to do when we build with these. 00:08:28.000 |
So what you see on the screen right now is our Uber assistant builder. 00:08:31.000 |
Think of it like our internal custom GPT store where you can build chatbots that are, you know, steeped in Uber knowledge. 00:08:38.000 |
So, like, one of them you see on the screen is the security scorecard bot. 00:08:41.000 |
And it has access to some of the same tools that we showed you. 00:08:45.000 |
You know, it's aware of Uber's best practices. 00:08:50.000 |
So even before I get to the point where I'm writing code in my IDE, I can ask questions about architecture, 00:08:55.000 |
figure out whether my implementation is secure or not, right? 00:08:58.000 |
Same primitives powering a different experience. 00:09:03.000 |
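As a rough sketch of how the same tooling can power a chat experience, here is a hypothetical assistant built with LangGraph's prebuilt ReAct agent (recent versions); the tool body, model choice, and prompt are assumptions for illustration, not Uber's code:

```python
# Hypothetical sketch: expose the shared best-practices tooling to a chatbot.
from langchain.chat_models import init_chat_model
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent

@tool
def check_best_practices(code: str) -> str:
    """Evaluate a code snippet against curated security best practices."""
    return "no violations found"  # stub for the shared Validator tooling

assistant = create_react_agent(
    init_chat_model("openai:gpt-4o"),  # any tool-calling chat model
    tools=[check_best_practices],
    prompt="You are a security advisor grounded in internal best practices.",
)
reply = assistant.invoke(
    {"messages": [("user", "Is it safe to log request bodies here?")]}
)
```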
Picasso is our internal workflow management platform. 00:09:06.000 |
And we built a conversational AI experience on top of it as well. 00:09:14.000 |
And it can give you feedback grounded in the product; it's, like, aware of what the product does. 00:09:20.000 |
The other thing I want to show you, and this is not an exhaustive list, right, is our code review experience. 00:09:29.000 |
We try and flag the violations earlier in the process. 00:09:35.000 |
You know, why not reinforce the same best practices before, you know, 00:09:39.000 |
your code gets landed, before your PR gets merged? 00:09:42.000 |
So, again, powered by some of the same tools that you saw earlier, like Validator, 00:09:48.000 |
we're able to flag, you know, both review comments and suggestions that developers can apply. 00:09:56.000 |
I think with that, we'll just jump over to the learnings. 00:10:00.000 |
So, in terms of the learnings, we already sort of talked about this, 00:10:04.000 |
but we found that building super-capable domain expert agents 00:10:07.000 |
is actually the way to go to get outsized results. 00:10:16.000 |
And then, you know, the resulting output is much better. 00:10:19.000 |
So, an example that I already talked about is the execution agent. 00:10:23.000 |
So, we're able to connect to our build system to allow us to, on the same source file, execute 00:10:28.000 |
hundreds of tests against the same test file, without conflicts, and then also get separate coverage data back. 00:10:33.000 |
That's an example of a domain expert agent that's super capable and gives us that performance edge. 00:10:37.000 |
Secondly, we found that, when possible, composing agents with deterministic sub-agents, or just 00:10:44.000 |
having the whole agent be deterministic, makes a lot of sense. 00:10:46.000 |
If you can solve the problem in a deterministic way, do it that way. 00:10:48.000 |
So, you know, one example of that is the lint agent under Validator. 00:10:55.000 |
And if we have deterministic tools that can give us that intelligence, we can 00:11:00.000 |
rely on that output and pass the learnings on to the rest of the graph. 00:11:06.000 |
And then, third, we found that we can scale up our AI efforts quite a bit by solving a 00:11:11.000 |
problem once, creating an agent, and then using it in multiple applications. 00:11:16.000 |
So, you already saw it with the standalone experience and Validator being part of our test generation graph. 00:11:23.000 |
But I'm going to give you one more lower-level example. 00:11:26.000 |
That's actually used for both of the products. 00:11:29.000 |
That's the lower-level abstraction that is required for us to be able to, you know, have 00:11:34.000 |
the agents execute builds and execute tests in our build system. 00:11:39.000 |
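A hypothetical sketch of what such a shared lower-level abstraction could look like; the class name, the Bazel-style commands, and the node shape are illustrative assumptions, not Uber's actual internals:

```python
# Hypothetical sketch: one small build/test primitive that graph nodes in
# any product (Validator, AutoCover, and so on) can call.
import subprocess
from dataclasses import dataclass

@dataclass
class RunResult:
    ok: bool
    output: str

class BuildSystem:
    """Thin wrapper around the build tool, callable from agent nodes."""

    def _run(self, *cmd: str) -> RunResult:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        return RunResult(proc.returncode == 0, proc.stdout + proc.stderr)

    def build(self, target: str) -> RunResult:
        return self._run("bazel", "build", target)  # assumed Bazel-style tool

    def test(self, target: str, coverage: bool = False) -> RunResult:
        return self._run("bazel", "coverage" if coverage else "test", target)

# Any graph node, in any product, can depend on the same primitive:
def run_tests_node(state: dict) -> dict:
    result = BuildSystem().test(state["target"], coverage=True)
    return {"passing": result.ok, "log": result.output}
```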
So, the last one, I think, is supposed to be a strategic learning. 00:11:43.000 |
So, we already talked about some of the tech learnings, but this is the one I'm most excited about. 00:11:48.000 |
Like, you can set up your organization for success if you want to build agentic AI. 00:11:53.000 |
I think we've done a pretty good job of it at Uber. 00:11:58.000 |
We're all building in collaboration, and I think this is our biggest takeaway. 00:12:02.000 |
The key thing is, you know, encapsulation enables collaboration. 00:12:07.000 |
So, when there are well-thought-out abstractions like LangGraph, and there are opinions on how 00:12:13.000 |
to do things, like how to handle state management and how to deal with concurrency, it really allows 00:12:21.000 |
us to tackle more problems, and more complex problems, without creating all this operational overhead. 00:12:29.000 |
An example I'll give you is, our security team was able to write tools for Validator, like the best-practices checks. 00:12:37.000 |
We knew nothing about this part of security. 00:12:40.000 |
They knew nothing about AI agents and how they're constructed. 00:12:44.000 |
But they were still able to add value to the larger body of work. 00:12:48.000 |
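A hypothetical sketch of that division of labor: a domain team contributes plain functions as tools, and the platform team wires them into the agent, so neither needs the other's expertise. The check itself is invented for illustration:

```python
# Hypothetical sketch of the encapsulation boundary.
from langchain_core.tools import tool

# Owned by the security team: ordinary domain logic, no AI knowledge needed.
@tool
def check_temp_file_usage(code: str) -> str:
    """Flag temp-file APIs that leak because they are never cleaned up."""
    if "ioutil.TempFile" in code and "defer os.Remove" not in code:
        return "violation: temporary file is never removed"
    return "ok"

# Owned by the platform team: bind contributed tools into the agent graph.
security_tools = [check_temp_file_usage]  # grows as teams contribute checks
```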
And so, like, a natural segue from that is, if you're able to encapsulate, you know, work into these well-defined units, 00:12:55.000 |
Then, like graphs are the next thing to think about, right? 00:12:58.000 |
Like graphs model these interactions perfectly. 00:13:02.000 |
They often naturally mirror how developers already interact with the system. 00:13:07.000 |
So, when we do the process engineering and identify process bottlenecks and inefficiencies, 00:13:14.000 |
it doesn't just help accelerate or boost the AI workflows. 00:13:18.000 |
It also helps improve the experience for people not even interacting with the AI tools, right? 00:13:23.000 |
So, it's not, like, a question of should we invest in this or should we improve our existing systems; 00:13:29.000 |
it usually ends up, like, each helping the other. 00:13:32.000 |
Like, you know, we talked about our agentic test generation, and we found multiple inefficiencies, 00:13:38.000 |
like, how are you doing mock generation quickly? 00:13:41.000 |
How do you modify build files and, like, interact with the build system? 00:13:48.000 |
And in the process of, like, fixing all these paper cuts, we improved the experience for, just, like, 00:13:55.000 |
non-AI applications and developers interacting with our systems. 00:14:03.000 |
And, you know, with that, I want to bring this to an end. 00:14:10.000 |
Hopefully, you all learned something and will take something back to your companies.