back to indexWhat's next for AI agents ft. LangChain's Harrison Chase
00:00:05.640 |
You know, one of the reasons I was really excited to come back today was because I think 00:00:09.800 |
it was a year ago at this event that I met Harrison, and I thought, "Boy, if I get to 00:00:14.420 |
meet super cool people like Harrison, I'm definitely going to come back this year." 00:00:24.760 |
Those of you that don't, you know, pull up your laptop, run pip install LangChain. 00:00:26.600 |
If you aren't using LangSwift yet, I'm a huge fan. 00:00:29.200 |
And Harrison works a massive developer community. 00:00:31.600 |
If you look at the pip, you know, PiPi download stats, I think LangChain is by far the leading 00:00:37.200 |
generative AI orchestration platform, I think. 00:00:40.180 |
And this gives us a huge view into a lot of things happening in generative AI. 00:00:43.800 |
So I'm excited to have him share with us what he's seeing with AI agents. 00:00:53.980 |
So LangChain is a developer framework for building all types of LLM applications. 00:00:57.560 |
But one of the most common ones that we see being built are agents. 00:01:00.900 |
And we've heard a lot about agents from a variety of speakers before. 00:01:06.200 |
So I'm not gonna go into too much of a deep kind of, like, overview. 00:01:11.160 |
But at a high level, it's using a language model to interact with the external world 00:01:19.040 |
And so tool usage, memory, planning, taking actions is kind of the high level gist. 00:01:25.160 |
And the simple form of this you can maybe think of as just running an LLM in a for loop. 00:01:34.880 |
And then you keep on doing that until it decides it's done. 00:01:37.440 |
So today I want to talk about some of the areas that I'm really excited about that we 00:01:42.240 |
see developers spending a lot of time in and really taking this idea of an agent and making 00:01:48.480 |
it something that's production ready and real world and really, you know, the future of 00:01:56.100 |
So there's three main things that I want to talk about. 00:01:57.920 |
And we've actually touched on all of these in some capacity already. 00:02:02.880 |
So planning, the user experience, and memory. 00:02:06.800 |
So for planning, Andrew covered this really nicely in his talk. 00:02:12.880 |
The basic idea here is that if you think about running the LLM in a for loop, oftentimes 00:02:17.960 |
there's multiple steps that it needs to take. 00:02:20.280 |
And so when you're running it in a for loop, you're asking it implicitly to kind of reason 00:02:24.760 |
and plan about what the best next step is, see the observation, and then kind of like 00:02:29.640 |
resume from there and think about what the next best step is right after that. 00:02:35.880 |
Right now at the moment, language models aren't really good enough to kind of do that reliably. 00:02:41.080 |
And so we see a lot of external papers and external prompting strategies kind of like 00:02:46.200 |
enforcing planning in some method, whether this be planning steps explicitly up front 00:02:53.360 |
or reflection steps at the end to see if it's kind of like done everything correctly as 00:02:59.560 |
I think the interesting thing here, thinking about the future, is whether these types of 00:03:03.760 |
prompting strategies and these types of cognitive architectures continue to be things that developers 00:03:09.320 |
are building or whether they get built into the model APIs, as we heard Sam talk a little 00:03:15.640 |
And so for all three of these, to be clear, I don't have answers. 00:03:19.560 |
And so one of my questions here is, are these planning, prompting things short-term hacks 00:03:30.680 |
Another kind of like aspect of this is just the importance of basically flow engineering. 00:03:35.120 |
And so this term I heard come out of this paper, Alpha Codium. 00:03:38.440 |
It basically achieves state-of-the-art kind of like coding performance, not necessarily 00:03:42.240 |
through better models or better prompting strategies, but through better flow engineering. 00:03:45.700 |
So explicitly designing this kind of like graph or state machine type thing. 00:03:51.440 |
And I think one way to think about this is you're actually offloading the planning of 00:03:55.040 |
what to do to the human engineers who are doing that at the beginning. 00:03:58.920 |
And so you're relying on that as a little bit of a crutch. 00:04:03.000 |
The next thing that I want to talk about is the UX of a lot of agent applications. 00:04:06.720 |
This is actually one area I'm really excited about. 00:04:08.440 |
I don't think we've kind of nailed the right way to interact with these agent applications. 00:04:13.520 |
I think human in the loop is kind of still necessary because they're not super reliable. 00:04:19.960 |
But if it's in the loop too much, then it's not actually doing that much useful thing. 00:04:23.760 |
So there's kind of like a weird balance there. 00:04:26.040 |
One UX thing that I really like from Devin, which came out a week, two weeks ago, and 00:04:32.920 |
Jordan B kind of put this nicely on Twitter, is the presence of a rewind and edit ability. 00:04:39.480 |
So you can basically go back to a point in time where the agent was and then edit what 00:04:44.720 |
it did or edit the state that it's in so that it can make a more informed decision. 00:04:48.400 |
And I think this is a really, really powerful UX that we're really excited about at LingChain 00:04:54.640 |
And I think this brings a little bit more reliability, but at the same time kind of 00:05:02.880 |
And speaking of kind of like steering ability, the last thing I want to talk about is the 00:05:08.960 |
And so Mike at Zapier showed this off a little bit earlier where he was basically interacting 00:05:13.960 |
with the bot and kind of like teaching it what to do and correcting it. 00:05:17.360 |
And so this is an example where I'm teaching in a chat setting an AI to kind of like write 00:05:24.040 |
And so you can see that I'm just correcting it in natural language to get to a style that 00:05:29.560 |
The next time I go back to this application, it remembers the style that I want. 00:05:34.480 |
I can keep on making it a little more differentiated. 00:05:37.040 |
And then when I go back a third time, it remembers all of that. 00:05:39.480 |
And so this I would kind of classify as kind of like procedural memory. 00:05:42.840 |
So it's remembering the correct way to do something. 00:05:45.320 |
I think another really important aspect is basically personalized memory. 00:05:49.680 |
So remembering facts about a human that you might not necessarily use to do something 00:05:54.100 |
more correctly, but you might use to make the experience kind of like more personalized. 00:05:58.720 |
So this is an example kind of like journaling app that we are building and playing around 00:06:04.340 |
And you can see that I mentioned that I went to a cooking class, and it remembers that 00:06:08.440 |
And so I think bringing in these kind of like personalized aspects, whether it be procedural 00:06:13.460 |
or kind of like these personalized facts, will be really important for the next generation 00:06:23.280 |
And excited to chat about all of this, if anyone wants to chat about this after.