MCP at Sourcegraph | Code w/ Claude

Chapters
0:00 Introduction
1:55 New era of AI
4:25 AMP demo
11:40 AMP architecture
12:43 Recipe for AI agents
15:04 MCP tools
17:22 Secure MCP
18:45 Future of MCP
I'm the CTO and co-founder of a company called Sourcegraph. There might be some of you who have used our products. We build developer tools for professional engineers. For those of you who haven't heard of us, I think it's something like seven of the top 10 software engineering companies by market cap, six of the top 10 US banks, and multitudes of companies building software and writing code across basically every industry vertical. And today, I'm here to talk about our journey with MCP, and in particular, how we're integrating the Model Context Protocol deeply into the fabric of our architecture.
So our journey actually began quite some time ago, in the summer of last year, when this fellow David reached out and said, hey, I heard you guys are doing a lot of work fetching the appropriate context into the context window and getting models to perform better on coding tasks. We're working on a thing that you might find interesting. And we're like, huh, that sounds interesting. And he was like, it's kind of like LSP, but for model context. And we're like, wow, that does sound interesting. So we started chatting, and we ended up becoming one of the early design partners for the MCP protocol. It was really a privilege to work with David and the team to kind of guide the evolution of that protocol.
As we kept building our developer tools, we soon came to this realization that, holy crap, AI is changing everything. And specifically, we felt that tool calling models would lead to another paradigm shift in the standard AI application architecture. There have been something like three waves of AI application architecture so far.
The first wave was sort of the co-pilot wave, where the architecture of those applications was dictated by the capabilities of the first LLMs. If you go back to the ancient year of 2022, models weren't yet tuned to respond in a chat fashion or use tools. They were just these kind of text completion models, and the applications we saw built on top of AI followed this paradigm: the human would type, the model would complete the next couple of tokens, and the human would type some more. That was the interaction paradigm.
And then ChatGPT came along, and that ushered in a new modality: you were able to chat with this thing and make explicit asks. And one of the things we soon realized in that world was, hey, if you copy and paste relevant code snippets into the context window and then ask it to answer the question, it gets a lot better in terms of quality and usefulness on production code bases. That's what I like to call the RAG chat era of AI.
I think we are all aware that we're now entering a new era, and this era is really being dictated by the capabilities of tool calling models. So when we took a look at the tooling suite we had built so far, the more we looked, the more we realized that a lot of the underlying assumptions of building on top of LLMs have changed with tool calling agents and MCP. We might have to rethink this application from the ground up.
So we built a completely new coding agent called AMP, and I'll show you what it's able to do. I think the best way to talk about how we've integrated the tool calling models that Anthropic has been shipping, in conjunction with the Model Context Protocol, is to show you AMP in action, using a bunch of tools to complete tasks. And I think this might be the longest live demo of the day, so all the prayers that you said for Brad and other folks...
So I have described the change I want in this Linear issue: we're going to change the background panel of AMP. And I'm just going to have it implement the issue. You can see it uses a Linear tool that is provided through an MCP server. I didn't have to @-mention it or prompt it in any special way; it just knew to use that tool to fetch that piece of context. And it's going to do a couple more agentic steps here to find the appropriate context within the code base.
But I just want to point out this Linear tool. It is actually the official Linear MCP server, which I currently think is one of the best MCP servers out there. The way we've implemented it is you just plug in the URL, and there's a piece that I'll talk about a little bit later that secures the connection and handles the secret exchange with Linear or whatever upstream service you're talking to. And that's how we're integrating this capability.
What I'm going to do next is talk a little bit more about the AMP architecture. It integrates MCP servers in the same way as the editor. So I'm going to ask AMP itself: what are the main architectural components of AMP?
And while that's running: I don't know if any of you are aware that there's this thing people do now where they watch two unrelated YouTube videos side by side. It's like a Gen Z thing. And I thought, what is the coding equivalent of that? So instead of playing a game on the side, let's use MCP again and say: find the Linear issue, and we're going to try to vibe code 3D Flappy Bird on the side.
So over here, we got a pretty detailed textual explanation of the architecture. But a picture is worth many more words than just text, so why don't we ask it: can you draw a diagram showing me how these components connect and communicate? You can see it's using a lot of tool calls along the way; each one of these long text entries is a tool call. It actually uses this other MCP server, the Playwright MCP server, to interact with the browser and take screenshots, and it's going to use that as part of its feedback loop. When something goes wrong, most of the time it's not worth diving in. I just ask it to rerun, and most of the time it just works.
Over here, let's take a look at the architectural diagram. If this were in the editor, it would just show this. Everything is routed through this kind of server-side thread component, which talks to the MCP integration, which in turn talks to all the services that we're integrating through the Model Context Protocol. That makes it easy to grok what's happening in a large-scale code base. Now let's see if 3D Flappy Bird is done here.
So it looks like it got as far as running a Python web server. And then let's finally check back in on the first thing. So while I was explaining to you how AMP works, it was able to make a change in itself, explain its architecture, and vibe code a game on the side. Oh, and it also marked the Linear issue as done, which is cool.
And we've really incorporated MCP in a very deep way. One of the things about AMP's architecture is that we have an AMP client, there's an AMP server, and then there are these external services and local tools. MCP is how we connect to all these different types of tools and services. When we're talking to local tools like Playwright or Postgres, that's over MCP. And when we're talking to remote services, whether it be first-party services like the Sourcegraph code search engine, which is really good at searching over large-scale code bases, or other services such as your issue tracker or your observability tool, that talks MCP as well. What the AMP server does is also route these MCP connections through a layer that securely handles secrets and forwards the identity of the user to the upstream service.
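To make the client/server split concrete, here is a minimal sketch of the routing idea, not AMP's actual code: local tools are spoken to directly on the user's machine, while remote services go through the server layer that holds the credentials. All class, tool, and method names here are hypothetical.

```python
# Illustrative sketch: route tool calls either to a local stdio tool or
# through a server-side proxy, mirroring the client/server split above.
# Every name here is invented for the example.

class LocalTool:
    """A tool spoken to over standard I/O on the user's machine (e.g. Playwright)."""
    def __init__(self, name):
        self.name = name
    def call(self, method):
        return f"{self.name} (local stdio): {method}"

class RemoteService:
    """A service reached through the AMP server, which holds the credentials."""
    def __init__(self, name):
        self.name = name
    def call(self, method):
        return f"{self.name} (proxied via AMP server): {method}"

REGISTRY = {
    "playwright": LocalTool("playwright"),
    "postgres": LocalTool("postgres"),
    "linear": RemoteService("linear"),
    "sourcegraph": RemoteService("sourcegraph"),
}

def dispatch(tool, method):
    """Route a tool call to the right transport based on where the tool lives."""
    return REGISTRY[tool].call(method)

print(dispatch("playwright", "screenshot"))  # → playwright (local stdio): screenshot
print(dispatch("linear", "get_issue"))       # → linear (proxied via AMP server): get_issue
```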
One of the things that we realized in building this is that there is kind of a new emerging recipe for AI agents. Just like RAG chat was the model for the previous era, I think this is the rough formula for the new era. And we actually wrote a blog post about how this is not that complicated; really, anyone can write a simple agent in very little time. So if any of you want to try your own hand at writing a simple coding agent, just go to that blog post.
But the recipe we found is that you need maybe four things. You need a tool-use LLM, which the latest Claude model provides. We're really excited about Claude 4's capabilities; we've been playing around with it for the past couple of weeks, and all the stuff that I just demoed was running off of Claude 4. You need a way to connect the model to tools and context, and MCP just so happens to be the perfect solution for that. And the thing we found is really important is to really focus on the feedback loops. When AMP was making a change to itself, what it was doing was taking screenshots of the changes it was making to the app along the way and using those screenshots to validate its work. And that change was actually pretty non-trivial, because the component hierarchy has a lot of containers.
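The recipe above can be sketched as a tiny loop. This is a toy illustration, not AMP's implementation: the model here is a scripted stub standing in for a real tool-use LLM, and the message shapes and tool names are invented for the example.

```python
# Minimal sketch of a tool-calling agent loop: ask the model, execute any
# tool call it requests, feed the result back, repeat until it answers.
# The "model" is a scripted fake; a real agent would call an LLM API here.

def run_agent(model, tools, user_prompt, max_steps=10):
    """Drive the loop until the model replies with plain text."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_steps):
        reply = model(messages)
        if reply["type"] == "tool_call":
            # Execute the requested tool and append its result as context.
            result = tools[reply["name"]](**reply["args"])
            messages.append({"role": "tool", "name": reply["name"],
                             "content": str(result)})
        else:  # plain text: the agent is done
            return reply["content"]
    raise RuntimeError("agent did not finish within max_steps")

# A scripted fake model: first fetch the issue, then answer using it.
def fake_model(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"type": "tool_call", "name": "get_issue", "args": {"id": "ENG-42"}}
    return {"type": "text", "content": "Issue ENG-42: " + messages[-1]["content"]}

tools = {"get_issue": lambda id: f"change the panel background ({id})"}
print(run_agent(fake_model, tools, "implement ENG-42"))
# → Issue ENG-42: change the panel background (ENG-42)
```

The point of the sketch is how little scaffolding the loop itself needs; the quality comes from the model, the tools, and the feedback loops.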
And part of this, too, is that if you design the feedback loops properly, our thesis is that the UX becomes a lot more imperative. In the previous era of AI applications, there was a lot of UI built to invoke the chat-based models sort of in situ. Oftentimes, the best interaction we've found is to just ask the agent to do it and then refine the feedback loops so that it's able to get those things done reliably.
In terms of tool usage in AMP, our most popular tools are probably the ones I listed above. There are some local ones like Playwright and Postgres. There's also a great way to integrate web search: you can use Anthropic's Web Search API, and there's also Brave Web Search, which is really nice. Context7 is a popular MCP server that pulls in different documentation corpuses. And of course there's Linear, which I just showed, which allows you to do this kind of issue-to-PR workflow. The Linear team really focused in on the quality of the tool descriptions, and that ends up being really essential to making MCP servers good.
And that actually leads me to one of the pitfalls that we found in integrating MCP servers into our agent, which is a trap that we see some people fall into. It's this practice of, like, MCP, MCP, MCP: you just plug in two dozen MCP servers. But when you think about how it's implemented underneath the hood, each one of those tool descriptions gets shoved into the context window and can confuse the model. The more irrelevant stuff that goes into context, the less intelligent the model is about tool selection and about general reasoning. So in terms of how we baked MCP into our application, we've limited the set of tools that a particular MCP server provides to a smaller subset that we think are really essential to the workflows.
And roughly speaking, there are three buckets of tools. There are the ones that are devoted to finding relevant context. There are the ones that can provide high-quality feedback, such as invoking unit tests or invoking the compiler. And finally, there are the ones involved in declaring success, like marking the issue as done, or pinging the user to say, hey, I'm ready for your feedback.
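As a rough illustration of trimming a server's advertised tool list down to the essential buckets before the descriptions reach the context window, here is a sketch; the tool names, bucket labels, and selection rule are all hypothetical, not AMP's real configuration.

```python
# Illustrative sketch: keep only an essential subset of the tools an MCP
# server advertises, so fewer tool descriptions land in the context window.
# Tool names and bucket labels are invented, not AMP's real config.

ADVERTISED_TOOLS = [
    {"name": "search_code", "description": "Find relevant code", "bucket": "context"},
    {"name": "run_tests", "description": "Run the unit tests", "bucket": "feedback"},
    {"name": "mark_issue_done", "description": "Close the issue", "bucket": "done"},
    {"name": "export_csv", "description": "Export data as CSV", "bucket": "misc"},
    {"name": "admin_settings", "description": "Change settings", "bucket": "misc"},
]

# The three buckets from the talk: finding context, feedback, declaring success.
ESSENTIAL_BUCKETS = {"context", "feedback", "done"}

def select_tools(tools, buckets=ESSENTIAL_BUCKETS):
    """Drop tools outside the essential buckets before prompting the model."""
    return [t for t in tools if t["bucket"] in buckets]

kept = select_tools(ADVERTISED_TOOLS)
print([t["name"] for t in kept])  # → ['search_code', 'run_tests', 'mark_issue_done']
```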
I also want to talk a little bit about securing MCP. That's a high priority for us, given how many of our customers are working in large-scale production code bases. The original MCP spec didn't have anything around auth. They've since integrated OAuth 2 as the designated authentication protocol, which I think was a really smart decision; it's what we've used in the past to integrate with other services.
And I think a lot of what you see in the wild in terms of auth today is pretty ad hoc. Even with the existence of remote MCP servers, someone made this NPM plugin that just converts a remote MCP server into a local one, so the application feels like it's still talking over standard I/O. It handles the auth handshake, but as a consequence, it just shoves the secrets, like your secret tokens to your other services, into some random plain text directory. And that's kind of a no-go for a lot of our customers.
And so as part of this handshake, we actually implemented a secure secret store, where the AMP server takes care of the OAuth handshake and proxies the MCP connection, so the secrets stay on the server side.
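Here is a minimal sketch of that proxy pattern, under the assumption that tokens live in a server-side store and get attached just before forwarding; the `SecretStore` class and the `forward` callback are invented names for illustration.

```python
# Illustrative sketch of the server-side proxy pattern: the client sends
# MCP requests without any upstream credentials; the server looks up the
# user's token in a secret store and attaches it before forwarding.

class SecretStore:
    """Stands in for an encrypted server-side secret store."""
    def __init__(self):
        self._tokens = {}
    def put(self, user, service, token):
        self._tokens[(user, service)] = token
    def get(self, user, service):
        return self._tokens[(user, service)]

def proxy_request(store, user, service, request, forward):
    """Attach the user's token server-side, then forward to the upstream
    MCP server. The client-side request never contains the secret."""
    headers = {"Authorization": f"Bearer {store.get(user, service)}"}
    return forward(service, request, headers)

store = SecretStore()
store.put("alice", "linear", "tok_abc123")  # saved after the OAuth handshake

# Fake upstream transport so the sketch is runnable without a network.
def fake_forward(service, request, headers):
    assert headers["Authorization"].startswith("Bearer ")
    return {"service": service, "result": f"handled {request['method']}"}

print(proxy_request(store, "alice", "linear",
                    {"method": "tools/call"}, fake_forward))
# → {'service': 'linear', 'result': 'handled tools/call'}
```

The design choice this illustrates: because the token is injected server-side, nothing secret ever needs to be written to a plain text directory on the developer's machine.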
So there's been a lot of great talk at this conference about sub-agents, and one of the things we found is the extent to which sub-agents can be the way to gather context. In the demo, the way that AMP was gathering context about the code base was actually a sub-agent. It goes in and invokes different low-level search tools, reasons about the context it gathers, and refines queries to gather more. And we found that approach works just super, super well. We're only scratching the surface here.
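A toy sketch of that context-gathering sub-agent idea: run a low-level search, look at what came back, and refine the query if it found nothing. The corpus and the refinement rule here are stand-ins, not AMP's actual search stack.

```python
# Toy sketch of a context-gathering sub-agent: it tries a query against a
# low-level search tool and falls back to a broader term when nothing hits.
# The corpus, queries, and refinement rule are all invented stand-ins.

CORPUS = {
    "panel.tsx": "export function Panel() { /* background color here */ }",
    "theme.ts": "export const background = '#1e1e2e'",
    "readme.md": "AMP is a coding agent",
}

def search(query):
    """Low-level keyword search over file contents (stand-in tool)."""
    return [f for f, text in CORPUS.items() if query in text]

def gather_context(candidate_terms, max_rounds=5):
    """Try each candidate term in turn (the 'refinement'), stopping once a
    search round returns hits worth handing back to the main agent."""
    gathered = []
    for term in candidate_terms[:max_rounds]:
        hits = search(term)
        gathered.extend(h for h in hits if h not in gathered)
        if gathered:
            break  # enough context for this toy example
    return gathered

# First query is too specific and misses; the refined query succeeds.
print(gather_context(["panelBackground", "background"]))
# → ['panel.tsx', 'theme.ts']
```

In a real sub-agent the refinement step would itself be an LLM reasoning about the results, but the shape of the loop is the same.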
And I think we touched upon this in some of the earlier talks. Right now, the tool calling paradigm is largely: you have a static list of tools, and the model will go and invoke each one, one by one, and look at the output. A lot of people, not just us, have pointed out, hey, it might be useful if we incorporated a notion of output schema into MCP, and that just got merged in. With output schemas, the model can sort of plan out how to invoke these tools and compose them. And if you squint, at some point you're basically programming, right? You have these functions, you have these tools, and you're composing them. So I think it's a really good time to revisit code interpreters, which were first a thing back in 2023. With the advent of tool calling agents, there's a lot more potential to explore there.
I was talking with David from Anthropic earlier this week. He stopped by our office, and we were talking about a lot of the parallels between tool calling LLMs and agents and the kind of discourse that was active when high-level programming languages first became a thing. Before programming languages settled on the abstractions that dominate today, people debated questions like: what's the proper abstraction for a subroutine? Is there a way to manage concurrent communication effectively? And I think we're just going to revisit all of that now, because the analogies between programming languages and agents and sub-agents, how they interact with each other and also with deterministic systems, are so strong.
And then the last point here is about the way most people integrate MCP right now. If you look at the part of the spec that the vast majority of MCP clients implement versus the entire protocol, I think the protocol designers were very, very forward-thinking. There is a lot baked into the protocol, like stateful session management and sampling, which is when the MCP server calls the client to do LLM inference. But the vast majority of MCP connections right now are stateless tool calls. We're literally scratching the tip of the iceberg of what we can do with tool calling LLMs and tools provided through MCP. The next year, I think it's going to get really weird.
So I guess, you know, we're super excited to be here today. It's been a great partnership with Anthropic; I think we're one of the earliest adopters of Claude. And we're trying to build AMP from the ground up into this kind of tool-calling-native coding agent. So if that's of interest to you, check us out. We're excited to see what people build on top of it and with it, and also a lot of the MCP servers that people will build. If you're doing that, please get in touch with us as well. We'd love to hear from you and figure out how this can fit into the new way of doing software development.
With that, do I have time for questions, or one question?

Question: how are you thinking about Cody and AMP playing together?

So we also have this AI coding assistant called Cody, which was one of the first RAG chat, context-aware coding assistants. Recall the earlier picture with the three waves: Cody was an awesome AI application built for the RAG chat era of models. And I think there are still a lot of organizations that will find a lot of value from that paradigm, and still a lot of workflows that can benefit from it. But because the underlying assumptions of what LLMs can do have changed so much, we think that the best user experience for the agentic world is going to come from an application that was designed from the ground up for it. And that's why we built AMP as a separate application from Cody. And if I'm being a little bit snarky, I think if you're not rethinking that architecture, as an application developer you're at risk of falling behind and missing the next wave of AI development.