back to indexMCP 201 | Code w/ Claude

00:00:23.000 |
Today I'm going to tell you a little bit more about the protocol 00:00:28.000 |
just to give you an understanding of what there's more to the protocol 00:00:33.000 |
than what most people use it for at the moment, which would be tools. 00:00:37.000 |
So really the goal today is to showcase you what the protocol is capable of 00:00:43.000 |
and how you can use it in ways to build richer interactions with MCP clients 00:00:48.000 |
that goes beyond the tool calling that most people are used to. 00:00:54.000 |
And I will first go through all the different like what we call primitives, 00:00:58.000 |
like ways for the servers to expose information to a client 00:01:03.000 |
before we go into some of the bit more lesser known aspects of the protocol. 00:01:08.000 |
And then I want to talk a little bit about like how to build a really rich interaction 00:01:13.000 |
before we take a little stab of what's coming next for MCP 00:01:20.000 |
But to just get you started, I want to talk about one of the MCP primitives 00:01:25.000 |
that servers can expose to MCP clients, but very few people know. 00:01:32.000 |
And what prompts are really are predefined templates for AI interactions. 00:01:39.000 |
And that's to say it's a way for an MCP server to expose a set of texts, 00:01:45.000 |
you know, like a prompt in a way that allows users to directly add this to the context window 00:01:55.000 |
and see how they would use, for example, the MCP server you're building. 00:02:01.000 |
And they're really the two main use cases here is for you as an MCP server author 00:02:10.000 |
to provide an example for that you can showcase to the user 00:02:15.000 |
so that they know how to use the MCP server in the best way. 00:02:20.000 |
Because realistically, you are the one who has built it. 00:02:22.000 |
You are the one who knows how to use it in the best possible way. 00:02:26.000 |
And probably at the time you would release it are the one who has used it the most time. 00:02:30.000 |
But since MCP prompts are also dynamic in a way, 00:02:37.000 |
they're just code under the hood that are executed in MCP server, 00:02:41.000 |
they allow you to do even richer things than that. 00:02:45.000 |
What you can do, and I want to showcase this in this scenario, 00:02:48.000 |
is an MCP prompt that a user invokes in this Z editor here 00:02:55.000 |
that will fetch directly GitHub comments into my context window. 00:03:04.000 |
And so what you see me here doing is just basically put into the context window 00:03:09.000 |
the comments from my pull request that I've written so that I can go and interact with it 00:03:16.000 |
and have then the model go and help me apply the changes that's been requested to me 00:03:26.000 |
And so this is really a way for exposing things that the user should directly interact 00:03:32.000 |
and the user should directly want to put into the context window 00:03:38.000 |
So it's different from that from tools where the model decides when to do it. 00:03:48.000 |
And if you look carefully, there's one additional thing that very, very few people know 00:03:54.000 |
that you can do, and that is prompt completion. 00:03:57.000 |
So if you have looked carefully, there was a way where it showcased quickly a pop-up 00:04:02.000 |
of me selecting the pull requests that are available to myself. 00:04:07.000 |
And that is a way that you can -- that is a thing that you can provide as an MCP server author 00:04:13.000 |
to build richer parameterized templates, for example. 00:04:17.000 |
And this is exceptionally easy to do in the code. 00:04:20.000 |
Like if you were in TypeScript, building a prompt that provides users with like such a template 00:04:27.000 |
and have parameters for it and like auto-completion is nothing more than a few lines of code 00:04:33.000 |
that cloud code together with cloud four can most of the time write basically for you. 00:04:42.000 |
It's a function for the completion and it's a function for generating the prompt. 00:04:45.000 |
And so this is already like one of these primitives you can use to build an interaction for users 00:04:50.000 |
with an MCP server, but it's just a little bit more richer than a tool call. 00:04:55.000 |
And a second one of these is something that we call resources. 00:05:00.000 |
It's another primitive that an MCP server can expose to an MCP client. 00:05:04.000 |
And while prompts are really focused on text snippets that a user can add into the context window, 00:05:12.000 |
resources are about exposing raw data or content from a server. 00:05:22.000 |
One thing is most of the clients today would allow you to add this raw content directly to the context window. 00:05:29.000 |
So in that way, they're not that different from prompts. 00:05:33.000 |
But it also allows application to do additional things to that raw data. 00:05:41.000 |
And that could be, for example, building embeddings around this data that server exposes, 00:05:47.000 |
and then do retrieval augmented generation by adding to the context window the most appropriate things. 00:05:57.000 |
And so this is an area that at the moment I feel is a bit underexplored. 00:06:01.000 |
And I just want to quickly showcase you how resources work. 00:06:05.000 |
In this case, this is, again, one of these ways where an MCP client exposes a resource as a file-like object. 00:06:15.000 |
And in this scenario here, we are exposing the database schema for a Postgres database as resources. 00:06:23.000 |
And then you can add them in Cloud Desktop just like files. 00:06:27.000 |
And that way you can tell Cloud, this is the tables I care about, and now please go ahead and visualize them. 00:06:33.000 |
And so in this scenario, what you're going to see is Cloud is going to go and write a beautiful diagram that visualizes the database schema for me. 00:06:47.000 |
There's a lot of unexplored space still here, again, if you go beyond just adding a file again and think about retrieval augmentation 00:06:54.000 |
or any other thing the application might want to do. 00:07:01.000 |
One is prompts, again, the things that a user interacts with. 00:07:05.000 |
The second one is resources that the application interacts with. 00:07:09.000 |
Then, of course, there should be a third one that you all are very familiar with, that I don't want to get into too much depth, 00:07:16.000 |
because if you have built an MCP server, you probably have built it for exposing a tool. 00:07:22.000 |
And so tools are really these actions, of course, that can be invoked. 00:07:27.000 |
That's like one of the, I think, most magical moment I feel when you build an MCP server is when the model for the first time invokes something that you care about, 00:07:38.000 |
that you have built for and has this little impact on, you know, it might be like carrying a database for you or whatever it might be. 00:07:45.000 |
But this is, again, the thing that the model decides when to call to an action. 00:07:51.000 |
And so these are three very basic primitives that the protocol exposes. 00:08:01.000 |
And if you think carefully about these three primitives that I just showcased to you, there's a little bit of overlap about, like, how do you use, like, when do you use what, really? 00:08:13.000 |
And so there's something that we don't talk enough about, and it's somewhere buried in the specification language of the model context protocol, 00:08:26.000 |
And I think showcasing it hopefully makes clear when you use what. 00:08:32.000 |
Because the interaction model is built in such a way that you can expose the same data in three different ways, depending on when you want to have it show up. 00:08:43.000 |
And prompts, again, are these user-driven things. 00:08:46.000 |
It's the thing the user invokes, adds to the context window. 00:08:49.000 |
And the most common scenario where how you see these pop up is a slash command, an add command, something like that. 00:08:58.000 |
Resources, on the other hand, are all application-driven. 00:09:02.000 |
The application that uses the LLM, be it Cloud Desktop, be it VS Code, something like that, fully decides what it wants to do with that. 00:09:12.000 |
And then, lastly, tools are driven by the model. 00:09:15.000 |
And between, you know, an AI application using a model and a user, we have all three parts that we eventually cover using these three basic primitives. 00:09:28.000 |
And that allows you already to go to a little bit of a richer application and experience than what most people can currently do with tools. 00:09:38.000 |
Because you just have a way to interact with the user a bit more nuanced than if you just wait for this model to call the tool. 00:09:51.000 |
Because while these basic primitives get us a little bit further than what we see most MCP servers do at the moment, there are even richer interactions that we want to enable. 00:10:03.000 |
And to make this a bit more understandable, here's really an example I want to give you that showcases this problem. 00:10:11.000 |
So how can you build an MCP server, for example, that summarizes a discussion from an issue tracker? 00:10:18.000 |
So on one side I can build an MCP server that exposes this kind of data, very simple. 00:10:27.000 |
Because for the summarization step, I obviously need a model. 00:10:31.000 |
And so one way to go and build this is you can build an MCP server that is this issue tracker server. 00:10:41.000 |
You can bring your own SDK and call the model, have the model summarizes. 00:10:50.000 |
And the problem is that the client has a model selected, be it like Claude or whatever else. 00:10:55.000 |
But the server, the MCP server that you've built, it doesn't know what model the client has configured. 00:11:04.000 |
And so you bring your own SDK off of a model provider, and be it the anthropic SDK, you still need them, like an API key that this user needs to provide. 00:11:16.000 |
And so MCP has a little hidden feature or a little primitive called sampling that allows a server to request a completion from the client. 00:11:30.000 |
It means that the server can use a model independently from, like, don't having to provide an SDK itself, but asks the client which model you have configured. 00:11:45.000 |
And the client is the one providing the completion to the server. 00:11:52.000 |
First of all, it allows the client to get full control over the security, privacy, and the cost. 00:11:58.000 |
So instead of having to provide an additional API key, you might tap into the subscription that your client might already have. 00:12:04.000 |
But it allows also a second part, which is that if you take multiple, if you chain MCP servers in an interesting way, it makes this whole pattern very recursive. 00:12:28.000 |
You can take an MCP server that exposes a tool. 00:12:32.000 |
But during the tool execution, you might want to use more MCP servers downstream. 00:12:37.000 |
And somewhere downstream in this, like, system, there might be then your Azure Tracker server that wants to go and have a completion. 00:12:47.000 |
But using sampling, you can bubble up the request such that the client always stays in full control over the cost of the subscription, whatever you want to use. 00:12:57.000 |
It stays in full control of the privacy over the cost of this interaction and basically manages every interaction that an MCP server wants to do with a model. 00:13:08.000 |
And that allows for very powerful chaining and it allows for more complex patterns that go already into ways of how you can build little MCP agents. 00:13:26.000 |
Sampling at the moment is sadly, I think, one of the more exciting features, but also one of the features that's the least supported in clients. 00:13:35.000 |
For our first-party products, we will bring sampling somewhere this year. 00:13:46.000 |
And so then you can hopefully start building more exciting MCP servers. 00:13:52.000 |
And then there's the last primitive that I want to touch on that's also a bit more interesting. 00:13:57.000 |
It's one of these things that, in retrospective, as one of the person who has built the protocol, I've probably named terribly, to be fair. 00:14:07.000 |
You will see this throughout the talk probably. 00:14:14.000 |
Because let's imagine I want to build today an MCP server that deals with my Git commands. 00:14:28.000 |
So now I'm going to hook up an MCP server into my favorite IDE. 00:14:32.000 |
But how does the IDE know, how does the MCP server know what are the open projects in the IDE? 00:14:41.000 |
Because obviously I want to run the Git commands in the workspaces I have opened, right? 00:14:46.000 |
And so roots is a way for the server to inquire from the client, such as VS Code, for example, what are the projects you have opened, so that I can operate within only those directories that the server has opened. 00:15:04.000 |
And I know where I want to execute my Git commands. 00:15:06.000 |
And this again is a feature that is not that widely used, but for example VS Code currently does support this. 00:15:14.000 |
And so these are, you know, just all the big primitives that MCP offers. 00:15:21.000 |
So we have five primitives, three on the server side, two on the client side. 00:15:25.000 |
But how do you put it all together to actually build a rich interaction? 00:15:30.000 |
We want to build something for users that's a bit richer than just tool calling. 00:15:35.000 |
And so let's take a look at how you will build a hypothetical MCP server that interacts with your favorite chat application, be that Discord, be it Slack. 00:15:46.000 |
You could use prompts to give examples to users, such as, like, summarize this discussion. 00:15:53.000 |
And you can use completions with recent threats, users, or whatever you want them to expose. 00:16:00.000 |
You can have additional prompts, like, what's new? 00:16:05.000 |
And so that's one way the user can just kickstart right away into using the server you've provided and get the ideas that you, how you intended it to be used. 00:16:18.000 |
And then you can use resources to directly list the channels, to expose recent threats that happen in the, you know, chat application, such that the MCP client can index it, deal with it in ways that it wants. 00:16:37.000 |
And then, of course, last but not least, we still have our tools. 00:16:40.000 |
We have search, we have read channels, we have reading of threats, and we will use sampling to summarize a threat, for example, and really expose this. 00:16:50.000 |
And that's a way to really build a much, much, much richer experience with MCP to use the full power that the protocol has to offer. 00:16:58.000 |
But this is just the beginning, because most of these experiences, if we build MCP servers so far, have been experiences that stayed local. 00:17:11.000 |
Out of the 10,000 MCP servers the community has built over the last six to seven months, the vast majority are local experiences. 00:17:19.000 |
But I think we can take the next step, and I think this is MCP's really big next thing, is bringing MCP servers away from the local experience to the web. 00:17:35.000 |
It means that instead of having an MCP server that is, you know, a Docker container or some form of local executable, 00:17:42.000 |
it is nothing else but a website that your client can connect to and expose this MCP and you talk to. 00:17:53.000 |
But for that, we need two critical components: we need authorization and we need scaling. 00:18:03.000 |
And in the most recent specification of MCP, we made a ton of changes towards this from the lessons we have learned and the feedback we honestly got from the community as well as key partners. 00:18:18.000 |
And we work closely, for example, with, like, people in the industry that worked on OAuth and other aspects to get this right. 00:18:29.000 |
And so with authorization in MCP, what you can do is you can basically provide the private context of a user that might be behind an online account or something directly to the LLM application. 00:18:45.000 |
And it really enables MCP authors to bind the capability of the MCP servers to a user or an online account or something like that. 00:18:55.000 |
And in order to do that, the way this currently has to work in MCP is that you do this by providing OAuth as the authorization layer. 00:19:05.000 |
And the MCP specification basically says you need to do OAuth 2.1, and that's a bit daunting because very few people know what OAuth 2.1 is. 00:19:14.000 |
But OAuth 2.1 is usually just OAuth 2.0 with all the basic things you would do anyway, all the security considerations that people that have done OAuth telling you anyway to do. 00:19:25.000 |
So it's just OAuth 2.0, a little bit cleaned up, and you're probably already doing it if you're doing OAuth. 00:19:32.000 |
And if you do implement this OAuth flow, you get two interesting patterns out of that. 00:19:38.000 |
And the first one is the scenario of an MCP server in the web. 00:19:44.000 |
And a good example of this is if you're, for example, a payment provider, and you have, you know, a website, payment.com, and I, as a user, have an online account there. 00:19:55.000 |
Now I, as the payment provider, can expose mcp.payment.com that the user can put into an MCP client, and the MCP client will do the OAuth flow. 00:20:06.000 |
I log in as my account, and I know this is payment.com. 00:20:10.000 |
I know this is the person that is my online account with the provider that I trust. 00:20:16.000 |
I don't trust some random Docker container running locally built by a third-party developer anymore. 00:20:21.000 |
I trust the person I already trust with the data anyway and their developers. 00:20:25.000 |
And on their development side, they can just, like, update this server as they want, and they don't have to wait for me to download a new, like, Docker image. 00:20:35.000 |
And so this is, I think, will be a really, really big step for enabling MCP servers to be exposed on the web and MCP clients to interact basically with all the online interactions that you already have. 00:20:51.000 |
And here is just a small little example of this. 00:20:54.000 |
In this scenario, we use Cloud AI integrations, which we launched earlier this month, to connect to a remote server and use this OAuth flow to log in our user to then have tools available that are aware of my data, that I care about it, that it is for me. 00:21:18.000 |
It enables enterprises to smoothly integrate MCP into their ecosystem, how they usually build applications. 00:21:30.000 |
It means that internally they can deploy an MCP server to their intranet or whatever, like, they use, and use an identity provider like Azure ID or Okta or whatever that central identity provider that you usually use for single sign-on. 00:21:49.000 |
And you can have that still exist and it will be the one that gives you the tokens that you require to interact with the MCP server. 00:22:00.000 |
And that is a lot to say that what it ends up with is a very smooth integration. 00:22:04.000 |
You're, as a development team internally, you're going to build an MCP server that you control, that you could control the deployment. 00:22:11.000 |
And the user just logs in in the morning with their normal SSO like they always would do. 00:22:16.000 |
And any time they use an MCP server from them on out, they will just be logged in and have access to the data that, you know, that is their data that the company has for them. 00:22:27.000 |
And so this, I think, enables a new way that I've already seen some of the big enterprises do to build really vast systems of MCP servers that allow part of the company to build a server while the other part deals about the integrations. 00:22:45.000 |
It really nicely separates integration builder and platform builders. 00:22:50.000 |
And then the second part that we require is scaling. 00:22:54.000 |
And we just added a new thing called streamable HTTP, which is just to say, it's a lot of words to say, basically, we want MCP servers to scale similar to normal APIs. 00:23:08.000 |
You have, as a server author, you can choose to either return results directly, as you would be in a REST API, except that it's not quite just REST. 00:23:17.000 |
Or if you need to, you can open a stream and get richer interactions. 00:23:22.000 |
So in the most simple way, you just want to provide a tool call result. 00:23:26.000 |
You get a request, return application JSON, and off you go. 00:23:32.000 |
And the next connection come in and, you know, gets served by yet another Lambda function. 00:23:38.000 |
But if you need richer interactions, such as notification or features we talked about, like sampling, a request comes in, you start a stream, the client accepts the stream, and now you're being able to send additional things to the client before you're returning finally the result. 00:23:57.000 |
And those authorization and scaling together is really the foundation to make MCP go from this local experience now to be truly a standard for how LLM applications will interact with the web. 00:24:12.000 |
And just to finish it all up, I just want to show you quickly about like what's coming for MCP in the next few months of some of the most important highlights. 00:24:21.000 |
And the most important part is that we are starting to think more and more about agents. 00:24:33.000 |
There are asynchronous tasks that you, of course, want to run, things that are longer running, that are not just like a minute long, but maybe a few hours long. 00:24:43.000 |
Tasks that an agent takes and that eventually I want to have a result for them. 00:24:48.000 |
So we think a lot about that and we're going to work to build primitives for that into MCP in the near future. 00:24:55.000 |
The second part of that is elicitation, so really MCP server authors being able to ask for input from the user. 00:25:02.000 |
And that is something that's going to land just about today or on Monday in the protocol. 00:25:09.000 |
We first and foremost are going to build an official registry to make sure there's a central place where you can find and publish to MCP servers 00:25:17.000 |
so that we can really have one common place where we're going to look for these servers, but also allow agents to dynamically download servers and install them and use them. 00:25:29.000 |
And then, of course, we're thinking more about multi-modality. 00:25:33.000 |
And that can be, for example, streaming of results. 00:25:36.000 |
But that can have other aspects that I just don't want to go into details yet. 00:25:43.000 |
On the ecosystem part, we're going and having a lot of more things to go that we're doing at the moment. 00:25:49.000 |
We're adding a Ruby SDK that is donated by Shopify in the next few weeks. 00:25:54.000 |
And the Google folks, the Google Go team, is currently building an official Go SDK for MCP. 00:26:01.000 |
And so I just hope that I was able to give you a bit of a more in-depth view of what you could do with MCP if you used the full power of the protocol. 00:26:10.000 |
And with that, I think I'm a bit low on time, so I can't ask questions. 00:26:16.000 |
But just grab me afterwards, and I can be happy to answer around the hallway any questions you might have.