back to indexRemote MCPs: What we learned from shipping — John Welsh, Anthropic

00:00:00.000 |
awesome thanks so much for coming I wanted to give a bit of a talk on implementing MCP clients 00:00:23.460 |
and talking about MCP at scale within a large organization like Anthropic I wanted to give 00:00:30.420 |
first a little introduction for me my name is John I've spent 20 years building large-scale systems 00:00:35.820 |
and dealing with problems that that causes and so I've made a lot of mistakes and I'm excited to 00:00:42.540 |
give maybe some thoughts on avoiding some of those mistakes I'm currently a member of technical staff 00:00:47.520 |
here at Anthropic and I've spent the past few months focusing on tool calling and integration and 00:00:53.520 |
implementing MCP support for all of our internal like external integrations within the org and so 00:01:00.180 |
looking at tool integration with models we've kind of hit this timeline where models only really got 00:01:07.200 |
good at calling tools like kind of late mid last year and suddenly everyone got very excited because 00:01:16.740 |
like your model could go and call your Google Drive and then it could call your maps and then it could 00:01:21.240 |
send a text message to people and so there's a huge explosion with like very little effort you can make 00:01:26.160 |
very cool things and so teams are all trying to move fast everyone's moving very fast in AI custom endpoints 00:01:32.280 |
and then start proliferating for every use case there's a lot of like services popping up with like 00:01:36.840 |
slash call tool and slash like get context and then people start to realize the digital needs of 00:01:43.920 |
some authentication there's a bunch of stuff there and this kind of led to some integration chaos where 00:01:49.380 |
you're duplicating a bunch of functionality around your org nothing really works the same you have an 00:01:55.800 |
integration that works really well in service a but then you want to use it in service b but it you can't 00:02:01.020 |
because it's going to take you three weeks to rewrite it to talk to the new interface and so we're in this kind of 00:02:08.040 |
the place that we came to at anthropic is realizing that over time all of these endpoints started to look a lot like mcp 00:02:15.020 |
you you end up with some get tools some get resources some 00:02:25.140 |
even if you're not using the entire feature space of mcp as a whole immediately like 00:02:30.200 |
like you're probably going to go extend into something that kind of looks like it over time and 00:02:35.480 |
when i'm talking about mcp here there's kind of two sides to mcp that in my mind feel a bit unrelated there's this json rpc specification which is really valuable as engineers it's like a standard way of sending messages and communicating back and forth between providers of context for your models and the code that's interacting with the models and 00:02:56.760 |
uh getting those messages right and getting those messages right is the topic of huge debate on like the mcp 00:03:02.760 |
uh repos if you're involved with any standardization process ever you know how those conversations end up going and then on the other side there's this global transport standard which is the stuff around streamable hdp 00:03:19.320 |
global transport standard is hard because you're trying to get everyone to speak the same language and so 00:03:23.400 |
it's really nitty but there's not a lot of like 00:03:26.120 |
most of the juice of mcp is in this the message specification in the way that the 00:03:33.320 |
um and so we started asking ourselves like can we just mcp for everything and we said yes with the 00:03:39.240 |
caveat that yes is for everything involved in presenting model context to models um we have this format where 00:03:47.320 |
your client is sending these messages something's responding with these messages um where that stream is 00:03:54.280 |
going it really doesn't matter it can be on the same process it can be another data center it can be through a 00:04:00.760 |
giant pile of uh enterprise networking stuff um it doesn't really care at the point that your 00:04:08.200 |
code is interacting with it you're just calling a connect to mcp and you have a a set of a set of 00:04:13.640 |
tools and methods that you can call so uh standardizing on that seemed useful um standard why standardize anything 00:04:22.760 |
internally um being boring on stuff like this is good it's not a competitive advantage to be really 00:04:30.600 |
good at making google drive talk to your app it's just a thing that you need to do it's not your 00:04:36.280 |
differentiator uh having a single approach to learn as engineers makes things faster you can spend 00:04:43.640 |
your cycles working on interesting problems instead of trying to figure out how to plumb uh integration 00:04:49.080 |
and uh if you're using the same thing everywhere then like each new integration might clean up the 00:04:53.000 |
field a bit for the next person who comes along uh it's it's overall a good thing in cases like this 00:04:59.000 |
where we're we're not really doing interesting we're plumbing context between integrations and things 00:05:03.800 |
that are consuming the integrations uh why standardize on mcp internally um i and this is where i might 00:05:11.400 |
make an argument to everyone that there's already ecosystem demand you have to implement mcp because 00:05:16.280 |
everyone's implementing mcp so why do two things um it's becoming an industry standard there's a large 00:05:23.400 |
coalition of engineers and organizations that are all involved in building out the standard uh all of 00:05:28.680 |
the major ai labs are represented in that so you you know that as new model capabilities start to be 00:05:35.240 |
developed those patterns will be added to the protocol because all the labs want you to use their features 00:05:41.480 |
so i think the standardizing on mcp internally for this type of context is a it's a good bet and one of the 00:05:47.560 |
things you get with mcp is that it solves problems that you haven't actually run into yet like there's a bunch of stuff in the protocol 00:05:53.240 |
that exists because there's a problem and a need and having those solutions at hand when you run into 00:05:59.640 |
them is really important so sampling an example of where this might be valuable in your company 00:06:03.320 |
you might have four products that have four different billing models uh for reasons because you're building 00:06:09.000 |
fast um you might have a bunch of different token limits you might have different ways of tracking 00:06:13.400 |
usage this is really painful because you want to write one integration service to connect your slides and 00:06:21.080 |
how do you go and like hook the billing and the tokens up correctly and mcp has uh already has sampling 00:06:27.720 |
primitives so you can build your integration you can just be like okay your integration sends a sampling 00:06:32.680 |
request over the stream uh the other end of the pipe fulfills that request you can go and hook it in 00:06:37.400 |
everything works great and so this is a thing that uh a shape problem that might take you a bunch of effort 00:06:43.480 |
internally without this but you already have the answer kind of gift wrap for you in the protocol 00:06:49.880 |
and so at anthropic we're running into some requirements converging we're starting to see 00:06:53.720 |
external remote mcp services popping up like mcpu.asana.com which is really cool we wanted to be able 00:06:59.720 |
to talk to those talking to those is complex because you need external network connectivity you need 00:07:05.240 |
authentication uh there's a proliferation of internal agents people have started building 00:07:13.160 |
pr review bots and like slack management things and just lots of people have lots of ideas no one's really 00:07:19.880 |
sure what's going to hit so we're having a huge explosion of llm backed services internally uh with 00:07:25.240 |
that explosion there's a bunch of security concerns where uh you don't really want all of those services to be 00:07:32.680 |
going and accessing user credentials uh because that ends up being and being kind of a nightmare you don't want 00:07:39.560 |
uh outbound external network connectivity to be everywhere um auditing becomes really complex uh 00:07:45.640 |
and so we are looking at this problem we wanted to be able to build our integrations once and use 00:07:51.720 |
them anywhere and so a model i was introduced to by a mentor of mine and a while ago is the pit of success 00:07:59.480 |
which is the idea that um if you make the right thing to do the easiest thing to do then everyone 00:08:09.160 |
in your org kind of falls into it and so uh we designed a service which is just a piece of shared 00:08:15.880 |
infrastructure called the mcp gateway that provides a single point of entry and provided engineers just 00:08:20.680 |
with a connect to mcp call that returns a mcp sdk client session on the end and we're trying to make 00:08:27.160 |
that as simple as possible uh because that way people will use it if it's the easiest thing to do 00:08:35.000 |
um we use url based routing to route to external servers internal servers it doesn't matter it's all 00:08:39.720 |
the same call uh we handle all the credential management automatically because you don't want to be 00:08:44.200 |
implementing oauth five times in your company uh it gives you a centralized place for rate limiting and 00:08:49.880 |
observability uh i have an obligatory diagram here of a bunch of lines going in and out but uh 00:08:55.960 |
uh here's a gateway in the middle this is kind of the thing just one more box will solve all our 00:09:00.760 |
problems uh can i go next uh where is my yeah uh so the uh the code that we have here we just made 00:09:13.320 |
some client libraries we just mcp gateway connect to mcp uh we pass in a url an org id account id this is 00:09:20.200 |
like a bit simplified we actually pass a signed token to authenticate because it's accessing 00:09:24.200 |
credentials but this is the basic idea and then importantly this call returns an mcp sdk object 00:09:31.480 |
which means that when you feature get added to the protocol you just update your mcp packages 00:09:36.360 |
internally you get those features across the board everything works great the same code seamlessly 00:09:41.160 |
connects to internal external integrations when it comes to transports uh and this is a bit high level on 00:09:48.200 |
hand wave because everyone's setup is different um internally within your network it really doesn't 00:09:53.800 |
matter you can do anything you want we've got the standardized transport for connecting to external 00:09:57.720 |
mcp servers um but really just picking the best thing for your org so we went and picked 00:10:04.600 |
websockets for our internal transport and here's just a quick code example it's nothing special we just 00:10:13.640 |
have a web socket uh that's being opened we are sending these json rpc blobs back and forth over 00:10:18.680 |
the web socket and then if i can make this scroll down at the end we just pipe those uh read streams 00:10:29.000 |
and write streams into an mcp sdk client session and we're good to go we've got mcp going um you might 00:10:35.560 |
want to do this with grpc because you want to wrap these in some multiplexed transport so you don't have to 00:10:39.880 |
open one rub socket per connection that's pretty simple also uh we have read stream right stream at 00:10:45.800 |
the end uh starting to see a pattern here you can do like unix socket transport if you want you can just 00:10:51.000 |
have uh messages be passed that way read stream right stream at the end mcp works great um i threw in an 00:10:56.920 |
enterprise grade email transport implementation over imap um which is pretty much the same thing you just go 00:11:03.240 |
through here is our server we're sending emails back and forth uh dear server i hope this finds you 00:11:09.800 |
well uh mcp request start and then we pipe those into a client session at the end and so it truly 00:11:15.720 |
doesn't matter like whatever it takes inside your organization is great we set up this unified 00:11:21.800 |
authentication model where we're handling oauth at the gateway which means that consumers don't have to 00:11:28.600 |
worry about all that uh all that complexity in their apps we added a get oauth authorization url function 00:11:36.840 |
and a complete oauth flow because you might have different endpoints at anthropic we have api.anthropic.com 00:11:42.120 |
and we have claw.ai and we might want those redirects to go back to different places 00:11:45.640 |
but uh this is tied on the gateway it's really easy to start a new authentication a real advantage of having 00:11:51.880 |
this put on your gateway is that the credentials are portable if you have a batch job that you're kicking 00:11:57.480 |
off um your users don't have to re-authenticate to that you're just calling the same mcp with your 00:12:02.600 |
internal user id and they get everything added correctly you're also internal servers don't have 00:12:07.240 |
to worry about your tokens um so your request comes in internally for us we're hitting a websocket 00:12:14.600 |
connection to mcp gateway uh with the auth token provided as headers to that uh that gateway receives 00:12:21.400 |
your stored credentials you create an authenticated sdk client you just pass in the bearer token to the 00:12:28.360 |
auth header uh and then you're good to go the mcp client receives a read stream and a write stream and so 00:12:35.320 |
you just plumb those reach from the right stream into your internal transport and you're and you're you're 00:12:39.800 |
good to go uh one of the things that this gives for your org that's not immediately obvious but is 00:12:46.040 |
really valuable is a central place for all of your context that your models are asking for and all the 00:12:53.160 |
context is flowing into your models for your org there's some papers written on mcp prompt injection 00:13:01.000 |
attacks there is a general risk of uh models going and having access to google drive and deleting 00:13:07.960 |
everything in google drive there's some need of uh enforcing policy you might want to be able to 00:13:13.720 |
like ban malicious servers um do some content classification on the request see what's coming 00:13:21.560 |
in kind of given audit and the really nice thing about this is that because it's mcp all of your 00:13:26.680 |
messages are in a standardized format so it's really easy to hook into that stream and be like okay 00:13:30.760 |
here is my tool execution message processor here is my tool definition thing or here's my resource 00:13:36.360 |
management and so the payoff that you get from this is um adding mcp support to new services is as 00:13:43.400 |
simple as possible you just go and import a package it doesn't matter what language you're in we've got 00:13:49.000 |
multiple languages internally they all have their own kind of packages engineers can focus on building 00:13:53.880 |
features and not plumbing uh you have the operational simplicity of having a single point of ingress egress 00:13:59.400 |
and standardized message formats and you get future features for free as the protocol involves you get 00:14:05.080 |
all of that work uh naturally and so i just wanted to go through some takeaways from this that i want to 00:14:14.680 |
put to put to you is that mcp is really just json streams and how you pipe those streams around your 00:14:19.880 |
infrastructure is a small implementation detail it's a couple lines of code hook the stream into the 00:14:24.520 |
client sdk that makes the messages uh the you should standardize on something anything i think mcp is a good 00:14:33.480 |
idea if you don't think it's a good idea like just pick something um your future self will thank you 00:14:37.640 |
build some pits of success you really want to make the right way to do a thing the easiest way to do a 00:14:43.480 |
thing and then everyone just falls into doing the right thing naturally and also centralizing at the 00:14:48.440 |
correct layer so solving some shared problems off and external connectivity once allows you to spend 00:14:54.120 |
your time working on uh more interesting problems that are more valuable to you and your your business 00:14:59.320 |
your business um thanks that's all i got for you uh thank you so much for coming out