How to Secure Agents using OAuth — Jared Hanson (Keycard, Passport.js)

00:00:18.500 |
that I consider one of the most important topics 00:00:27.520 |
I'm the co-founder of a new company called Keycard, 00:00:30.160 |
where we're building identity and access management platform 00:00:37.520 |
for any of the Node developers in the audience, 00:00:43.060 |
where I built a lot of their core identity infrastructure 00:00:51.800 |
about what's happening with LLMs and AI-powered applications. 00:00:56.080 |
We can bring these things into our daily lives 00:01:00.120 |
And simply put, agents that are more connected 00:01:03.540 |
So let's connect these agents to more systems. 00:01:23.520 |
which is we go get API keys that are typically long-lived 00:01:33.020 |
Now, if we continue this pattern for hundreds 00:01:36.580 |
we've got a pretty big security problem on our hands. 00:01:41.380 |
We know how to transition away from static secrets 00:01:48.480 |
how many people are familiar with OAuth in the crowd? 00:01:59.820 |
Like, OAuth is a relatively complicated protocol, 00:02:02.380 |
especially when you consider all the extensions. 00:02:04.280 |
But the principles behind it are fairly straightforward 00:02:25.580 |
and connected it to your Google Calendar API, 00:02:31.380 |
What's happening there is Calendly sends a request 00:02:41.920 |
that you're logged in, prompts you for consent 00:02:47.880 |
Google sends what's known as an access token over to Calendly. 00:02:56.480 |
There's a few other interesting bits going on here, 00:03:01.740 |
to be short-lived and rotated pretty quickly, 00:03:03.920 |
while still maintaining the authorized connection. 00:03:09.920 |
that involve user delegation authorization code flows. 00:03:13.220 |
And they typically happen via browser-based interfaces 00:03:15.920 |
that you've seen when you've used these types of applications. 00:03:19.520 |
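The authorization code flow described above comes down to two requests from the application: a browser redirect to the authorization server, then a back-channel exchange of the returned code for an access token. Here is a minimal sketch of both, assuming hypothetical endpoint, client ID, redirect URI, and scope values (none of them come from the talk):

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical values, for illustration only.
AUTHORIZE_ENDPOINT = "https://auth.example.com/authorize"
CLIENT_ID = "example-client-id"
REDIRECT_URI = "https://app.example.com/callback"


def build_authorization_url(scope: str, state: str) -> str:
    """Step 1: the URL the application sends the user's browser to."""
    params = {
        "response_type": "code",  # ask for an authorization code
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,  # echoed back so the app can match response to request
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


def build_token_request(code: str) -> dict:
    """Step 2: form parameters the application posts to the token endpoint."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
    }


state = secrets.token_urlsafe(16)
url = build_authorization_url("calendar.read", state)
query = parse_qs(urlparse(url).query)
token_request = build_token_request("code-from-redirect")
print(query["response_type"], token_request["grant_type"])
```

The login and consent screens happen between the two steps, entirely on the authorization server's side; the application only ever sees the code and then the tokens.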
Now, one thing that gets kind of confusing for people 00:03:22.580 |
is that OAuth is oftentimes used to implement things 00:03:26.480 |
like sign-in with Google or sign-in with Facebook. 00:03:30.180 |
And this is confusing because we refer to OAuth as an authorization 00:03:33.120 |
protocol or a delegated authorization protocol specifically. 00:03:35.960 |
So what's going on here when we use it for sign-in? 00:03:40.760 |
where the API gets replaced with a user info API 00:03:44.060 |
that just returns claims about the user who logged in. 00:03:46.460 |
So their ID, their name, their email address, et cetera. 00:03:49.660 |
And we kind of use authorization to back our way into authentication. 00:03:56.760 |
that people use with OAuth that it got formally standardized 00:03:59.620 |
as OpenID Connect, which is just an identity layer on top of OAuth 00:04:03.760 |
that standardizes the response format of that user info API. 00:04:08.020 |
It also does a couple of things that are kind of confusing, 00:04:10.020 |
like introduce more terminology, which identity people are prone to do. 00:04:13.820 |
We call the authorization server now an identity provider 00:04:19.520 |
And applications are known as relying parties. 00:04:32.800 |
which is a cryptographically signed statement about who the user is. 00:04:39.400 |
You can think of it as sort of an optimization 00:04:41.360 |
that the application can verify itself without making API requests. 00:04:45.400 |
It also serves some functions in like ongoing session management 00:04:48.720 |
between applications and authorization servers, 00:04:50.900 |
but that's kind of beyond the scope of introductory material here. 00:04:56.060 |
In the real world, these things get deployed together. 00:04:58.720 |
We'll typically run authorization and authentication flows in line 00:05:03.160 |
so that we know who the user is who logged in, as well as get access to things like their Google Calendar. 00:05:10.900 |
One thing to call out that is important here is that there's three roles in OAuth. 00:05:15.420 |
The client and the resource server, I think, are both relatively straightforward. 00:05:19.260 |
We understand that from client-server architectures. 00:05:21.260 |
The client requests resources, and the resource server responds with the data. 00:05:26.420 |
What gets different is that we introduce this authorization server in the middle that mediates this access, 00:05:34.420 |
It issues tokens back to the client, which holds them and then presents them to the resource server, 00:05:39.300 |
and the resource server's job is to verify those tokens. 00:05:42.420 |
Now, what's the benefit of this sort of model? 00:05:48.580 |
They don't have to care about anything to do with authentication anymore. 00:05:52.580 |
So, verifying user password or doing step-up authentication, running the consent flows. 00:05:56.760 |
They hand all that job off to the authorization server, and it gets kind of abstracted away by the token that the API can verify what has happened. 00:06:05.920 |
There's also some benefits that we can, like, centralize policy and then deploy ecosystems of apps and APIs, 00:06:12.820 |
all kind of protected by a central location, and build out the ecosystems that we all know today. 00:06:17.920 |
How do we apply this to MCP and agents in particular? 00:06:24.600 |
Now, our applications get replaced by a chatbot or agent like Claude that we want to connect to MCP servers. 00:06:33.760 |
The MCP clients and the MCP servers should get authorized via OAuth by, you know, the controlling authorization server in the middle. 00:06:44.360 |
Well, nothing with OAuth is ever so simple, so let's take a look at the state of authorization in MCP. 00:06:50.440 |
We're going to look at where it started, where it is now, and then where it's going in the future. 00:06:55.600 |
So, the first version of MCP, it's a pretty young protocol. 00:07:00.600 |
It's like seven months old to the day, I think. 00:07:02.600 |
The first version I like to call the no-auth version. 00:07:05.760 |
It didn't have any authorization in it at all, which they admitted in the spec. 00:07:09.760 |
It was really a way to get something out there, primarily for local MCP servers. 00:07:14.760 |
There was some notion of remote MCP servers, but, again, no authorization. 00:07:21.920 |
People saw the promise of MCP and started discussing how to add authorization to it. 00:07:26.920 |
Now we have the latest draft of the specification, which was published in late March. 00:07:33.920 |
I like to refer to this as OAuth the first attempt, and for anyone who has ever done OAuth implementations, 00:07:39.080 |
the first attempt is always pretty poor, and that is the case with this version of the specification of MCP. 00:07:46.080 |
I don't actually recommend anyone read the authorization part of the MCP specification as it is today because you'll walk away with a pretty misinformed view of what OAuth is. 00:07:55.080 |
But as a quick recap of what it does, it says, OK, MCP clients have got to implement the client side of OAuth. 00:08:04.320 |
And then it also says, MCP servers, you need to implement all of OAuth too, including authentication, token issuance, et cetera. 00:08:17.720 |
Well, it got collapsed into the MCP server, which is a bit odd. 00:08:23.840 |
So five days after the specification was released, a blog post went viral. 00:08:29.480 |
This one from Christian Posta saying, "The MCP authorization spec is a mess for the enterprise." 00:08:35.480 |
And he states, you know, "The problem here is that it treats the MCP server as both a resource server and authorization server." 00:08:42.640 |
Aaron Parecki, who does a lot of great OAuth standards work, followed this up with another blog post that went viral titled, 00:08:49.000 |
"Let's fix OAuth and MCP," where he noted that, you know, a bunch of the confusion that was happening 00:08:54.640 |
was because the diagram showed that the MCP server itself is handling authorization. 00:09:00.280 |
Now, then this kind of culminated in a PR to the specification where people proposed, "Let's fix this problem. 00:09:08.280 |
Let's just shift the MCP server to be an OAuth resource server and everything will be good." 00:09:17.920 |
It's not even the only PR there, but just kind of an example of how people just picked up on this problem and ran with it. 00:09:24.920 |
Now, I'm not usually one to say I told you so, but all the way back in January of this year, I left a comment in a review of the specification. 00:09:35.560 |
I was like, "Hey, I recommend we model MCP servers as resource servers from an OAuth perspective." 00:09:41.560 |
I'm not quite sure where that got lost. It didn't get picked up, but in any case, we fixed this problem, and one of the reasons I'm here is to tell us all more about OAuth things that we need to pay attention to in order to avoid this problem in the future. 00:09:54.560 |
So, okay, the next attempt. In draft, all this feedback has been incorporated, and the MCP spec is kind of like fixing its issues, and the draft version of the specification models all of OAuth pretty cleanly and pretty nicely. 00:10:10.200 |
The OAuth authorization server is a totally separate entity, and this is really beneficial for all of you building MCP servers because your job gets a whole lot easier. 00:10:19.520 |
All you have to do is verify the tokens that come in over HTTP and hand off all the other responsibility to the OAuth server. 00:10:28.320 |
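To make "verify the tokens that come in over HTTP" concrete, here is a minimal sketch of the resource-server side. It assumes the access token is a JWT signed with a shared HMAC key (HS256) so that it runs on the standard library alone; real deployments more commonly verify asymmetric signatures against the authorization server's published keys. The key, audience, and function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(claims: dict, key: bytes) -> str:
    """Stand-in for the authorization server, so the sketch is self-contained."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(key, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(signature)}"


def verify_bearer(authorization: str, key: bytes, audience: str,
                  now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the Authorization header carries a valid token, else None."""
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer":
        return None
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        claims = json.loads(b64url_decode(payload_b64))
        signature = b64url_decode(signature_b64)
    except ValueError:
        return None  # wrong number of segments, bad base64, or bad JSON
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    if header.get("alg") != "HS256":
        return None  # only accept the algorithm this server expects
    expected = hmac.new(key, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= current:
        return None  # missing or expired
    if claims.get("aud") != audience:  # simplification: a single string audience
        return None
    return claims


KEY = b"hypothetical-shared-key"
AUDIENCE = "https://mcp.example.com"
claims = {"sub": "user-1", "aud": AUDIENCE, "exp": int(time.time()) + 300}
good_token = issue_token(claims, KEY)
forged_token = issue_token(claims, b"some-other-key")
print(verify_bearer("Bearer " + good_token, KEY, AUDIENCE) is not None)
```

Nothing here involves passwords, consent screens, or token issuance, which is the point being made: those all stay with the authorization server.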
So, we're back to a pretty good place with respect to OAuth and MCP, and in particular how we authorize connections between MCP clients and MCP servers. 00:10:39.160 |
So, let's talk about the future. If this is all we do with OAuth, we're not even scratching the surface of what we need in order to fully secure AI and AI interactions. 00:10:48.840 |
So, what else are we going to need? We're going to burn through this here pretty quick. The first is agent-to-agent communication. 00:10:56.960 |
So, what we've seen with OAuth so far as it's applied to MCP, like I said, that's referred to as the authorization code flow, and it's particularly relevant for when we want to do end-user delegation. 00:11:06.640 |
But there's a whole bunch of other flows in OAuth that are relevant, in particular client credentials, and this applies when we want agents to communicate with other agents or other MCP servers on their own behalf, not on behalf of a user. 00:11:19.760 |
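For contrast with the user-delegation flow, a client credentials request carries no user at all: the agent authenticates as itself and asks for a token. Here is a rough sketch of the token request, assuming a hypothetical client ID, secret, and scope, and assuming they contain no characters that would need extra encoding inside the Basic credentials:

```python
import base64
from urllib.parse import urlencode, parse_qs


def build_client_credentials_request(client_id: str, client_secret: str, scope: str):
    """Return (headers, body) for a POST to the token endpoint."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        # The agent authenticates itself; there is no user or browser involved.
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urlencode({"grant_type": "client_credentials", "scope": scope})
    return headers, body


# Hypothetical agent credentials and scope.
headers, body = build_client_credentials_request("agent-123", "s3cret", "tools.read")
print(body)
```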
So, this is one thing to pay attention to. The next, this kind of begs the question, agent identity. What should we do about this? 00:11:28.480 |
Well, if anyone's ever done OAuth development, you're probably familiar with this type of flow, is you want to build an application, you want to integrate with an API, you go to some developer portal, create a new application, get a client ID and secret, and then somehow configure your application with those credentials. 00:11:45.440 |
So, this is a bunch of friction. This obviously won't apply well to MCP, which is trying to be a standard protocol, and you want to bring tools and agents together that may not be aware of each other. 00:11:57.160 |
You can't do this if you presuppose some sort of registration process. So, what does MCP do? Well, it picks up what is known as dynamic client registration. 00:12:08.160 |
So, what this does is allows applications and agents to request credentials at runtime rather than, like, ahead of time in manual registration. 00:12:15.880 |
So, an agent says, hey, like, this is who I am. Give me a client ID and secret. The server does it, and the agent goes about the rest of its OAuth flow. 00:12:23.880 |
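As a sketch of that exchange, the agent posts a JSON document describing itself and gets credentials back. The metadata field names below follow the OAuth dynamic client registration specification; the in-memory `register_client` function is a hypothetical stand-in for a real registration endpoint, and the agent name and redirect URI are made up.

```python
import secrets


def build_registration_request(client_name: str, redirect_uris: list) -> dict:
    """Metadata the agent sends at runtime to describe itself."""
    return {
        "client_name": client_name,
        "redirect_uris": redirect_uris,
        "grant_types": ["authorization_code", "refresh_token"],
        "response_types": ["code"],
        "token_endpoint_auth_method": "client_secret_basic",
    }


def register_client(metadata: dict, registry: dict) -> dict:
    """Toy registration endpoint: mint credentials and remember the client."""
    client_id = secrets.token_urlsafe(16)
    client_secret = secrets.token_urlsafe(32)
    registry[client_id] = {**metadata, "client_secret": client_secret}
    return {**metadata, "client_id": client_id, "client_secret": client_secret}


registry = {}
request = build_registration_request("Example Agent", ["http://localhost:8765/callback"])
response = register_client(request, registry)
print(sorted(response))
```

Note that `register_client` accepts whatever metadata it is given and has to store the result, one entry per caller.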
Now, this specification has been around for about 10 years, and in practice has seen, like, no meaningful adoption, and one of the implications behind this is it, like, makes all agents anonymous, 00:12:36.600 |
because the registration request itself is uncredentialed. This makes it hard to build trust in agents. It's probably not super viable, in my opinion. 00:12:45.600 |
So, what should we be looking at instead? Well, there's many cases where we just want to use public clients that we don't really care about verifying their identity. 00:12:55.320 |
In this case, there's an emerging specification called pushed client registration, which introduces this kind of, like, well-known string to identify a, like, public client. 00:13:04.320 |
We can just use this well-known string, and we skip the whole registration song and dance and then the need to store the resulting state. 00:13:10.800 |
So, this is a lot simpler. It also has the capability to carry certain client metadata in the request, if that's necessary. 00:13:18.440 |
So, this is something we should look into for cases where public clients apply. 00:13:24.360 |
But what about clients that we actually want to authenticate and verify their identity? 00:13:28.320 |
Well, my proposal here is that we should start looking at using URLs and PKI for identity. 00:13:36.120 |
This lets us reuse the existing identifiers that people already associate with the apps they're using, 00:13:43.600 |
This looks like, in practice, we'd have a URL, such as, you know, agent.com, to be used as a client identity in OAuth flows. 00:13:51.080 |
And then, through the magic of cryptography and key sets, we can authenticate these agents by having them sign JWT assertions or HTTP message signatures 00:14:01.000 |
that we can then verify with the corresponding public keys. 00:14:04.160 |
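Here is a rough sketch of what a signed assertion with a URL as the client identity could look like. The claim layout and the `client_assertion_type` value follow the existing JWT client authentication profile for OAuth; the URLs, key ID, and `ES256` algorithm label are hypothetical, and the signature is deliberately a placeholder because the standard library has no asymmetric signing. A real agent would pass in a function backed by its private key.

```python
import base64
import json
import time
import uuid
from typing import Callable, Optional


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_client_assertion(client_id_url: str, token_endpoint: str, key_id: str,
                           sign: Callable[[bytes], bytes],
                           now: Optional[float] = None) -> str:
    """Build a short-lived JWT in which the agent asserts its own identity."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "ES256", "typ": "JWT", "kid": key_id}
    claims = {
        "iss": client_id_url,  # the agent's URL is its client identifier
        "sub": client_id_url,
        "aud": token_endpoint,
        "iat": issued_at,
        "exp": issued_at + 60,
        "jti": str(uuid.uuid4()),  # unique ID so the assertion cannot be replayed
    }
    signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(claims).encode())
    return signing_input + "." + b64url(sign(signing_input.encode("ascii")))


def build_token_request(client_id_url: str, assertion: str) -> dict:
    """Token request that authenticates with the assertion instead of a secret."""
    return {
        "grant_type": "client_credentials",
        "client_id": client_id_url,
        "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
        "client_assertion": assertion,
    }


def decode_claims(assertion: str) -> dict:
    payload = assertion.split(".")[1]
    return json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))


def placeholder_sign(data: bytes) -> bytes:
    # NOT a real signature: stands in for signing with the agent's private key.
    return b"unsigned-placeholder"


CLIENT_ID_URL = "https://agent.example.com"  # hypothetical agent URL
TOKEN_ENDPOINT = "https://auth.example.com/token"  # hypothetical
assertion = build_client_assertion(CLIENT_ID_URL, TOKEN_ENDPOINT, "key-1", placeholder_sign)
token_request = build_token_request(CLIENT_ID_URL, assertion)
assertion_claims = decode_claims(assertion)
print(assertion_claims["iss"])
```

The authorization server would look up the public keys associated with the URL and check the signature, so no shared secret ever has to be provisioned ahead of time.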
All right. This dovetails into agent attestation. 00:14:10.720 |
We've connected our agents to the resources that we're using, but then that agent turns around and sends all that information up to an LLM. 00:14:17.000 |
This seems like something we should probably have some awareness of and control over. 00:14:20.040 |
So, in kind of protected environments, we can sort of get by, like treating the LLM as just another API, which often it is. 00:14:27.880 |
And this is a technique we could apply, but it has limited capabilities when we look at, like, edge deployed agents, such as on the desktop or mobile devices where we don't really control their software environment. 00:14:40.200 |
So, there's a bunch of interesting work going on in the IETF now with respect to, like, remote attestation and supply chain security where we can start to attest to the state of the device and the software running on it and know what LLMs our data is going to wind up in, and then incorporate that into OAuth authorization flows. 00:14:57.200 |
Next up, transactional authorization. What we've done to date in OAuth is introduce scopes. This is a whole lot better than passwords, which OAuth kind of replaced back in the day, in the sense that now we can do more fine-grained permissions, such as, like, read versus write access. 00:15:16.040 |
But in practice, these end up being a little bit too coarse-grained for a lot of use cases, and oftentimes a little bit longer lived than we might like. 00:15:24.040 |
In agent interactions, we're going to have to be increasingly transactional. So, imagine use cases where you want agents to do financial transactions or commercial transactions. 00:15:35.040 |
We're going to want to authorize things on a transaction basis, potentially with specific amounts or financial budgets. So, we're going to have to look at moving to more dynamic access in this respect. 00:15:47.040 |
There's a proposal that's actually like a specification at this point called rich authorization requests, which is worth looking into, and something that we can either take inspiration from or adopt directly for these use cases. 00:16:00.040 |
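To illustrate the difference from a coarse scope string, here is a minimal sketch of an authorization request that carries structured, per-transaction detail. The `authorization_details` parameter and its required `type` field come from the rich authorization requests specification; the endpoint, client values, the `payment` type, and the `amount` field are hypothetical.

```python
import json
from urllib.parse import urlencode, urlparse, parse_qs


def build_rar_authorization_url(authorize_endpoint: str, client_id: str,
                                redirect_uri: str, authorization_details: list) -> str:
    """Authorization request with a JSON array of fine-grained permissions."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        # Sent as a JSON array in place of (or alongside) a scope string.
        "authorization_details": json.dumps(authorization_details),
    }
    return f"{authorize_endpoint}?{urlencode(params)}"


payment = {
    "type": "payment",  # hypothetical type identifier
    "actions": ["initiate"],
    "locations": ["https://payments.example.com/transfers"],
    "amount": {"currency": "USD", "value": "25.00"},  # hypothetical type-specific field
}

url = build_rar_authorization_url(
    "https://auth.example.com/authorize",  # hypothetical endpoint
    "example-agent",
    "https://agent.example.com/callback",
    [payment],
)
details = json.loads(parse_qs(urlparse(url).query)["authorization_details"][0])
print(details[0]["amount"])
```

The user would be asked to approve this one transfer with this one amount, rather than granting a standing "write" permission.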
Next up, we have chain of custody. This is particularly interesting to me. What we talk about with MCP really covers the first leg of this. On the left-hand side, we have authorized connections between agents and MCP servers. 00:16:15.040 |
But what happens on the right side is completely unspecified in terms of, like, the security profile. So, how do we protect an MCP server that calls another API within the same domain in particular? 00:16:27.040 |
There's a technique called OAuth token exchange that I recommend everyone look into. A special case of this is MCP servers to third-party APIs. 00:16:37.040 |
In this case, we should look into identity chaining across domains and its corresponding specification, the identity assertion grant, which lets us do cross-domain authorization in the backend. 00:16:48.040 |
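For the same-domain case, a token exchange request looks roughly like the sketch below: the MCP server presents the token it received and asks the authorization server for a new one aimed at the downstream API. The grant type and token type URNs come from the OAuth token exchange specification; the token value, audience, and scope are hypothetical.

```python
from urllib.parse import urlencode, parse_qs

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_token_exchange_request(subject_token: str, audience: str, scope: str) -> str:
    """Form body the MCP server posts to the token endpoint."""
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,  # the token the MCP server received
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,  # the downstream API the new token is for
        "scope": scope,
    })


body = build_token_exchange_request(
    "incoming-access-token",  # hypothetical token value
    "https://billing.example.com",  # hypothetical downstream API
    "invoices.read",
)
fields = parse_qs(body)
print(fields["grant_type"][0])
```

Because the downstream API receives a token issued for it specifically, the MCP server never has to forward the caller's original token.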
Somewhat outside the scope of OAuth is other internal infrastructure that people should be aware of as they look to deploy these agents. 00:16:55.040 |
And then the culmination of this is really agent-to-agent flows, where I don't know how much of this is happening in practice today, but people see the promise of it. 00:17:03.040 |
Imagine big graphs of agents talking to other agents on other servers. We're going to need end-to-end visibility as the authorization flows along these graphs. 00:17:11.040 |
Finally, async interaction. I think one of the key things to look at here is, like, OAuth typically assumes a user is sitting in front of a browser and relatively static. 00:17:21.040 |
But as we kick off flows, users might walk away and agents do work in the background. They're going to need a way to reach out to the user and say, hey, I need a bit more access than I've been permissioned. 00:17:31.040 |
How do we think about bringing more real-time interactions via channels like SMS or push notifications rather than just browser-based flows? 00:17:39.040 |
And then a hot topic today: there's a bunch of interesting work going on in the voice track at the conference. 00:17:46.040 |
As AI starts to interact with us via voice and video or completely in the background, how do we think about security in those respects? 00:17:54.040 |
This is really the frontier of security and interaction, but there's a lot of prior art in various real-time communities around SIP, XMPP, WebRTC that I think is very interesting for us to all look at. 00:18:08.040 |
So, there's a lot here. Let's go build this stuff. It's all important for us to achieve a safe and secure AI future. 00:18:17.040 |
This is what we're building at Keycard. We're building an identity and access management platform that lets you connect your copilots, custom agents, and third-party agents to all your apps, services, and infrastructure, 00:18:28.040 |
all using standards-compliant protocols, A2A, MCP, and OAuth. 00:18:32.040 |
If building this stuff is interesting to you, we are hiring, so get in touch with me. 00:18:37.040 |
And if it's not interesting to you, but you know you want to secure your agents, get in touch with us, too. 00:18:42.040 |
We're looking for partners that are building so that we can work with you to secure your agents. 00:18:47.040 |
The website is keycard.ai, and I will be around the rest of the conference. Thanks.