MCP Is Not Good Yet — David Cramer, Sentry

Welcome everybody. This is a little bit last minute, so bear with me. If you don't know me, I'm David Cramer, and I started Sentry a long time ago. I'm sort of an engineer, sort of an executive, sort of a founder. I would like to think I have rational opinions, so that's mostly what this is.

I don't think you're going to learn anything here. Maybe you will. I don't know. I personally think this is not that complicated. It's just big, scary words. So if you do, great. If you don't, maybe you walk away and you're like, "Yeah, I thought that's what it was. We've done it." Mostly, I was asked a couple days ago while I snuck my way into this conference if I could fill a slot.

And so filling the slot was like, "Oh, come give some hot takes. Maybe spice it up a little bit." So that's what we're going to do. It's not going to be too much of a rant. If you know me, I like to rant, but we'll dial back for this one a little bit.

So what is an MCP? I got to say, this is one of the wildest phenomenons. It's like the new crypto wave or something. Everybody's like, "Yeah, MCP. We don't know what it is, but we're here for it." And you find a lot of these sort of opinions around how it should be, how it shouldn't be.

And what I often find is people who have these opinions have not built anything, or at least not built the thing they're talking about. I built Sentry's MCP server mostly as a fun project. So take this for what it is. It's also Sentry's MCP server. These are biased opinions towards what Sentry is.

If you're not familiar with Sentry, you probably should be, but we do application monitoring. We do a bunch of stuff. If you have bugs on the internet, they probably go to us. And so it's in context of a B2B, a SaaS business. A lot of you probably work at enterprise companies, if you will.

So think about it that way. But the way we think about MCP is it is a pluggable architecture for agents. Full stop. That's it. Pretty simple to reason about. And again, all of this is contextualized in an enterprise cloud service kind of way. There's a lot of other variations of how you might adapt MCP.

There's tool chains that make sense locally. We're talking about we run cloud services. That's most of the industry. We're B2B. We're enterprise. I think a lot of this actually still applies. But take that with a grain of salt. So how we think about MCP with Sentry, particularly because this, again, relevant here, we fix bugs.

There are things like Cursor where you also fix bugs. What if we could all fix bugs together? And so everything is contextualized in that. And I think there's this whole thing of, like, how do we be relevant? That's, like, the name of the game for every single company in the world right now.

It's like, oh, how do we become an AI company? We, too, are now an AI company. But Sentry has a lot of bugs. I fix those in my editor. It would be cool if the bugs could be inside my editor sometimes. That's a great example of where maybe an MCP is useful, but at the very least, we're going to pretend it's useful.

So that's the context here. But it all comes back to, like, probably the reason everybody's here is, like, how do I become relevant? I've got an AI mandate. I've got infinite money to spend all of a sudden for some reason that didn't exist yesterday. How do we get involved?

Okay? So everybody, probably same stage. I know how this works. So. All right. We built this a few months ago. We are not first to market with an MCP. And the reason why is because there's two interfaces for MCPs. I'm going to focus on a remote interface, but there's also the standard I/O.

You probably learned about that or know something about that. I don't think standard I/O is super useful for businesses like ours. I'll talk about that. But sort of the analogy of why MCP is useful, and this is VS Code Insiders, which you just heard from Harold, but, like, they do a pretty good job.

They're the only ones with OAuth support that's, like, useful today. Cursor promised me end of week. I don't know how to hold them to that. But it works pretty well. You plug in Sentry's MCP. You can look up data from Sentry and a bunch of curated workflows. You can maybe fix some bugs, maybe easier than it was before, or at least more fun than it was before.

And for the sake of this, I needed a screen grab. So last night, I'm, like, literally last night, I'm working on these slides, and I go into VS Code. I'm, like, I'm just going to plug it in. I don't have time to futz around if the thing's going to break.

And so I use VS Code. And I'm, like, okay, I'll just do a thing where it's, like, fix all my bugs for me. And then immediately it does, like, 20 API queries to Sentry. Probably cost me, like, five bucks to run this thing. But it did start fixing some bugs.

I don't know if the fixes were good, mind you. They're probably garbage. But it does the thing, right? It's, like, it brought context into the editor, which is what we want. And that context was provided by somebody else, Sentry in this case. So that is, like, one of the interesting things.

It's one of the interesting things we think about and why MCP is, like, valuable to sort of a traditional, I don't know, we're kind of an enterprise company. But, like, every company in the world. And it's part of why we're all hopping on it. It's pretty accessible.

And that's what I'm going to talk about. It is actually super accessible. So this is you. This was me. And this is why I have opinions about it now. It's like, oh, it's just an API. It plugs in. We've got an API. We've got some OAuth going on. You know, we had our own OAuth provider.

You know, a lot of you might use something like WorkOS or, I don't know, pick one of these authentication services that just gives it to you out of the box. If you have that, you're pretty much ready to go, which is pretty cool. It's actually, like, a pretty low boilerplate implementation.

But then you quickly learn that it's actually not that easy. And so first you kind of go into this OAuth, like, dance. And you're like, oh, okay. Like, yeah, we're going to do this. But it needs OAuth 2.1. And nobody in the world supports this thing. Like, it's like, I don't know how old it is, but I had never heard of it before MCP.

And so there's a little bit of complexity there. But you're like, okay, it's almost there. It's OAuth. We've got that. We can plug it into our API. You kind of get it working. In our case, we use a Cloudflare shim, which basically lets us proxy our OAuth 2 API on top of Cloudflare Workers, which has a 2.1 client registration thing.

And I don't know if anybody's talked about that. TL;DR, it's complicated. But it's not that complicated. This was built in a couple days, mind you. And I'm also an executive at the company. So it's like, yeah, if I can do it, everybody can do it. But you go through the OAuth flow, and then you're like, cool, but the robots don't actually know how to reason about giant JSON payloads that were not built for them.

And this is actually where I think a lot of people break down. There was like a big conversation. This is sort of one of my first opinions, if you will, what I might call common sense, is that MCP is not a thing that just sits on top of OpenAPI.

Like you cannot just be like, I got an API. I'm going to expose all those endpoints as tools. You're going to get the worst results you can possibly imagine. You're going to be like, oh, this doesn't make any sense. You have to massage everything. You have to design around the system.

But like generally speaking, and I'll talk a little bit about this, like you need to really think about how would you use an agent today? How do the models react to what you do when you provide them context, which is what this really is for, and design a system around that.

So it might leverage your API. It is not your API. And then you get past that, and you're wired up to things like Cursor and VS Code, and you're like, why is this breaking all the time? You can't solve for that one. It's just, you got to wait for everybody to catch up.

They're almost there. You know, a handful of clients support native authentication now. They're kind of stable. To VS Code's credit, it hasn't broken much recently. Cursor's broken quite a lot on me, but they're both great. Don't get me wrong. Claude has support. Claude Code has sort of support, but not really.

So I guess it might work, it might not. I think particularly in the developer ecosystem, we're much more ahead of the curve. And so if you're trying to adapt your services to third-party agents that are in our ecosystem, like these editors, you've probably got a good shot of it working tomorrow.

If, I don't know, it's Salesforce or something? I have no idea. So you're kind of beholden to the clients and the implementation. Because again, it's a plug-in architecture for agents. There's a lot of other use cases that are not just third parties, but that's kind of the focus. So I'm going to try to be constructive from here.

Let's see, we've got nine minutes. Just a few learnings. And I'm happy to talk more about this later. I'll be around. Somebody in this room is probably going to disagree with this, but you should only care about OAuth if you're a B2B SaaS company like me. And particularly you care about OAuth with remote environments for the most part.

If you're like, how do I integrate my services into various agents? I want bugs to exist in Cursor. I want to run a cloud service, and I want to run a cloud service for the exact same reason I've always wanted to run a cloud service. Because I can iterate on it.

I can ship fast. I can dial in security. All the advantages, it turns out, are exactly the same, because technology has not changed. And so, if I were you, and you're not building something hyper-specific that is like a local device-centric thing, just focus on the remote MCP server, focus on the OAuth specification, and just don't worry about it.

The problems will solve themselves. Security will solve itself. Because there's a whole world of security problems. And the standard IO interface is filled with most of them. I'm not going to talk about that. I'm sure there's some other talks here about prompt injection, but it is like very, very, very scary.

Do not allow random MCP tools in your organization. Trust people that have earned trust. Don't download random packages off the internet. It will be a very bad time for your organization. I did mention this. Claude Desktop has, I think, full OAuth support right now in production, in GA. VS Code Insiders has it.

These are great because you just drop in the MCP URL, and it handles everything from there. Cursor, like I said, I think this week. I don't know about anybody else. I don't pay much attention to anybody else. And I think Claude Code has not, at least I have not seen anything.

And then there's a bunch, like a long tail, right? So, works pretty well. There is this mcp-remote package, which is how we shipped all this stuff. It works okay. I applaud early adopters for getting this out. It's not a great experience. And you'll find a lot of this is not a great user experience.

It's rough. It's beta. That's fine. That's fine. This is the biggest thing. Going back to the OpenAPI thing. You actually have to spend the calories. You can't just be like, haha, we proxied OpenAPI and expose it as tools. It's going to do nothing. And so, what the right answer here is, who knows?

Our version of this, and I'll talk a little bit about why, is like, we return Markdown. We've taken some API endpoints, and we've directly translated some of the response to Markdown. But it's intentional. It's like, I want to get a bug out of Sentry. I'm just going to give you the bare essentials.

I'm going to give it in a structured way that a human can reason about. Because generally speaking, if a human can reason about it, the language model can reason about it, because it's effectively pattern matching on language. It can kind of figure out JSON here and there, but if you actually push it, you're going to find it breaks all the time.

So just use something like Markdown. It's not scientific. I think there's a lack of science in a lot of this. It's hard. Just go with whatever works. But you have to really think about, you don't control the consumer. You don't control the model. And so you're kind of like this least common denominator thing.
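To make the Markdown idea concrete, here is a minimal sketch of turning an API payload into a trimmed Markdown summary. It is not Sentry's actual code: the function name `issue_to_markdown` and every field name (`title`, `id`, `status`, `count`, `stacktrace`) are assumptions made up for illustration, not Sentry's real API schema.

```python
def issue_to_markdown(issue: dict) -> str:
    """Render only the essentials of an issue as Markdown.

    Field names are hypothetical; a real payload would have many more
    keys, and the point is that most of them are deliberately dropped.
    """
    lines = [
        f"# {issue['title']}",
        "",
        f"- **ID**: {issue['id']}",
        f"- **Status**: {issue.get('status', 'unknown')}",
        f"- **Events**: {issue.get('count', 0)}",
    ]
    frames = issue.get("stacktrace", [])
    if frames:
        lines += ["", "## Stack trace (most recent call last)", ""]
        # Keep only the last few frames to limit the size of the response.
        lines += [
            f"- `{f['file']}:{f['line']}` in `{f['function']}`"
            for f in frames[-5:]
        ]
    return "\n".join(lines)
```

The design choice is the one described above: pick the fields a human would need to understand the bug, and leave the rest of the JSON out of the tool response.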

And so, think about that, but you need to design the system, and you need to treat it as like, you are providing context to an agent that you don't know what the agent is doing. Right? And so that's the name of the game is context. That same thing. Sorry, here's an example of that.

I forgot what my slides were. We just give kind of a reasonable description of tools as the first version of context, which sometimes you hit token limits with all this, so there's some other challenges. We give a reasonable description of a tool with the hopes that clients figure out how to make use of this context.

So it can call the right tool. It can call it when it needs to. It can choose one tool over the other tool, which is a really, unfortunately, hard problem for it to figure out. Mostly straightforward. Errors, same thing. You've got to design the errors. They are still context, because just like a human can't figure out how to call your API, the machine also can't figure out how to call your API.

In my example, I'm like, fix all my bugs for me, and it queries like every organization in Sentry that I have access to. It queries all, it's like 20 API calls when it should have been one, even with all this context. So we are a long ways from this being great, but it's like a glimmer, right?

So, you know, in this case, it's like, oh, you didn't pass the thing? Or rather, you pass an invalid value for the thing. Give it a real human response. This is now more important than ever, because again, it's not just a sort of machine reasoning about it where you can hard code all this stuff.
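As a rough illustration of an error written as context, the sketch below validates one tool parameter and returns a message that says what was wrong and what to try next. The parameter name `status`, the allowed values, and the function name `validate_status` are all hypothetical, chosen only for this example.

```python
from typing import Optional

# Hypothetical set of allowed values for a hypothetical `status` parameter.
VALID_STATUSES = ("resolved", "unresolved", "ignored")


def validate_status(value: str) -> Optional[str]:
    """Return None if the value is valid, else a message an agent can act on."""
    if value in VALID_STATUSES:
        return None
    options = ", ".join(f"`{s}`" for s in VALID_STATUSES)
    return (
        f"Invalid value `{value}` for parameter `status`. "
        f"Expected one of: {options}. "
        "Call the tool again with one of these values."
    )
```

Compared with a bare status code, the message names the parameter, echoes the bad value, and lists the valid choices, which is the information a human or a model would need to retry correctly.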

It's abstract. You don't know who's reasoning about it. The biggest thing, and this is sort of leading to like my overarching view of the world, is like, you have no control, which already is a problem. You are also passing the cost on in a lot of these cases. So you actually kind of need to be mindful.

So another reason to not just be like, here's my API, I'm just going to return everything to you. Because all of a sudden, you know, that call, if you will, that tool call, that could have been a dollar, might be $10 now, because of the amount of tokens you needed.

And more importantly, it might just not work. Like early on, and I don't know if VS Code and/or OpenAI, I don't know who's to blame, fixed this, but like, there was and may still be a limit to the amount of tokens or description lengths of tools. Makes sense, right?

You want to constrain the cost of every API call, but all of a sudden, now you have problems again. So you got to be really thoughtful about this. This is going to evolve. And I think the big thing is like, if you build one of these, it's not set and forget.
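One simple way to stay mindful of size is to cap what a tool returns. The sketch below is an assumption-heavy illustration: it uses character count as a crude stand-in for tokens, and the function name `fit_to_budget` and the default limit of 4000 are arbitrary choices, not limits taken from any client.

```python
def fit_to_budget(text: str, max_chars: int = 4000) -> str:
    """Truncate a tool response to at most max_chars characters.

    Character count is only a rough proxy for token count; the budget
    itself is a made-up number for illustration.
    """
    if len(text) <= max_chars:
        return text
    notice = "\n\n[Output truncated. Narrow the query to see more.]"
    # Reserve room for the notice so the total never exceeds the budget.
    keep = max(0, max_chars - len(notice))
    return (text[:keep] + notice)[:max_chars]
```

The trailing notice is itself context: it tells whoever is reading, human or model, that there is more data and how to get at it.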

Like, we're still updating this thing every week, tweaking it here and there, trying to look at like what's happening and evolving it, right? But the biggest thing, and this is sort of my takeaway, my very, very strong belief is like, you just need to really focus on building agents.

MCP is a plugin architecture, there's a lot of value behind it, but like the inherent value of a lot of what LLMs are bringing is this sort of agent architecture, which by the way, is just a service architecture with a fancy new word on it, common sense kind of stuff, right?

And so, we've done this in Sentry. It does not work well with MCP yet, for what it's worth. There are no streaming responses for tools yet, and that's a big problem when you think about sort of this agent to agent. And I don't mean this in like the Google way, I mean in like the generalized point of view of agent to agent, but it gives you control.

And it's the same as all software. If you have control, you can be responsible for the success, for the failure. I can be responsible for the prompt that dictates how the tool is called. I can be responsible for the result from the tool. I can make many calls behind the scenes and wrap those up.

So, I just get a lot more control if I pick up the cost of that agent. I control the model even, right? And so, I think this is my big bet and I think this is where B2B is going to shine is when we start exposing agents through the MCP architecture.

Again, treating MCP as a plugin architecture. We've done that with one of ours, which is this thing, we keep renaming it, so bear with me. It's like called Seer now, but it's just like Sentry's got a lot of data on what's broken in your application. We do this thing where we do this really high quality root cause analysis that's done via an agent.

We expose that root cause analysis, mostly to our UI, to be fair. We also expose it to the MCP, but because it doesn't do streaming, we have to do like some polling check where it's like, okay, start the job and then let's check in on it a few times.
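The start-then-poll workaround can be sketched roughly as follows. This is not how Sentry implements it: `wait_for_job`, the `get_status` callback, and the `state` and `result` keys are all hypothetical names, and the attempt count and delay are arbitrary.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[], dict],
    attempts: int = 5,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Check on a long-running job a few times, then give a useful answer.

    get_status is a hypothetical callback returning a dict such as
    {"state": "running"} or {"state": "done", "result": "..."}.
    """
    for attempt in range(attempts):
        status = get_status()
        if status["state"] == "done":
            return status["result"]
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next check
    # Still not finished: tell the caller what to do instead of failing.
    return "Analysis is still running. Call this tool again later to check the result."
```

Returning an explicit "still running" message rather than an error gives the calling agent a reason to come back later.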

But then, because of the way agents work, it just gives up at some point. It's a little complicated, but again, beta testing, the promise is there. But when this works, I really think this is going to be the value unlock for a lot of us. Again, MCP does a lot of things.

It's an abstract protocol. But the agent analogy is really good. As an aside, all this is open source, you can find Sentry's MCP somewhere on the internet. You'll find it on GitHub. I should say fair source. There's some complexity there. This is what the agent looks like in the UI. Check it out if you haven't.

We'll be around. Give me feedback. I think the last thing I want to sort of part with is just like this stuff is not that hard. It's quite broken all the time, but it's not that hard. Again, I built it in two days. I got a lot of jobs to do at the company.

You can just go build it and try it out and learn. All this stuff is pretty obvious. I think the lesson we've learned at Sentry, or are still learning, I should say, is that everybody is scared of all this stuff because there's fancy new words for everything. But the fancy new words are just new words for the same thing.

It's just a new coat of paint, right? You know, MCP is just a plug-in architecture. Agents are just services. Like the LLM calls or MCP calls, actually half of the tools are just API calls with a new response format, right? So it's pretty accessible to do all this. There's a lot of great technology that's been going on in here.

Like I said, we used a lot of Cloudflare tech. We did not use Cloudflare at all before this. And then in a couple days, we're like, cool, we can shim up a thing on Workers. They've got an OAuth proxy for us. Problem solved. And this is important because we don't run WebSocket infrastructure at Sentry.

It's just not a thing we had, right? And unfortunately, the protocol requires something like that, which makes it a little bit annoying to adopt. But again, it's not that hard. It's pretty easy to adopt. Try it out. You'll probably hit a lot of bugs, but just stick with it.

I think this one will stick around. But I would really dial in the thinking around agents and how you're optimizing for context in the workflows you understand for your data. With that said, I will be around the rest of the afternoon, probably at our booth in the expo hall.

If you want to come chat, come say hi. I'm always happy to, like, rant about other things or give you my semi-informed opinions. I'm not an AI guy, to be clear. But, cool. With that, you know, thanks everybody for showing up to this talk and this wild conference, which is interesting.

And I'll call it there.
