Building the platform for agent coordination

So yeah, I'm Tom. I lead the engineering team at Linear, and today I would love to talk to you a bit about our story with AI, how we think about AI as a company, some of the features we've built, and then how we see software development going from here and perhaps Linear's place in that future.

So just for anybody in the room who hasn't heard of Linear or might not be familiar: Linear is a product development tool. It's disguised as an issue tracker, we like to say. We've spent the last five years obsessing over speed, clarity, removing friction, making it just the best tool for ICs to use to work every day.

So yeah, it started as a simple tracker, and now we think of it as an operating system for engineering and product teams to build their products. And we're used by OpenAI, Ramp, Vercel, and thousands of other modern software companies you've heard of, who use Linear to keep track of their work.

So just a little bit of history of our AI journey, as it were. We spun up an internal skunkworks team in early 2023, which I think was around GPT-3, if I remember rightly. Our initial focus was on summarization and some similarity work. We were looking at embeddings. Nobody on the team had any AI experience, so we were just kind of jumping in and figuring it out as we went.

One of the things we realized really quickly was that many of the features we needed to build needed a really solid search foundation. For almost everything, you first need to find the relevant stuff, right? So we had Elasticsearch at that time, and they didn't have a very good vector offering.

I think maybe they actually didn't have any vector offering back in 2023. So we looked around, and this was kind of a moment where like a hundred startups suddenly came out with vector databases, right? It was like, there's Pinecone, there was this, there was that. And so we looked at these, we evaluated a few.

They all had a ton of trade-offs. And so, after experimenting with a bunch of things, we literally just ended up with OpenAI embeddings, stored them in pgvector, and put pgvector on GCP. It was like the most classic Linear decision ever, because it was so pragmatic and just used the solid things.

So on that base, we shipped some features, right? We shipped a V1 of similar issues where we're kind of suggesting related issues. This was like, in hindsight, two years later, so naive. We were just doing simple cosine embedding comparisons against the vector database. And we shipped natural language filters.
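As a rough illustration of what a simple cosine comparison like that involves, here is a minimal sketch. The toy vectors, the issue IDs, and the `most_similar` helper are made up for the example. A real setup would run the comparison inside the vector database rather than loop in Python.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)

def most_similar(query, embeddings, top_k=3):
    """Rank stored issue embeddings by cosine similarity to the query."""
    scored = [
        (issue_id, cosine_similarity(query, vector))
        for issue_id, vector in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy 3-dimensional embeddings (hypothetical); real ones are far larger.
embeddings = {
    "ISS-1": [0.9, 0.1, 0.0],
    "ISS-2": [0.0, 1.0, 0.0],
    "ISS-3": [0.8, 0.2, 0.1],
}
results = most_similar([1.0, 0.0, 0.0], embeddings, top_k=2)
```

The weakness is visible even here: the ranking depends only on vector direction, with no keyword signal, no re-ranking, and no explanation of why two issues are related.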

Actually, I think this is one of the better ones, where you can just type in natural language, "bugs assigned to me in the last two weeks that are closed," and it will produce the filter. So it's very one-shot, very naive in comparison. Yeah, but pretty useful and kind of hidden, I would say, as well.
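To make the one-shot idea concrete, here is a sketch of the kind of structured filter a model could be asked to return for that sentence, plus the deterministic code that applies it. The `IssueFilter` fields, the issue dictionaries, and the choice to measure "the last two weeks" against an `updated` date are all assumptions for illustration, not Linear's actual filter schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class IssueFilter:
    """Hypothetical structured output a model is asked to fill in."""
    label: Optional[str] = None
    assignee: Optional[str] = None
    status: Optional[str] = None
    updated_within_days: Optional[int] = None

def matches(issue: dict, f: IssueFilter, today: date) -> bool:
    """Apply the parsed filter to one issue; unset fields are ignored."""
    if f.label is not None and f.label not in issue["labels"]:
        return False
    if f.assignee is not None and issue["assignee"] != f.assignee:
        return False
    if f.status is not None and issue["status"] != f.status:
        return False
    if f.updated_within_days is not None:
        cutoff = today - timedelta(days=f.updated_within_days)
        if issue["updated"] < cutoff:
            return False
    return True

# What the model might return for:
# "bugs assigned to me in the last two weeks that are closed"
parsed = IssueFilter(label="bug", assignee="me", status="closed", updated_within_days=14)

today = date(2025, 6, 1)
issues = [
    {"id": "ISS-1", "labels": ["bug"], "assignee": "me", "status": "closed", "updated": date(2025, 5, 25)},
    {"id": "ISS-2", "labels": ["bug"], "assignee": "me", "status": "closed", "updated": date(2025, 4, 1)},
    {"id": "ISS-3", "labels": ["feature"], "assignee": "me", "status": "closed", "updated": date(2025, 5, 30)},
]
matching = [issue["id"] for issue in issues if matches(issue, parsed, today)]
```

Keeping the model's job limited to filling in a small schema means the filtering itself stays deterministic and easy to inspect.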

We also have another feature where if you create an issue from a Slack thread, we will not just pass the text from the Slack message. We will try and produce the right issue from that automatically. And that was so seamless and hidden that I think a lot of people didn't even realize it was happening.

And we never shipped a copilot. We tried. It was copilot season. And the quality just wasn't there. You know, we have this quality bar and it did not reach it. So I don't know if it was a lack of imagination on our team because we weren't AI-pilled enough at the time, or it was the capability of these early models.

I think a bit of both, to be honest. So, you know, we -- I think this was the right approach at the time, in a way. Like, a lot of people on Twitter kind of noticed. They're like, oh, these are very seamless features. You're not slapping AI in our face.

Like, there were literally toothbrushes that said they had AI. I think it's probably much worse now, to be honest. But, you know, people kind of appreciated this approach of, like, small pragmatic value adds. And then, like, fast forward to 2024. And, you know, we've added a few things since then.

But it really feels like at the end of 2024, we hit a turning point. You know, o3 coming out, the planning and reasoning models, the multimodal capabilities became available in the APIs. The context windows went through the roof. You know, you have, like, million-token contexts. Like, you can do crazy things with that.

DeepSeek, of course, made a splash. And we felt like some of our experiments started to become a lot less brittle. And things actually felt smart. Things kind of clicked for the team a little bit more. We saw how deep this could go. So the first thing we did was we started by rebuilding our search index again.

Which, I don't know, if you've ever, like, backfilled, like, millions -- hundreds of millions of rows of embeddings, it takes a while. So we moved to a hybrid search approach. This was something that we had really felt was lacking over, like, the year and a half that we kind of had pgvector sat on its own.

And we didn't put it in our main database because it was so huge. So it was kind of sat in its own thing. So we moved to turbopuffer. If you've not heard of turbopuffer, it's a really, really cool search index. I'd highly recommend giving it a look.

And we moved our embeddings over to Cohere. After doing kind of a comparison, we felt that they were a lot better for our domain, at least, than OpenAI's. So this kind of filled a gap in the search. And this has actually just finished rolling out in, like, the last two weeks because the backfill took such a while.
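For readers unfamiliar with hybrid search, one common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion. The talk does not say how the two signals are merged in Linear's stack, so the sketch below is a generic illustration of the technique, with made-up issue IDs and the conventional constant `k = 60`.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of IDs into one list.

    Each list contributes 1 / (k + rank) for every ID it contains,
    so IDs that rank well in more than one list float to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical results for the same query from two retrievers.
keyword_hits = ["ISS-7", "ISS-2", "ISS-9"]  # full-text / keyword ranking
vector_hits = ["ISS-2", "ISS-4", "ISS-7"]   # embedding similarity ranking

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Here `ISS-2` and `ISS-7` appear in both lists and therefore outrank the issues that only one retriever found.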

But now we thought, okay, we've got a really solid search foundation. What are we going to do with this? So first thing we did is, like, we're building this feature called Product Intelligence. This is basically, like, similar issues V2. So instead of just doing simple cosine matching, we now have a pipeline.

That pipeline is using query rewriting. It's using the hybrid search engine. It's re-ranking the results. We're using deterministic rules. And then out the other side, what we get is a map of relationships from any given issue to its related issues, along with how they are related and why they are related.
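The shape of such a pipeline can be sketched as four pluggable stages feeding a list of typed relationships. Everything below is hypothetical: the stage functions are toy stand-ins (word overlap instead of a model or a search index), and the `Relation` fields, rule names, and thresholds are invented for the example rather than taken from Linear.

```python
from dataclasses import dataclass

@dataclass
class Relation:
    issue_id: str
    kind: str    # how the issues are related, e.g. "possible_duplicate"
    reason: str  # why, in words a person can check

def related_issues(issue, rewrite, search, rerank, rules):
    query = rewrite(issue)              # 1. query rewriting
    candidates = search(query)          # 2. hybrid search
    ranked = rerank(issue, candidates)  # 3. re-ranking -> [(candidate, score)]
    relations = []
    for candidate, score in ranked:     # 4. deterministic rules, first match wins
        for rule in rules:
            relation = rule(issue, candidate, score)
            if relation is not None:
                relations.append(relation)
                break
    return relations

CORPUS = [
    {"id": "ISS-1", "title": "Login button crashes on Safari", "team": "web"},
    {"id": "ISS-2", "title": "Login button crashes on Safari iOS", "team": "web"},
    {"id": "ISS-3", "title": "Safari rendering glitch", "team": "web"},
    {"id": "ISS-4", "title": "Billing export fails", "team": "billing"},
    {"id": "ISS-5", "title": "Safari push notifications", "team": "mobile"},
]

def words(text):
    return set(text.lower().split())

def rewrite(issue):
    return issue["title"].lower()  # stand-in for an LLM rewriting the query

def search(query):
    return [c for c in CORPUS if words(query) & words(c["title"])]

def rerank(issue, candidates):
    mine = words(issue["title"])
    scored = [
        (c, len(mine & words(c["title"])) / len(mine | words(c["title"])))
        for c in candidates
        if c["id"] != issue["id"]
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def duplicate_rule(issue, candidate, score):
    if score >= 0.6:
        return Relation(candidate["id"], "possible_duplicate", f"title overlap {score:.2f}")
    return None

def same_team_rule(issue, candidate, score):
    if candidate["team"] == issue["team"]:
        return Relation(candidate["id"], "related", "same team and overlapping terms")
    return None

relations = related_issues(CORPUS[0], rewrite, search, rerank, [duplicate_rule, same_team_rule])
```

The point of the final rule stage is that each relationship leaves the pipeline with a kind and a human-readable reason attached, which is what a product surface needs in order to explain a suggestion.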

And then what we're able to do with that is expose this in the product. I hope that's clear enough. You know, we have suggested labels, suggested assignees, possible duplicates. And then on things like projects, it's like why this might be the right person to work on this issue or why this might be the right project for this.

So, you know, we're working with, like, the OpenAIs of the world. They have thousands of tickets coming in. And they really have to have as much help as possible to kind of churn through them and get them into the hands of the right engineers. I think I skipped one.

Yeah. So the next one was customer feedback analysis. This is something we're working on right now. So one of the other features of Linear is you can bring in all of the customer feedback from all of your channels and then use that to help to decide what you're going to build.

And so obviously one of the steps there is, okay, we have hundreds of pieces of feedback. How do we figure out what to build from this, right? So of course, LLMs are great at analyzing text. And I think our head of product actually said that our analysis was able to beat 90% of the candidates he talks to in the interview process in terms of the analysis they're able to do.

So we're able to, yes, churn through hundreds or thousands of customer requests and then figure out for this given project, like, how might we split this up, what features might be created from this, which is pretty cool. Another feature we've already shipped is a daily or weekly pulse. This synthesizes all of the updates that are happening in your workspace, creates a pulse from it, like a summarized pulse.

And then we also produce, like, an audio podcast version, which is pretty cool, because you can pull open our mobile app and then listen to that on your commute. I hope we have an RSS feed for it soon. I really want to just subscribe to it in a podcast player.

So although I put podcast here, it's not quite a podcast. You have to have a mobile app or the desktop app. But this is great. You just, like, over breakfast, like, what has the team been up to while I was asleep. Oh, that was a -- sorry. That's the visual of it.

Apologies. Apologies. And then, yeah. So one other feature I'll go through here is this issue from video. So literally, so many bugs come in as video recordings from customers, right? Drop the video. We'll analyze it. We'll figure out the reproduction steps. And then we'll create the issue for you from that.

This is maybe not the finest example of the feature, but it's another one that's kind of, like, seamless but very powerful and saves a bunch of time. So, of course, we're baking as much into the platform as we can in terms of these things. But there's a limit to that, right? We can't put in everything.

We don't know. Every team is different. Every team is shaped differently. So we want to make this pluggable. And this is kind of where agents come in. So the way we're thinking about agents is as infinitely scalable cloud-based teammates. So we launched a platform for this two weeks ago.

We figure, you know, we're already doing a pretty good job of orchestrating humans. We are a communication tool for humans, after all. And if agents are going to be members of your team going forward, then they should also live in the same place where all of the human communication happens.

So first, hopefully, if the internet stands up, I'm tethering. I'll do some -- I've got some videos. Yeah. So CodeGen is one of the first coding agents that integrated with us. So they can -- is this going to play? Cool. Yeah. So CodeGen, you can assign it. You can mention it inside of Linear like any other user.

And it will produce plans. It will produce PRs. You can see here. It's going to pop in -- boop. This is sped up, by the way. That took four minutes, not 20 seconds. But, yes, it will produce the PR. And then you can go and review it like you would for any other worker, any other team member.

This is really powerful, by the way. And because it's an agentic system in the background, you can also interact with it not just from within Linear, but from within Slack or from other communication tools. And you can say, go and fix this ticket and give it a Linear issue, and it will know how to connect it all up.

Or you'll be able to interrupt it part way. Bucket is a feature flagging platform that integrated with the first version of our agents platform here. Let's see. Is this going to -- oh, no. All righty. Yeah. So in this case, you can just mention the Bucket agent, tell it to create a flag.

It will create a feature flag for you. You can roll it out. You can check the status of things all within here. And, of course, because it's agentic, you don't have to go command by command. You can say, create a new flag, roll it out to 30% of users, and things like that.

And then Charlie is another coding agent with access to your repository. It's really good at creating plans and doing, like, root cause analysis of bugs. So in this case, we have an issue here. It has a Sentry issue attached. We can just mention Charlie, ask it to do some research.

So it can go and look at your recent commits. It can go look through the code base. And it can kind of figure out the cause of this issue. And you can imagine immediately, right, like, this has saved a lot of minutes of an engineer's time. They can come in here and immediately see possible causes and regression reasons for this issue.

So the examples I've shown so far have been kind of living in the comment area of an issue. Obviously, that's not quite where we want to be in the long term. So, you know, we're working on building additional surfaces for this in the product so that agents aren't just, like, the same as users on the team.

They're kind of better because you can see what they're thinking. And I can't see what my teammates are thinking a lot of the time. So, yeah. So we'll have this surface where the agents can send you their observations. They can send you the tool calls. You're able to kind of go behind the scenes of the agent.

You'll be able to interrupt it. And then this is kind of consistent across the whole workspace, right? So you have different coding agents. You have PM agents. One other company that's building an integration with us right now is Intercom with the Fin agent. So you'll be able to do things like just say, hey, Fin, I fixed this bug.

Can you go and reply to the 100 customers that reported it? And, you know, how much time did that just save? So we're building this interface out right now. And I expect to have it in a couple of weeks. But I've been really using these features a ton. And I've been hammering this for months.

And I think it really changes the game. And I expect it to change the number of bugs sitting in companies' backlogs. We kind of take it for granted that you have this giant backlog that you're never going to get to the bottom of. I think there's just not going to be an excuse for that anymore.

The agents can tackle it for you. There's nothing to stop you assigning every single issue in your backlog out to an agent. Have it do a first pass. Maybe 50% of them will be fixed by the end of the week. So I think, yes, we're really in this world now where you can build more.

You can build higher quality because more of the grunt work is being done. And you can build faster. How much time we got? So I'll just talk a little bit about, like, the architecture of this. So, yeah, in Linear, agents are first-class users. They have identity. They have history.

You can see everything they do. There's a full audit trail of those events. You install them via OAuth. And then once they're installed, kind of any admin on the team can manage that agent and its access. And they work fully transparently. So we have a very mature GraphQL API at this point, which basically enables agents to do anything in the product that a human could do, with granular scopes.

And then we added brand new webhooks for this specifically, where if you are developing an agent with Linear, you will get webhooks when events happen that are specific to your agent. So somebody replies to your agent. Your agent was triggered on this issue. We also added some additional scopes that you can opt into to choose whether your agent is mentionable or assignable.
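On the agent side, handling events like these usually comes down to parsing the webhook body and dispatching on an event type. The sketch below shows that pattern only. The event names (`agent.mentioned`, `agent.reply`) and payload fields (`issueId`, `threadId`) are hypothetical placeholders, not Linear's actual webhook schema, which should be taken from their documentation.

```python
import json

HANDLERS = {}

def on(event_type):
    """Register a handler function for one (hypothetical) event type."""
    def register(fn):
        HANDLERS[event_type] = fn
        return fn
    return register

@on("agent.mentioned")
def handle_mention(payload):
    # The agent was triggered on an issue: acknowledge right away.
    return f"ack: looking at {payload['issueId']}"

@on("agent.reply")
def handle_reply(payload):
    # Somebody replied to the agent in an existing thread.
    return f"continuing thread {payload['threadId']}"

def dispatch(raw_body: bytes):
    """Parse a webhook body and route it to the matching handler."""
    event = json.loads(raw_body)
    handler = HANDLERS.get(event.get("type"))
    if handler is None:
        return None  # ignore events this agent does not handle
    return handler(event.get("payload", {}))

mention_result = dispatch(b'{"type": "agent.mentioned", "payload": {"issueId": "ISS-1"}}')
reply_result = dispatch(b'{"type": "agent.reply", "payload": {"threadId": "T-9"}}')
ignored = dispatch(b'{"type": "something.else"}')
```

A production handler would also authenticate the request before trusting the body; that step is left out here to keep the dispatch pattern visible.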

And then, as part of that kind of future UI that I just showed, we're also working on a new SDK to be released at the same time, which will just make that really, really easy where you can -- so right now you can build all this stuff. It's on our existing API.

And you kind of have to figure out a bit more, I would say. So we're kind of building this abstraction layer, this sugar, where you can very, very easily integrate with the platform. So, yeah, I'll finish with some of the best practices that we found working with these partners over the last couple of months.

You know, it really felt like we're kind of on the cutting edge here, and we're building it as the agents themselves still haven't launched in a lot of ways. You know, like Google and Codex only just launched those within the last couple of weeks. So first is to be -- to respond very quickly and very precisely when folks trigger your agent.

So if I mention your agent, it should respond as fast as possible. A lot of what we've seen is people using emoji reactions for that right now. Excuse me, I have to cough. Yeah, so -- and then respond in a way that kind of like reassures the user that you -- the agent understood the request.

You know, so it's like if you say, "@CodeGen, can you take care of this?" the response should be something like, "I will produce a PR for this specific thing you asked me." It's like, okay, you understood what I meant. Great. Inhabit the platform. This is Linear-specific a little bit in this example, but I think it applies anywhere.

We really expect that these agents are not Linear agents. They are agents that live in the cloud, and one of the ways that they interact is through Linear, right? It's a window into their behavior, and hopefully a really well-structured one where they get a lot of context. But we really think that, you know, if you're working within Slack, you should use the language of that platform and not confuse things, and put great effort into that.

And then things like -- one of the things that we expect to happen inside of Linear is if you're working on an issue, you should move that issue to in progress. Don't just leave it in the backlog. You expect that of your teammates, and we expect that of agents as well.
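As a sketch of what moving an issue to in progress could look like through the GraphQL API, the snippet below builds (but does not send) a request using an `issueUpdate` mutation with a `stateId` input. The mutation shape, the endpoint URL, and the bearer-token header should be checked against Linear's current API documentation, and the issue ID, state ID, and token are placeholders.

```python
import json
import urllib.request

LINEAR_GRAPHQL_URL = "https://api.linear.app/graphql"

# Assumed mutation shape; verify against the current schema before relying on it.
ISSUE_UPDATE = """
mutation MoveIssue($id: String!, $stateId: String!) {
  issueUpdate(id: $id, input: { stateId: $stateId }) {
    success
  }
}
"""

def build_move_request(issue_id, state_id, access_token):
    """Build a POST request that would move an issue to another workflow state."""
    body = json.dumps(
        {"query": ISSUE_UPDATE, "variables": {"id": issue_id, "stateId": state_id}}
    ).encode("utf-8")
    return urllib.request.Request(
        LINEAR_GRAPHQL_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )

request = build_move_request("ISSUE_ID", "IN_PROGRESS_STATE_ID", "OAUTH_ACCESS_TOKEN")
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

An agent would typically do this as soon as it starts work, so the board reflects reality the same way it would for a human teammate.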

And then just, again, like natural behavior. So if somebody triggered you and then they replied in that thread, you shouldn't need to mention the agent again to get a response. It should be a natural behavior that if you reply to them, they will respond. Yeah, don't be clever. Clarify your intent before acting.
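One simple way to get that behavior is to remember which threads the agent has already been pulled into and treat any later reply in those threads as addressed to it. The `ThreadRouter` class, the handle, and the thread IDs below are illustrative assumptions, not part of any Linear SDK.

```python
class ThreadRouter:
    """Decide whether an agent should respond to a message in a thread."""

    def __init__(self, agent_handle):
        self.agent_handle = agent_handle
        self.active_threads = set()  # threads where the agent was triggered

    def should_respond(self, thread_id, text):
        if f"@{self.agent_handle}" in text:
            # An explicit mention starts (or continues) a conversation.
            self.active_threads.add(thread_id)
            return True
        # No mention: respond only if the agent is already part of the thread.
        return thread_id in self.active_threads

router = ThreadRouter("codegen")
first = router.should_respond("thread-1", "@codegen can you take care of this?")
follow_up = router.should_respond("thread-1", "Actually, only the Safari case.")
unrelated = router.should_respond("thread-2", "Anyone seen this before?")
```

The follow-up in `thread-1` gets a response without a second mention, while the unrelated thread is left alone.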

I think we see a lot of, like, attempts at one-shots. One pattern that we're seeing right now coming out of a lot of the coding agents is they'll form a plan before doing anything and communicate that plan up front and get clarification on it. So that's something that we definitely expect to happen.

And finally, you know, be sure you're adding value. I think, you know, LLMs, they love to just produce tons of text. We don't want to see splats straight out of OpenAI into comments, into issues, into any other services. Be concise. Be useful. Be like a good teammate would be.

You can always fall back on asking, like, what would a human do in this situation? And try your best to achieve that. Cool. That's it. Thanks for listening. And if you're interested in working with us on this platform or integrating with Linear, let me know.

Building the platform for agent coordination — Tom Moor, Linear
