Building the platform for agent coordination — Tom Moor, Linear

So yeah, I'm Tom. I lead the engineering team at Linear, and today I'd love to talk to you a bit about our story with AI: how we think about AI as a company, some of the features we've built, and then how we see software development going from here, and perhaps Linear's place in that future.
Just for anybody in the room who hasn't heard of Linear: Linear is a product development tool. It's disguised as an issue tracker, we like to say. We've spent the last five years obsessing over speed, clarity, and removing friction, making it the best tool for ICs to use in their everyday work. So yeah, it started as a simple tracker, and now we think of it as an operating system for engineering and product teams to build their products. We're used by OpenAI, Ramp, Vercel, and thousands of other modern software companies you've heard of, who use Linear to keep track of their work.

So, a little bit of history of our AI journey, as it were.
We spun up an internal skunkworks team in early 2023, which I think was around GPT-3, if I remember rightly. Our initial focus was on summarization and some similarity; we were looking at embeddings. Nobody on the team had any AI experience, so we were just jumping in and figuring it out as we went. One of the things we realized really quickly was that many of the features we needed to build required a really solid search foundation. For almost everything, you first need to find the relevant stuff, right? We had Elasticsearch at that time, and they didn't have a very good vector offering. I think they may not have had a vector offering at all back in 2023. So we looked around, and this was the moment when something like a hundred startups suddenly came out with vector databases. There was Pinecone, there was this, there was that. We evaluated a few of them, and they all had a ton of trade-offs. So after experimenting with a bunch of things, we literally just ended up with OpenAI embeddings stored in pgvector, running on GCP. It was the most classic Linear decision ever, because it was so pragmatic and just used solid things.
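To make that concrete, here is a rough sketch of what that kind of pragmatic setup can look like: compute an embedding with OpenAI's API and store it in a Postgres table with a pgvector column. The table name, vector size, and model choice are my own illustrative assumptions, not Linear's actual schema.

```typescript
// Sketch only: store an issue embedding in Postgres with pgvector.
// Table and model names are illustrative assumptions, not Linear's real schema.
import OpenAI from "openai";
import { Pool } from "pg";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// One-time setup (pgvector must be installed on the database):
//   CREATE EXTENSION IF NOT EXISTS vector;
//   CREATE TABLE issue_embeddings (
//     issue_id  text PRIMARY KEY,
//     embedding vector(1536)
//   );

async function indexIssue(issueId: string, title: string, description: string) {
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small", // hypothetical choice; any embedding model works
    input: `${title}\n\n${description}`,
  });
  const embedding = res.data[0].embedding;

  // pgvector accepts a bracketed string literal cast to the vector type.
  await pool.query(
    `INSERT INTO issue_embeddings (issue_id, embedding)
     VALUES ($1, $2::vector)
     ON CONFLICT (issue_id) DO UPDATE SET embedding = EXCLUDED.embedding`,
    [issueId, `[${embedding.join(",")}]`]
  );
}
```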
On that base, we shipped some features. We shipped a V1 of similar issues, where we suggest related issues. In hindsight, two years later, it was so naive: we were just doing simple cosine comparisons of embeddings against the vector database.
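That V1 lookup amounts to a nearest-neighbour query over the stored vectors. Against a pgvector table like the one sketched above, a cosine-distance query might look roughly like this (again, the schema is an assumption for illustration):

```typescript
// Sketch: naive "similar issues" via cosine distance in pgvector.
// `<=>` is pgvector's cosine-distance operator; smaller distance = more similar.
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

async function similarIssues(queryEmbedding: number[], limit = 5) {
  const { rows } = await pool.query(
    `SELECT issue_id,
            1 - (embedding <=> $1::vector) AS cosine_similarity
     FROM issue_embeddings
     ORDER BY embedding <=> $1::vector
     LIMIT $2`,
    [`[${queryEmbedding.join(",")}]`, limit]
  );
  return rows; // e.g. [{ issue_id: "ENG-123", cosine_similarity: 0.87 }, ...]
}
```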
We also shipped natural language filters. Actually, I think this was one of the better ones: you can just type, in natural language, "bugs assigned to me in the last two weeks that are closed," and it will produce the filter. It's very one-shot, very naive in comparison to what we'd do now, but pretty useful, and kind of hidden as well, I would say.
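A one-shot natural-language filter like that is essentially a single LLM call that maps free text onto a structured filter object. Here is a hedged sketch of the idea; the filter shape is made up for illustration and is not Linear's real filter schema.

```typescript
// Sketch: one-shot natural language -> structured filter.
// The Filter shape here is hypothetical, not Linear's actual filter model.
import OpenAI from "openai";

const openai = new OpenAI();

interface Filter {
  type?: "bug" | "feature";
  assignee?: "me" | string;
  state?: "open" | "closed";
  updatedWithinDays?: number;
}

async function parseFilter(query: string): Promise<Filter> {
  const res = await openai.chat.completions.create({
    model: "gpt-4o-mini", // illustrative model choice
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content:
          "Convert the user's request into a JSON filter with optional keys: " +
          "type, assignee, state, updatedWithinDays. Output JSON only.",
      },
      { role: "user", content: query },
    ],
  });
  return JSON.parse(res.choices[0].message.content ?? "{}") as Filter;
}

// parseFilter("bugs assigned to me in the last two weeks that are closed")
// -> { type: "bug", assignee: "me", state: "closed", updatedWithinDays: 14 }
```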
We also have another feature where, if you create an issue from a Slack thread, we don't just paste the text from the Slack message; we try to produce the right issue from it automatically. That was so seamless and hidden that I think a lot of people didn't even realize it was happening.

And we never shipped a copilot. We tried. It was copilot season, but the quality just wasn't there. We have this quality bar, and it did not reach it. I don't know if it was a lack of imagination on our team because we weren't AI-pilled enough at the time, or the capability of those early models. A bit of both, I think, to be honest.
I think this was the right approach at the time, in a way. A lot of people on Twitter noticed; they'd say, oh, these are very seamless features, you're not slapping AI in our face. There were literally toothbrushes that said they had AI. I think it's probably much worse now, to be honest. But people appreciated this approach of small, pragmatic value adds.

Then fast forward to 2024. We added a few things along the way, but it really feels like at the end of 2024 we hit a turning point. o3 came out, along with the planning and reasoning models; multimodal capabilities became available in the APIs; and the context windows went through the roof. You have million-token contexts, and you can do crazy things with that. DeepSeek, of course, made a splash. We felt like some of our experiments started to become a lot less brittle, and things actually felt smart. Things clicked for the team a little bit more, and we saw how deep this could go.
So the first thing we did was rebuild our search index again. If you've ever backfilled hundreds of millions of rows of embeddings, it takes a while. We moved to a hybrid search approach, which was something we'd really felt was lacking over the previous year and a half, when pgvector sat on its own; we hadn't put it in our main database because it was so huge, so it was kind of sitting in its own thing. We moved to Turbopuffer. If you've not heard of Turbopuffer, it's a really, really cool search index and I'd highly recommend giving it a look. And we moved our embeddings over to Cohere. After doing a comparison, we felt they were a lot better than OpenAI's, at least for our domain.
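"Hybrid" here means combining a lexical (keyword/BM25) ranking with a vector ranking. One common way to fuse the two lists is reciprocal rank fusion; this is a generic sketch of that technique, not necessarily how Linear or Turbopuffer combine results internally.

```typescript
// Sketch: reciprocal rank fusion (RRF) over a keyword ranking and a vector ranking.
// Generic technique for hybrid search; not a description of Turbopuffer's internals.
function reciprocalRankFusion(
  keywordResults: string[], // issue IDs ranked by BM25 / full-text score
  vectorResults: string[],  // issue IDs ranked by embedding similarity
  k = 60                    // damping constant commonly used with RRF
): string[] {
  const scores = new Map<string, number>();

  const addRanking = (ids: string[]) => {
    ids.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  };

  addRanking(keywordResults);
  addRanking(vectorResults);

  // Documents that rank well in either list float to the top of the fused list.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```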
This filled a gap in the search, and it actually just finished rolling out in the last two weeks, because the backfill took such a while. But now we thought: okay, we've got a really solid search foundation. What are we going to do with this?
The first thing we're building on it is a feature called Product Intelligence. This is basically similar issues V2. Instead of just doing simple cosine matching, we now have a pipeline: it uses query rewriting, it uses the hybrid search engine, it re-ranks the results, and we apply deterministic rules. Out the other side, what we get is a map of relationships from any given issue to its related issues, along with how they are related and why.
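As a rough mental model, that pipeline chains a few stages together. Every function and type name below is hypothetical; the sketch only shows the shape of query rewriting, hybrid retrieval, re-ranking, and deterministic rules feeding a relationship map.

```typescript
// Sketch of the pipeline shape only; all names are hypothetical placeholders.
type Relationship = {
  issueId: string;
  relatedIssueId: string;
  kind: "duplicate" | "related" | "blocks";
  reason: string;
};

async function productIntelligence(issue: { id: string; title: string; description: string }) {
  // 1. Rewrite the raw issue text into one or more focused search queries.
  const queries = await rewriteQueries(issue);

  // 2. Run each query through the hybrid (lexical + vector) search index.
  const candidates = (await Promise.all(queries.map(hybridSearch))).flat();

  // 3. Re-rank the candidate issues against the original issue.
  const ranked = await rerank(issue, candidates);

  // 4. Apply deterministic rules (same team, recency, status, etc.) to filter
  //    candidates and decide the relationship kind and reason.
  const relationships: Relationship[] = applyRules(issue, ranked);

  return relationships; // surfaced as suggested labels, assignees, duplicates, etc.
}

// Placeholders for whatever models and rules get plugged in at each stage.
declare function rewriteQueries(issue: object): Promise<string[]>;
declare function hybridSearch(query: string): Promise<string[]>;
declare function rerank(issue: object, candidateIds: string[]): Promise<string[]>;
declare function applyRules(issue: object, rankedIds: string[]): Relationship[];
```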
What we're able to do with that is expose it in the product: suggested labels, suggested assignees, possible duplicates. And on things like projects, it's why this might be the right person to work on this issue, or why this might be the right project for it. We're working with companies like OpenAI; they have thousands of tickets coming in, and they need as much help as possible to churn through them and get them into the hands of the right engineers.

I think I skipped one. Yeah. The next one is customer feedback analysis.
It's something we're working on right now. One of the other features of Linear is that you can bring in all of the customer feedback from all of your channels and then use it to help decide what you're going to build. Obviously one of the steps there is: okay, we have hundreds of pieces of feedback, how do we figure out what to build from this? LLMs, of course, are great at analyzing text. Our head of product actually said that our analysis was able to beat 90% of the candidates he talks to in the interview process, in terms of what they're able to do. So we're able to churn through hundreds or thousands of customer requests and figure out, for a given project, how we might split it up and what features might be created from it, which is pretty cool.

Another feature we've already shipped is a daily or weekly pulse.
It synthesizes all of the updates happening in your workspace and creates a summarized pulse from them. We also produce an audio podcast version, which is pretty cool, because you can pull open our mobile app and listen to it on your commute. I hope we have an RSS feed for it soon; I really want to just subscribe to it in a podcast player. So although I put "podcast" here, it's not quite a podcast; you have to use the mobile app or the desktop app. But it's great: over breakfast, what has the team been up to while I was asleep? Oh, sorry, that was the visual of it. Apologies.
One other feature I'll go through here is issue from video. So many bugs come in as video recordings from customers, right? Drop in the video, we'll analyze it, figure out the reproduction steps, and then create the issue for you from that. This is maybe not the finest example of the feature, but it's another seamless but very powerful thing that saves a bunch of time.
So, of course, we're baking as much of this into the platform as we can. But there's a limit to that, right? We can't build in everything; we don't know every team, and every team is shaped differently. So we want to make this pluggable, and this is where agents come in.

The way we're thinking about agents is as infinitely scalable, cloud-based teammates. We launched a platform for this two weeks ago. We figure we're already doing a pretty good job of orchestrating humans; we are a communication tool for humans, after all. And if agents are going to be members of your team going forward, then they should also live in the same place where all of the human communication happens.
First, hopefully, if the internet holds up (I'm tethering), I've got some videos. Codegen is one of the first coding agents that integrated with us. Is this going to play? Cool. So with Codegen, you can assign it or mention it inside Linear like any other user, and it will produce plans and PRs. You can see here, it's going to pop in. This is sped up, by the way; that took four minutes, not twenty seconds. But yes, it will produce the PR, and then you can go and review it like you would for any other team member.
This is really powerful, by the way. Because it's an agentic system in the background, you can also interact with it not just from within Linear, but from within Slack or other communication tools. You can say, go and fix this ticket, give it a Linear issue, and it will know how to connect it all up. Or you'll be able to interrupt it partway through.
Bucket is a feature flagging platform that integrated with the first version of our agent platform. Let's see, is this going to... oh, no. All righty. In this case, you can just mention the Bucket agent and tell it to create a flag. It will create a feature flag for you, you can roll it out, and you can check the status of things all within here. And of course, because it's agentic, you don't have to go command by command; you can say, create a new flag and roll it out to 30% of users, and things like that.
Charlie is another coding agent with access to your repository. It's really good at creating plans and doing root cause analysis of bugs. In this case, we have an issue here with a Sentry issue attached. We can just mention Charlie and ask it to do some research. It can go and look at your recent commits, look through the codebase, and figure out the cause of the issue. You can imagine immediately how much engineer time this saves: they can come in here and immediately see possible causes and regression reasons for the issue.
The examples I've shown so far have been living in the comment area of an issue. Obviously, that's not quite where we want to be in the long term. So we're working on building additional surfaces for this in the product, so that agents aren't just the same as users on the team; they're kind of better, because you can see what they're thinking, and I can't see what my teammates are thinking a lot of the time. We'll have this surface where agents can send you their observations and their tool calls, you're able to go behind the scenes of the agent, and you'll be able to interrupt it. And this is consistent across the whole workspace: you have different coding agents, you have PM agents. One other company building an integration with us right now is Intercom, with their Fin agent. You'll be able to do things like say, hey Fin, I fixed this bug, can you go and reply to the 100 customers that reported it? How much time did that just save? We're building this interface out right now, and I expect to have it in a couple of weeks.
I've been using these features a ton; I've been hammering this for months, and I think it really changes the game. Take the pile of bugs sitting in companies' backlogs, which we kind of take for granted: you have this giant backlog that you're never going to get to the bottom of. I think there's just not going to be an excuse for that anymore. The agents can tackle it for you. There's nothing stopping you from assigning every single issue in your backlog out to an agent and having it do a first pass; maybe 50% of them will be fixed by the end of the week. So I think we're really in a world now where you can build more, you can build higher quality because more of the grunt work is being done, and you can build faster.
How much time have we got? I'll just talk a little bit about the architecture of this. In Linear, agents are first-class users. They have identity, they have history, you can see everything they do, and there's a full audit trail of those events. You install them via OAuth, and once they're installed, any admin on the team can manage that agent and its access.

We have a very mature GraphQL API at this point, which basically enables agents to do anything in the product that a human could do, with granular scopes. And we added brand-new webhooks specifically for this: if you're developing an agent with Linear, you'll get webhooks when events happen that are specific to your agent, like somebody replying to your agent, or your agent being triggered on an issue. We also added some additional scopes you can opt into, to choose whether your agent is mentionable or assignable.
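To give a feel for the shape of this, here is a minimal sketch of an agent backend: an HTTP endpoint that receives one of those agent webhooks and replies on the issue through the GraphQL API. The webhook payload fields shown are assumptions for illustration; check Linear's developer docs for the real agent event schema.

```typescript
// Minimal sketch of an agent webhook receiver, assuming an Express server
// and an OAuth access token obtained when the agent was installed.
// The payload fields below are illustrative; consult Linear's docs for the
// actual agent webhook schema.
import express from "express";

const app = express();
app.use(express.json());

const LINEAR_GRAPHQL = "https://api.linear.app/graphql";
const ACCESS_TOKEN = process.env.LINEAR_AGENT_TOKEN!; // from the OAuth install

async function linearRequest(query: string, variables: Record<string, unknown>) {
  const res = await fetch(LINEAR_GRAPHQL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${ACCESS_TOKEN}`,
    },
    body: JSON.stringify({ query, variables }),
  });
  return res.json();
}

app.post("/webhooks/linear", async (req, res) => {
  // Acknowledge the webhook immediately; do the real work asynchronously.
  res.sendStatus(200);

  const event = req.body; // e.g. the agent was mentioned or assigned on an issue
  const issueId = event?.notification?.issueId ?? event?.data?.issueId; // assumed shape

  if (issueId) {
    // Post a short acknowledgement so the user knows the request was understood.
    await linearRequest(
      `mutation ($issueId: String!, $body: String!) {
         commentCreate(input: { issueId: $issueId, body: $body }) { success }
       }`,
      { issueId, body: "On it. I'll investigate and report back here." }
    );
  }
});

app.listen(3000);
```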
And then, as part of that future UI I just showed, we're also working on a new SDK to be released at the same time, which will make all of this really, really easy. Right now you can build all of this on our existing API, but you have to figure out a bit more yourself, I would say. So we're building this abstraction layer, this sugar, so you can integrate with the platform very, very easily.
I'll finish with some of the best practices we've found working with these partners over the last couple of months. It really felt like we were on the cutting edge here; we were building this while the agents themselves still hadn't launched, in a lot of ways. Google and Codex only just launched theirs within the last couple of weeks.

So, first: respond very quickly and very precisely when folks trigger your agent. If I mention your agent, it should respond as fast as possible. A lot of what we've seen is people using emoji reactions for that right now. Excuse me, I have to cough.
Then respond in a way that reassures the user that the agent understood the request. If you say, "at Codegen, can you take care of this?", the response should be something like, "I will produce a PR for this specific thing you asked me." Okay, you understood what I meant. Great.

Inhabit the platform. This is a little Linear-specific in this example, but I think it applies anywhere. We really expect that these agents are not Linear agents; they are agents that live in the cloud, and one of the ways they interact is through Linear. It's just another window into their behavior, and hopefully a really well-structured one where they get a lot of context. But we really think that if you're working within Slack, you should use the language of that platform, not confuse things, and put great effort into that. One of the things we expect to happen inside Linear is that if you're working on an issue, you should move that issue to In Progress. Don't just leave it in the backlog. You expect that of your teammates, and we expect that of agents as well. And again, natural behavior: if somebody triggered you and then replied in that thread, they shouldn't need to mention the agent again to get a response. It should be natural that if you reply to the agent, it responds.
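The "move the issue to In Progress" expectation is a good example of inhabiting the platform: when the agent starts work, it should update the issue's workflow state the way a teammate would. A sketch of that against the GraphQL API might look like this; the field names and state lookup follow Linear's public schema as I understand it, so treat the details as an assumption and verify against the docs.

```typescript
// Sketch: when the agent starts working on an issue, move it to a "started"
// workflow state (e.g. In Progress). Reuses the linearRequest helper from the
// earlier webhook sketch; declared here so this compiles standalone.
declare function linearRequest(
  query: string,
  variables: Record<string, unknown>
): Promise<any>;

async function markInProgress(issueId: string) {
  // Look up the issue's team and find its first "started"-type workflow state.
  const lookup = await linearRequest(
    `query ($issueId: String!) {
       issue(id: $issueId) {
         team {
           states { nodes { id name type } }
         }
       }
     }`,
    { issueId }
  );

  const states = lookup?.data?.issue?.team?.states?.nodes ?? [];
  const started = states.find((s: { type: string }) => s.type === "started");
  if (!started) return;

  // Update the issue, the same way a human dragging it across the board would.
  await linearRequest(
    `mutation ($issueId: String!, $stateId: String!) {
       issueUpdate(id: $issueId, input: { stateId: $stateId }) { success }
     }`,
    { issueId, stateId: started.id }
  );
}
```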
Don't be clever; clarify your intent before acting. We see a lot of attempts at one-shots. One pattern we're seeing right now coming out of a lot of the coding agents is that they'll form a plan before doing anything, communicate that plan up front, and get clarification on it. That's something we definitely expect to happen.
And finally, be sure you're adding value. LLMs love to just produce tons of text. We don't want to see splats straight out of OpenAI dumped into comments, into issues, or into any other surfaces. Be concise. Be useful. Be what a good teammate would be. You can always fall back on asking: what would a human do in this situation? And try your best to achieve that.

Cool, that's it. Thanks for listening. And if you're interested in working with us on this platform or