Don’t get one-shotted: Use AI to test, review, merge, and deploy code — Tomas Reimers, Graphite

00:00:00.560 |
Hi everyone, my name is Tomas, I'm one of the co-founders of Graphite. 00:00:22.800 |
So, to give some context on sort of where we see the industry right now and where we 00:00:27.560 |
see it going, software development currently has, and has always had, two loops. 00:00:32.460 |
The inner loop, which is focused on development, and the outer loop that's focused on review. 00:00:37.540 |
Developers spend time in the inner loop, they get their code working, they get the feature 00:00:41.120 |
the way they want it, and then they go ahead and they move it to the outer loop where it's 00:00:48.820 |
We're seeing the inner loop change right now more than we've ever seen it. 00:00:52.160 |
More developers are using AI than ever, I think right here we have some statistics from the 00:00:57.080 |
Nearly every developer surveyed used AI tools both inside and outside of work, and 46% of 00:01:06.960 |
We're seeing more and more code being written by AI. 00:01:10.300 |
Here, we have some statistics around how code has changed over time and how some people predict 00:01:16.800 |
And even if we take a more pessimistic view of that, we still see the way the world's going 00:01:20.620 |
is just more and more and more code being written by AI. 00:01:33.360 |
Developers are now producing higher volumes of code. 00:01:38.540 |
When we first started looking at this, when we first started building Diamond, our AI code 00:01:42.360 |
reviewer about a year ago now, what we found was that we had a lot of articles that scared 00:01:48.580 |
We were seeing within our own organization a lot of developers adopting AI tools, but we 00:01:53.820 |
AI can hallucinate, it can make mistakes, and almost more scarily, it can make security vulnerabilities. 00:02:00.900 |
For us, what we saw was that while the inner loop was getting sped up by AI, the outer loop 00:02:07.900 |
We were seeing tools like Cursor, Windsurf, Copilot, V0, Bolt, all of those, producing larger 00:02:14.200 |
volumes of code than we were used to, than we had ever seen before. 00:02:17.500 |
But we were also seeing our developers suddenly have to review higher volumes of code, test higher 00:02:21.620 |
volumes of code, merge higher volumes of code, and deploy higher volumes of code. 00:02:26.640 |
That's what brought us to say, there has to be a new outer loop here. 00:02:30.740 |
The way that things are going, this isn't going to work, we're going to break down, we're watching 00:02:35.700 |
the problems that used to only ail large companies start to ail all companies, where we were seeing 00:02:40.540 |
companies deal with higher and higher and higher volumes of code. 00:02:44.700 |
The requirements for the new outer loop then look a lot like the problems that larger companies 00:02:50.380 |
You need tools to better prioritize, track, and get notified about pull requests. 00:02:54.260 |
You need driver assist features to help reviewers focus and streamline the code review process. 00:02:58.700 |
You need optimized CI pipelines and merge queues to be able to handle the sheer volume of 00:03:09.040 |
When we first started looking at this through sort of an AI-first lens, we started to see 00:03:14.380 |
that, well, the problems are being created by AI, they can also probably be solved by AI. 00:03:19.220 |
We can probably start to streamline a lot of these processes which had previously been manual, 00:03:23.600 |
which previously were parts of the process that developers did not enjoy, did not want to do. 00:03:28.100 |
We wanted to see self-driving code review solutions where we no longer had to do those very manual 00:03:33.180 |
and painful parts of review, but we could actually start to really focus on what matters most 00:03:36.860 |
to the developers, making sure that your product can get out to users and that the features work 00:03:41.440 |
We were seeing that AI-generated feedback wasn't perfect. 00:03:45.400 |
And because of that, we were starting to think that bots weren't enough. 00:03:47.800 |
I think an early vision of ours was, well, can we solve this by just adding AI teammates? 00:03:54.560 |
Maybe it's a whole lot of teammates to the workflow. 00:03:56.560 |
And while we think that's part of the story, we don't think that's enough. 00:04:00.640 |
We think, as we built with Diamond, that your entire toolchain has to be AI-native, 00:04:07.380 |
If you really are going to embrace AI in the age of development, if you're going to accept 00:04:10.440 |
the fact that developers are going to be orders of magnitude more productive than they ever 00:04:13.540 |
have before, you need tooling that reflects that. 00:04:18.120 |
We started by building Diamond, so the winning AI code review platform, with high signal, low 00:04:22.580 |
noise, and a deep understanding of the codebase and change history. 00:04:26.340 |
We summarize, prioritize, and review each change, and we integrate with your CI and your testing 00:04:30.500 |
infrastructure to summarize errors and correct failures. 00:04:36.400 |
Our hope with it, and what we've started to see as we've rolled it out to larger and larger 00:04:39.740 |
customers and enterprises, too, is that we reduce code review cycles, we enforce quality and consistency, 00:04:49.680 |
It's high signal, it's zero setup, it's actionable with one-click suggestions, and it's customizable. 00:04:54.460 |
It's already being used by some of the fastest-moving companies in the world, it's expanding a lot 00:04:59.180 |
more than we can even say publicly, and I hope that you all will embrace the idea that AI can 00:05:04.460 |
change your entire developer workflow, not just your IDE. 00:05:08.920 |
By the numbers, we see comments that our AI bot leaves to be downvoted at less than a 4% rate, 00:05:14.780 |
and to be accepted, meaning integrated into the pull request that they were left on, at a 00:05:20.680 |
Human comments are integrated somewhere between 45 and 50% of the time. 00:05:23.760 |
We're watching our Diamond comments be accepted about 52% of the time, and we've spent a lot of time tuning 00:05:28.080 |
That number is actually new as of March for us. 00:05:32.680 |
That's what I have to tell you around Graphite, what I have to tell you around Diamond. 00:05:36.380 |
I hope you give it a shot, and thanks for having me.