Don’t get one-shotted: Use AI to test, review, merge, and deploy code

00:00:00.560 | Hi everyone, my name is Tomas, I'm one of the co-founders of Graphite.

00:00:19.900 | Graphite is an AI code review company.

00:00:22.800 | So, to give some context on sort of where we see the industry right now and where we

00:00:27.560 | see it going, software development currently has, and has always had, two loops.

00:00:32.460 | The inner loop, which is focused on development, and the outer loop that's focused on review.

00:00:37.540 | Developers spend time in the inner loop, they get their code working, they get the feature

00:00:41.120 | the way they want it, and then they go ahead and they move it to the outer loop where it's

00:00:44.440 | tested, reviewed, merged, deployed.

00:00:48.820 | We're seeing the inner loop change right now more than we've ever seen it.

00:00:52.160 | More developers are using AI than ever. I think right here we have some statistics from the

00:00:55.400 | GitHub developer survey.

00:00:57.080 | Nearly every developer surveyed used AI tools both inside and outside of work, and 46% of

00:01:03.860 | code on GitHub is being written by Copilot.

00:01:06.960 | We're seeing more and more code being written by AI.

00:01:10.300 | Here, we have some statistics around how code has changed over time and how some people predict

00:01:15.640 | it will change.

00:01:16.800 | And even if we take a more pessimistic view of that, we still see the way the world's going

00:01:20.620 | is just more and more and more code being written by AI.

00:01:26.940 | The inner loop is changing.

00:01:29.080 | You know?

00:01:30.080 | AI is making developers more productive.

00:01:33.360 | Developers are now producing higher volumes of code.

00:01:35.920 | But that code still needs to be reviewed.

00:01:38.540 | When we first started looking at this, when we first started building Diamond, our AI code

00:01:42.360 | reviewer about a year ago now, what we found was that we had a lot of articles that scared

00:01:47.580 | us a lot.

00:01:48.580 | We were seeing within our own organization a lot of developers adopting AI tools, but we

00:01:52.400 | were also seeing a problem.

00:01:53.820 | AI can hallucinate, it can make mistakes, and almost more scarily, it can make security vulnerabilities.

00:02:00.900 | For us, what we saw was that while the inner loop was getting sped up by AI, the outer loop

00:02:05.360 | was rapidly becoming the bottleneck.

00:02:07.900 | We were seeing tools like Cursor, Windsurf, Copilot, V0, Bolt, all of those, producing larger

00:02:14.200 | volumes of code than we were used to, than we had ever seen before.

00:02:17.500 | But we were also seeing our developers suddenly have to review higher volumes of code, test higher

00:02:21.620 | volumes of code, merge higher volumes of code, and deploy higher volumes of code.

00:02:26.640 | That's what brought us to say, there has to be a new outer loop here.

00:02:30.740 | The way that things are going, this isn't going to work; we're going to break down. We're watching

00:02:35.700 | the problems that used to only ail large companies start to ail all companies, where we were seeing

00:02:40.540 | companies deal with higher and higher and higher volumes of code.

00:02:44.700 | The requirements for the new outer loop then look a lot like the problems that larger companies

00:02:48.740 | have always had to deal with.

00:02:50.380 | You need tools to better prioritize, track, and get notified about pull requests.

00:02:54.260 | You need driver assist features to help reviewers focus and streamline the code review process.

00:02:58.700 | You need optimized CI pipelines and merge queues to be able to handle the sheer volume of

00:03:02.100 | code changes that are now happening.

00:03:03.660 | And you need better deployment tools.

00:03:09.040 | When we first started looking at this through sort of an AI-first lens, we started to see

00:03:14.380 | that, well, the problems are being created by AI, they can also probably be solved by AI.

00:03:19.220 | We can probably start to streamline a lot of these processes which had previously been manual.

00:03:23.600 | Previously, these were parts of the process that developers did not enjoy, did not want to do.

00:03:28.100 | We wanted to see self-driving code review solutions where we no longer had to do those very manual

00:03:33.180 | and painful parts of review, but we could actually start to really focus on what matters most

00:03:36.860 | to the developers, making sure that your product can get out to users and that the features work

00:03:40.540 | as expected.

00:03:41.440 | We were seeing that AI-generated feedback wasn't perfect.

00:03:45.400 | And because of that, we were starting to think that bots weren't enough.

00:03:47.800 | I think an early vision of ours was, well, can we solve this by just adding AI teammates?

00:03:52.880 | Right?

00:03:52.880 | Maybe it's background agents.

00:03:53.560 | Maybe it's reviewers.

00:03:54.560 | Maybe it's a whole lot of teammates to the workflow.

00:03:56.560 | And while we think that's part of the story, we don't think that's enough.

00:04:00.640 | We think that, as we built with Diamond, that your entire tool chain has to be AI-native,

00:04:05.880 | not just your IDE.

00:04:07.380 | If you really are going to embrace AI in the age of development, if you're going to accept

00:04:10.440 | the fact that developers are going to be orders of magnitude more productive than they ever

00:04:13.540 | have before, you need tooling that reflects that.

00:04:18.120 | We started by building Diamond, so the winning AI code review platform, with high signal, low

00:04:22.580 | noise, and a deep understanding of the code base and change history.

00:04:26.340 | We summarize, prioritize, and review each change, and we integrate with your CI and your testing

00:04:30.500 | infrastructure to summarize errors and correct failures.

00:04:36.400 | Our hope with it, and what we've started to see as we've rolled it out to larger and larger

00:04:39.740 | customers and enterprises, too, is we reduce code review cycles, we enforce quality and consistency,

00:04:46.540 | and we keep your code private and secure.

00:04:49.680 | It's high signal, it's zero setup, it's actionable with one-click suggestions, and it's customizable.

00:04:54.460 | It's already being used by some of the fastest-moving companies in the world, it's expanding a lot

00:04:59.180 | more than we can even say publicly, and I hope that you all will embrace the idea that AI can

00:05:04.460 | change your entire developer workflow, not just your IDE.

00:05:08.920 | By the numbers, we see comments that our AI bot leaves to be downvoted at less than a 4% rate,

00:05:14.780 | and to be accepted, meaning integrated into the pull request that they were left on, at a

00:05:18.820 | higher rate than human comments are.

00:05:20.680 | Human comments are integrated somewhere between 45 and 50% of the time.

00:05:23.760 | We're watching our Diamond comments be accepted about 52% of the time, and we've spent a lot of time tuning

00:05:27.080 | that.

00:05:28.080 | That number is actually new as of March for us.
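
The two metrics mentioned here, the downvote rate and the acceptance rate of review comments, can be sketched as simple ratios. This is only an illustration under stated assumptions: the `ReviewComment` record, its field names, and the `"ai"` / `"human"` labels are hypothetical and are not Graphite's or Diamond's actual data model. "Accepted" follows the talk's definition: the comment was integrated into the pull request it was left on.

```python
from dataclasses import dataclass


@dataclass
class ReviewComment:
    # Hypothetical record; field names are assumptions for illustration only.
    author_kind: str  # "ai" or "human"
    downvoted: bool   # a developer downvoted the comment
    accepted: bool    # the comment was integrated into the pull request


def rate(comments, predicate):
    """Fraction of comments for which predicate is true (0.0 if there are none)."""
    if not comments:
        return 0.0
    return sum(1 for c in comments if predicate(c)) / len(comments)


def summarize(comments, author_kind):
    """Downvote and acceptance rates for one kind of comment author."""
    subset = [c for c in comments if c.author_kind == author_kind]
    return {
        "downvote_rate": rate(subset, lambda c: c.downvoted),
        "acceptance_rate": rate(subset, lambda c: c.accepted),
    }
```

Calling `summarize(comments, "ai")` and `summarize(comments, "human")` on the same set of pull requests would give the kind of side-by-side comparison described in the talk.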

00:05:32.680 | That's what I have to tell you around Graphite, what I have to tell you around Diamond.

00:05:36.380 | I hope you give it a shot, and thanks for having me.

00:05:38.480 | We'll see you next time.

00:05:43.960 | Bye.

Don’t get one-shotted: Use AI to test, review, merge, and deploy code — Tomas Reimers, Graphite