back to index

Don’t get one-shotted: Use AI to test, review, merge, and deploy code — Tomas Reimers, Graphite


Whisper Transcript | Transcript Only Page

00:00:00.560 | Hi everyone, my name is Tomas, I'm one of the co-founders of Graphite.
00:00:19.900 | Graphite is an AI code review company.
00:00:22.800 | So, to give some context on sort of where we see the industry right now and where we
00:00:27.560 | see it going, software development currently and has always had two loops.
00:00:32.460 | The inner loop, which is focused on development, and the outer loop that's focused on review.
00:00:37.540 | Developers spend time in the inner loop, they get their code working, they get the feature
00:00:41.120 | the way they want it, and then they go ahead and they move it to the outer loop where it's
00:00:44.440 | tested, reviewed, merged, deployed.
00:00:48.820 | We're seeing the inner loop change right now more than we've ever seen it.
00:00:52.160 | More developers are using AI than ever, I think right here we have some statistics from the
00:00:55.400 | GitHub developer survey.
00:00:57.080 | Nearly every developer surveyed used AI tools both inside and outside of work, and 46% of
00:01:03.860 | code on GitHub is being written by Copilot.
00:01:06.960 | We're seeing more and more code being written by AI.
00:01:10.300 | Here, we have some statistics around how code has changed over time and how some people predict
00:01:15.640 | it will change.
00:01:16.800 | And even if we take a more pessimistic view of that, we still see the way the world's going
00:01:20.620 | is just more and more and more code being written by AI.
00:01:26.940 | The inner loop is changing.
00:01:29.080 | You know?
00:01:30.080 | AI is making developers more productive.
00:01:33.360 | Developers are now producing higher volumes of code.
00:01:35.920 | But that code still needs to be reviewed.
00:01:38.540 | When we first started looking at this, when we first started building Diamond, our AI code
00:01:42.360 | reviewer about a year ago now, what we found was that we had a lot of articles that scared
00:01:47.580 | us a lot.
00:01:48.580 | We were seeing within our own organization a lot of developers adopting AI tools, but we
00:01:52.400 | were also seeing a problem.
00:01:53.820 | AI can hallucinate, it can make mistakes, and almost more scarily, it can make security vulnerabilities.
00:02:00.900 | For us, what we saw was that while the inner loop was getting sped up by AI, the outer loop
00:02:05.360 | was rapidly becoming the bottleneck.
00:02:07.900 | We were seeing tools like Cursor, Windsurf, Copilot, V0, Bolt, all of those, producing larger
00:02:14.200 | volumes of code than we were used to, than we had ever seen before.
00:02:17.500 | But we were also seeing our developers suddenly have to review higher volumes of code, test higher
00:02:21.620 | volumes of code, merge higher volumes of code, and deploy higher volumes of code.
00:02:26.640 | That's what brought us to say, there has to be a new outer loop here.
00:02:30.740 | The way that things are going, this isn't going to work, we're going to break down, we're watching
00:02:35.700 | the problems that used to only ail large companies start to ail all companies, where we were seeing
00:02:40.540 | companies deal with higher and higher and higher volumes of code.
00:02:44.700 | The requirements for the new outer loop then look a lot like the problems that larger companies
00:02:48.740 | have always had to deal with.
00:02:50.380 | You need tools to better prioritize, track, and get notified about pull requests.
00:02:54.260 | You need driver assist features to help reviewers focus and streamline the code review process.
00:02:58.700 | You need optimized CI pipelines and merge queues to be able to handle the sheer volume of
00:03:02.100 | code changes that are now happening.
00:03:03.660 | And you need better deployment tools.
00:03:09.040 | When we first started looking at this through sort of an AI-first lens, we started to see
00:03:14.380 | that, well, the problems are being created by AI, they can also probably be solved by AI.
00:03:19.220 | We can probably start to streamline a lot of these processes which have previously had been manual.
00:03:23.600 | Previously were parts of the process that developers did not enjoy, did not want to do.
00:03:28.100 | We wanted to see self-driving code review solutions where we no longer had to do those very manual
00:03:33.180 | and painful parts of review, but we could actually start to really focus on what matters most
00:03:36.860 | to the developers, making sure that your product can get out to users and that the features work
00:03:40.540 | as expected.
00:03:41.440 | We were seeing that AI-generated feedback wasn't perfect.
00:03:45.400 | And because of that, we were starting to think that bots weren't enough.
00:03:47.800 | I think an early vision of ours was, well, can we solve this by just adding AI teammates?
00:03:52.880 | Right?
00:03:52.880 | Maybe it's background agents.
00:03:53.560 | Maybe it's reviewers.
00:03:54.560 | Maybe it's a whole lot of teammates to the workflow.
00:03:56.560 | And while we think that's part of the story, we don't think that's enough.
00:04:00.640 | We think that, as we built with Diamond, that your entire tool chain has to be AI-native,
00:04:05.880 | not just your IDE.
00:04:07.380 | If you really are going to embrace AI in the age of development, if you're going to accept
00:04:10.440 | the fact that developers are going to be orders of magnitude more productive than they ever
00:04:13.540 | have before, you need tooling that reflects that.
00:04:18.120 | We started by building Diamond, so the winning AI code review platform, with high signal, low
00:04:22.580 | noise, as a deep understanding of the code base and change history.
00:04:26.340 | We summarize, prioritize, and review each change, and we integrate with your CI and your testing
00:04:30.500 | infrastructure to summarize errors and correct failures.
00:04:36.400 | Our hope with it, and what we've started to see as we've rolled it out to larger and larger
00:04:39.740 | customers and enterprises, too, is we reduce code review cycles, we enforce quality and consistency,
00:04:46.540 | and we keep your code private and secure.
00:04:49.680 | It's high signal, it's zero setup, it's actionable with one-click suggestions, and it's customizable.
00:04:54.460 | It's already being used by some of the fastest-moving companies in the world, it's expanding a lot
00:04:59.180 | more than we can even say publicly, and I hope that you all will embrace the idea that AI can
00:05:04.460 | change your entire developer workflow, not just your IDE.
00:05:08.920 | By the numbers, we see comments that our AI bot leaves to be downloaded at less than a 4% rate,
00:05:14.780 | and to be accepted, meaning integrated into the pull request that they were left on, at a
00:05:18.820 | higher rate than human comments are.
00:05:20.680 | Human comments are integrated about somewhere between 45 and 50%.
00:05:23.760 | We're watching our Diamond comments be accepted about 52%, and we've spent a lot of time tuning
00:05:27.080 | that.
00:05:28.080 | That number is actually new as of March for us.
00:05:32.680 | That's what I have to tell you around Graphite, what I have to tell you around Diamond.
00:05:36.380 | I hope you give it a shot, and thanks for having me.
00:05:38.480 | We'll see you next time.
00:05:39.960 | We'll see you next time.