Unlocking AI Powered DevOps Within Your Organization — Jon Peck, GitHub

00:00:00.000 |
Well, my name is Jon Peck. I am a developer advocate for GitHub. I've been a software developer since the late 90s, and I've been in developer advocacy for about the last 10 years. So it's been a fun ride. 00:00:29.000 |
AI is of course all the new hotness. I want to talk to you all today about really how you want to approach learning AI inside your organization. A lot of us are at that state where even some of us old fogies are coming around and accepting that there's a few LLMs that can do things maybe better or at least faster than us. But there's also a lot of confusion about how exactly we start approaching this and what kind of motivations 00:00:58.000 |
we can give our teams and what approaches will make sure it's efficient. These are what I'd say are the kind of best-case scenarios you're looking for. 00:01:06.000 |
I'd like to focus on this third column here, efficiency, because we know that LLMs can pump out huge amounts of code. So what? The question is, is that actually getting you more feature points? What we're seeing in best-case scenarios is up to 1.5x in how many feature points you can push out in the same period of time. I'm seeing averages more around 30% in most companies. I'm also seeing a lot of improvement in the rate of successful builds. 00:01:33.000 |
Basically, that's: I've got an existing DevOps pipeline, I've got existing tests and that sort of thing, and when people are using AI, those tests are clearing much more often. Basically, we're hitting the problems early on. I'm also seeing general increases in developer happiness, which is really buoying to see. Also, quick note: you'll see in the lower right there's a URL, gh.io/fair/unlock. These slides are all available up there. Please feel free to take pictures, but I'll keep that URL up here and there so you can follow along and revisit whatever you want. 00:01:51.000 |
All right. So, first off, how do we ensure that we're actually going to use AI efficiently? A lot of people's first approach is: go out, find some chat tool, throw some things into it, get responses, copy and paste them into your IDE. Terribly slow. Hard to specify context. Hard to 00:02:07.000 |
iterate in that model. Also kind of exposed, relatively insecure, right? So, step one, kind of obvious, but let's just say it: make sure you're using an IDE that has your AI incorporated directly. Hopefully one that's flexible enough to offer 00:02:19.000 |
drag-and-drop options for pulling in just the right context and helping it narrow down, so you're not just doing everything and looking at all the files all the time. 00:02:35.000 |
I'll get into more specifics about what we look for in an IDE. 00:02:41.000 |
In this particular case, I'll be spending a decent amount of time using VS Code, which is my primary, with GitHub Copilot, because that's, of course, my primary being an employee. 00:02:58.000 |
The other thing is that you want to make sure that you're still ensuring model choice. We may get to a point where orchestrators are good enough to always pick the right model at the right time. We're not there yet. 00:03:09.000 |
The advantage of using a combined tool that has an all-in subscription kind of thing is the ability to go and pick the right model at the right time. 00:03:19.000 |
So if I'm doing something really fast and I just need quick, low-cost responses, I'm using a flash model. If I'm doing deep research and trying to explore a problem space before I actually begin coding, 00:03:29.000 |
I'm pulling out a reasoning model. There's lots of different cases where you'd use these different things. All right. Now, when you're ready to jump in, 00:03:37.000 |
start with the developer as the operator, interacting with the IDE. And the temptation here in a lot of cases is to start greenfield, because it's so cool and fun. 00:03:47.000 |
Hey, I hop into an AI. I say, hey, I want to build Minecraft, except I want it to be using the Doom engine, and I want it to run on my dishwasher. 00:03:58.000 |
Okay, cool. So I can do that, but it's not really teaching me a lot about how to interact with the AI, and it's also not what I'm doing on a daily basis. 00:04:07.000 |
Most of the time, we're really working brownfield. We're working on modifying an existing application with a decent amount of code already in it. 00:04:17.000 |
So it's important for us to get the patterns we need as we start doing that work. So I can jump into something. I can say, hey, I want to do something like I want to swap out my ORM 00:04:27.000 |
from this one version to another, or I want to create tests for the things that I've got already, and I want to use this data source to do it, and that sort of thing. 00:04:35.000 |
And that way we start to learn effectively the right pairings of how we create those prompts and how we shape the context that we're handing to the model so it's doing the right work at the right time. 00:04:47.000 |
So really important to get people invested in that direct daily interaction and how to create and shape those prompts and find the right places to work. 00:04:57.000 |
All right. Then you might want to go back and you might want to play around with greenfield development now that you have those practices and say, okay, how fast can I create something new? 00:05:07.000 |
A big step in improvement with that and also with the brownfield work is the existence now of agent mode, right? 00:05:15.000 |
So Copilot agent mode; others have this sort of thing, too. And what you'll find with agent mode, if you're working in a greenfield context, 00:05:23.000 |
you can use it to sort of iterate on what the initial scoping should be, get it to help you create your prompt before you actually send the full prompt. 00:05:32.000 |
The way that I do this is I'll open up a README markdown file or something like that, and then I'll say to the LLM: hey, here's the kind of application I want. 00:05:40.000 |
What are the kinds of specs I should create? It'll spit something out, which is usually nonsense. I'll refine it. I'll be like, no, I really want you to use these technologies. 00:05:47.000 |
I really want it to have this particular file structure in the tree. Here's an example of what an API call should look like, et cetera. 00:05:53.000 |
And when I'm then at a point where I have a well-scoped document, something I'd feel good handing to another engineer, then I can hand that to the actual model to do my build. 00:06:04.000 |
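To make that concrete, here is a minimal sketch of the kind of spec document I might iterate toward with the model; the project, stack, and endpoint are invented placeholders, not anything from a real repo:

```markdown
# Spec: Inventory Tracker (hypothetical example)

## Stack
- Backend: Python 3.12 + Flask, SQLAlchemy ORM
- Database: PostgreSQL 16
- Tests: pytest

## File structure
- app/          Flask app factory and routes
- app/models/   SQLAlchemy models
- tests/        pytest suites, one file per route

## Example API call
GET /api/items/42  ->  200 {"id": 42, "name": "widget", "qty": 7}

## Constraints
- No new dependencies beyond requirements.txt
- Every route must ship with at least one test
```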
In brownfield mode, the advantage of agent mode is that it's going to be able to take a large amount of context and narrow it down so I'm not overloading my prompt, so I'm not putting too much into it. 00:06:18.000 |
That has the benefit of it working correctly, because it's narrowly scoped enough that it'll actually fit inside the call's context window. 00:06:25.000 |
It's going to have the second benefit of not overrunning: if I tell it to focus only on these files and these folders, then it's not going to try and do a whole bunch of other stuff I don't want it to do. 00:06:36.000 |
So in that case, I'll do something like I was talking about with migrating the ORM. I'll pull in just the models folder. I'll pull in just the backend Python Flask app or whatever it is. 00:06:47.000 |
I might pull in a few settings files, and then I'll tell it to go ahead and do that task. 00:06:51.000 |
The first thing I should see my agent do in that case is it'll look at those and it'll say, hey, I'm further narrowing the context down. 00:06:58.000 |
Here's my plan for the kinds of actions I'm going to take, and then it's going to go and start building out each separate section of the plan. 00:07:07.000 |
I can pause it. I can give it new instructions and tell it continue from there if it's going down the wrong path. 00:07:12.000 |
Or this is something we need to get comfortable with as developers. If it's just made 500 changes but I think it's working completely the wrong way, immediately just revert all of those and try again. 00:07:23.000 |
There's not as much cost to blowing away huge amounts of code as there used to be, so I should be comfortable with just erasing and starting over. 00:07:30.000 |
Obviously, make sure you're doing your commits in between at each set point so you have a good Git history of things that actually worked. 00:07:38.000 |
On that note, I'll talk about Copilot instructions in a minute. But also, when I'm doing my prompts, if I'm just working that way, telling it to keep a change log of what it's been doing can be really, really handy. 00:07:51.000 |
So I just have a file where I can see all of the edits that it made. 00:07:55.000 |
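As a sketch, a brownfield prompt for that ORM migration, with the change-log instruction baked in, might look something like this; the folder names, file names, and library versions are hypothetical:

```text
Context: models/ folder, app.py, settings.py (dragged into the chat)

Migrate this Flask app's ORM usage from the SQLAlchemy 1.4-style Query API
to the 2.0-style select() API. Only touch files under models/ and app.py.
Do not change any database schemas or add dependencies.
Keep a running change log in CHANGELOG-AI.md: one line per file edited,
with a short description of the edit.
Show me your plan before making any edits.
```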
Okay, now how do we get into team mode? How do we start taking these practices and sharing them with our colleagues and making sure we're working together well as a team? 00:08:04.000 |
So first off, team practices and docs don't go away, but they get better, right? 00:08:11.000 |
Yes, I'm always going to have my standards, I'm always going to have everything from what's my committed dependency file, all the way to my manuals and best practices and playbooks of how we do things. 00:08:20.000 |
But with the addition of copilot instructions files, I can now codify this. 00:08:25.000 |
So when the AI is working with me, it's also obeying those best practices. 00:08:30.000 |
In GitHub, the instantiation of that is a file called copilot-instructions.md inside the .github folder in your repo. 00:08:39.000 |
Use that locally, but also commit that file up into your repo. 00:08:44.000 |
And then it's part of your job as, say, a lead developer or a team manager to make sure that thing continues to be revised properly. 00:08:54.000 |
So that when the team makes a decision, hey, we're going to go with this linting practice. 00:08:58.000 |
We're going to go with this way of doing our model accessors. 00:09:01.000 |
We're going to have a different kind of pattern, or we're always going to check for this specific problem. 00:09:04.000 |
It's my job to make sure that ends up in the copilot-instructions file. 00:09:09.000 |
So as they're working with AI, it's keeping all those same best practices in mind. 00:09:14.000 |
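As a minimal sketch, assuming a Python shop, a copilot-instructions.md codifying decisions like those might look like this; the specific rules are invented examples, not recommendations:

```markdown
# .github/copilot-instructions.md (hypothetical example)

- Python code follows ruff's default lint rules; max line length 100.
- Database access goes through the accessor classes in app/models/accessors.py;
  never query the session directly from route handlers.
- Every new route needs a pytest test in tests/ before the PR is opened.
- Always sanitize user-supplied filenames with werkzeug.utils.secure_filename.
```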
Second, find a way to make your institutional knowledge accessible to the AI. 00:09:19.000 |
In a limited sense, that can be things like providing API specifications, existing patterns, and ways of working to the context as you work. 00:09:28.000 |
If you're using Copilot Enterprise, you have the additional ability to create what are called knowledge bases inside your GitHub org. 00:09:37.000 |
These are basically collections of repositories, each with special names. 00:09:40.000 |
So I might have one or two or three repositories that are my best examples of how to write an accessible front-end, 00:09:50.000 |
I might have another, which is our best practices for using Python in certain specific machine learning cases, 00:09:57.000 |
and yet another that defines all of our internal APIs, which the LLM wouldn't know about automatically unless I provided that context. 00:10:05.000 |
I can create these knowledge bases as named items, and then when I, as an operator, am working on a piece of code, 00:10:11.000 |
I can say add the knowledge base for accessibility, add the knowledge base for my internal finance APIs or whatever it is. 00:10:18.000 |
And now the LLM is informed as I'm working and will obey those best patterns. 00:10:25.000 |
Also, this is a subtle thing, it's a human thing, but make it okay to be wrong. 00:10:31.000 |
Make it okay to ask questions and admit your failures and bring them to the team. 00:10:40.000 |
And if we try to be perfect and always wait until we've got everything right to push it out, 00:10:45.000 |
we're not learning and we're not helping the team grow. 00:10:47.000 |
If I find a flaw, I find a way that a particular model is working wrong, 00:10:52.000 |
or I'm having a problem I just can't solve in a particular context but I think someone else on my team might have solved, I should bring it to the team and ask. 00:11:00.000 |
And then, after you have solved that problem, take that knowledge that was generated, 00:11:06.000 |
put it into a centralized guide that everybody can access. 00:11:09.000 |
Maybe in the repo, maybe in a centralized team knowledge store. 00:11:12.000 |
So there's a history there of what problems we found, what we solved, 00:11:17.000 |
and other developers can quickly reference those. 00:11:20.000 |
Lastly, just keep an eye on your team and their AI tool use, right? 00:11:25.000 |
In GitHub world, this surfaces as the Copilot metrics APIs, 00:11:29.000 |
but you're going to look for other things as well. 00:11:31.000 |
You know, whether you've got internal tracking on usage, 00:11:34.000 |
whether you're looking at volume of code and the type of code submitted, that sort of thing. 00:11:38.000 |
You want to make sure that people who just haven't happened to come across the right tool 00:11:43.000 |
or the right pattern, or who are feeling a little bit hesitant, 00:11:47.000 |
are not falling behind the rest of your team. 00:11:49.000 |
So it's important to keep track of that so you can help those people who might not have had access. 00:11:56.000 |
Now we can talk a little bit about governance at a high level, right? 00:12:00.000 |
These are further concerns you have inside your organization. 00:12:06.000 |
Make sure whatever tool you're using is actually going through proxies, 00:12:13.000 |
so that what you're sending out is not being reused in other contexts 00:12:18.000 |
or reused for training; get guarantees from that provider. 00:12:21.000 |
GitHub itself, for example, will actually provide an indemnification clause in its enterprise agreements: 00:12:33.000 |
we're not retraining based on your data or anything like that. 00:12:36.000 |
Every prediction is discarded from the memory context once it's made. 00:12:39.000 |
Also make sure that you've got the option to do general high-level governance. 00:12:43.000 |
So for example, we can say at a repository level, 00:12:46.000 |
always exclude these items from the co-pilot context. 00:12:50.000 |
Then it's not accidentally leaking environment secrets and that sort of thing. 00:12:58.000 |
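As a rough sketch, the repository-level content exclusion settings take a list of path patterns along these lines; treat the paths as invented examples and check the current Copilot docs for the exact syntax:

```yaml
# Repository settings -> Copilot -> Content exclusion (paths are invented)
- "/.env"
- "/config/secrets.json"
- "**/credentials/*"
- "*.pem"
```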
And then lastly, make sure that you have org-wide policy governance 00:13:01.000 |
so that every single maintainer of every repository doesn't have a huge overhead, 00:13:06.000 |
so that you can say at an org level, these are the sets of policies we want 00:13:10.000 |
in terms of new edge features, in terms of which models people can access, etc., 00:13:15.000 |
and quickly get your org-wide governance under control. 00:13:23.000 |
The other thing I want to remind folks is that while it's really cool and obvious 00:13:27.000 |
to generate new actual production code inside our IDEs using AIs, 00:13:34.000 |
what we're going to find in many cases is that you get good lift there on the development side, 00:13:39.000 |
but there's a whole set of other use cases you want to be considering 00:13:43.000 |
where you're actually going to get more lift early on, right? 00:13:48.000 |
In the planning phase, I can work directly with an AI 00:13:52.000 |
to figure out what my large-scale plan is for a feature or for a new product. 00:13:57.000 |
That includes brainstorming with it about what technologies are best, 00:14:01.000 |
what are the latest cool things, what kind of infrastructure might I consider, etc. 00:14:06.000 |
So bring that in so you know you're going down the right track early. 00:14:09.000 |
On the security side, we can use these tools to quickly go through large existing code bases and say, 00:14:15.000 |
what are you seeing for potentially bad dependencies? 00:14:20.000 |
Is there something like cross-site scripting that exists inside this product? 00:14:28.000 |
Testing, I find, is probably the number two speed gain. 00:14:34.000 |
For a lot of us, we have a huge backlog, 00:14:37.000 |
and one of those things that gets dropped is: do you have full coverage? 00:14:43.000 |
I can quickly and easily go into an existing code base and say, hey, 00:14:47.000 |
I want you to generate tests for these modules, drag them in. 00:14:51.000 |
And I want you to use this: pull in some JSON samples, 00:14:55.000 |
or use MCP to connect to your database, as data examples for how to test that. 00:15:00.000 |
And then I want you to build me a bunch of unit tests using these frameworks. 00:15:04.000 |
And boom, usually in about 10% of the time it would normally take me to build all those tests, 00:15:09.000 |
I've got at least a functioning skeleton that I can update a little bit. 00:15:13.000 |
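The kind of skeleton you get back tends to look like this; a hypothetical sketch with invented module, route, and fixture names, assuming a Flask app-factory pattern and pytest:

```python
# tests/test_items_api.py  (hypothetical generated skeleton)
import pytest

from app import create_app  # assumption: the project uses a Flask app factory


@pytest.fixture
def client():
    # Build the app in testing mode and yield a test client for each test.
    app = create_app(testing=True)
    with app.test_client() as client:
        yield client


def test_get_item_returns_200(client):
    # Happy path: item 42 exists in the sample data pulled in as context.
    resp = client.get("/api/items/42")
    assert resp.status_code == 200
    assert resp.get_json()["id"] == 42


def test_get_missing_item_returns_404(client):
    # Edge case: an id that is not in the sample data.
    resp = client.get("/api/items/999999")
    assert resp.status_code == 404
```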
Deployment, most of these models are actually pretty good at infrastructure. 00:15:18.000 |
So when we get to that point where our application is ready to deploy, 00:15:21.000 |
I can just say, hey, how do I put this on whatever? 00:15:24.000 |
Generate a GitHub Actions file, generate a Terraform file, et cetera, and move those out. 00:15:29.000 |
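For instance, asked to generate a GitHub Actions file for a Python app, a model will typically produce something along these lines; a plausible sketch, not a file from the talk:

```yaml
# .github/workflows/ci.yml (hypothetical generated example)
name: CI
on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      - run: pytest
```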
So now we're moving even beyond the basic developer scope. 00:15:31.000 |
You're thinking about, I hope, other roles in your organization that you can bring in 00:15:36.000 |
and give them access to this power, people who won't necessarily be thinking day-to-day about the application. 00:15:42.000 |
And then I find that the number one speed gain and actually additional role 00:15:47.000 |
that I want to include is documenters, right? 00:15:51.000 |
Because we all get to that point, hey, I built the code. 00:15:56.000 |
And I don't really care if the end users know how to use it. 00:16:01.000 |
Just going in and spending that 15 minutes of, hey, document what this application does. 00:16:11.000 |
Give me examples of how I would actually interact with this. 00:16:15.000 |
If I'm using a vision-capable model, give me some graphs, or if not, Mermaid charts, 00:16:20.000 |
about what the actual workflows are inside the application. 00:16:24.000 |
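For example, a workflow description might come back as a Mermaid chart like this sketch; the steps are invented:

```mermaid
flowchart LR
    A[User submits form] --> B{Input valid?}
    B -- yes --> C[Save item via ORM]
    B -- no --> D[Return 400 with errors]
    C --> E[Queue notification]
```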
Those are of huge benefit, possibly directly to your end developers, 00:16:28.000 |
but at the very least the people curating the documentation inside your company. 00:16:36.000 |
I've got four minutes left, so this will continue to be quick. 00:16:43.000 |
So everything I've talked about so far, semi-autonomous, right? 00:16:46.000 |
Even when I'm working with an agent in the IDE, it's doing a bunch of work, but it's pausing every now and then, 00:16:51.000 |
saying, am I allowed to access the terminal to do this? 00:16:57.000 |
But we're getting to the era now where there are some things we can move into fully autonomous modes, right? 00:17:08.000 |
You want to use automation where it makes sense, but you don't want to drop it in everywhere blindly. 00:17:12.000 |
And you want to make sure that there's always a human in the loop in some way. 00:17:16.000 |
As an example, in GitHub world, this means that when we're using autonomous agents, 00:17:21.000 |
they're always creating a new branch to work on. 00:17:24.000 |
And its execution is in a protected environment. 00:17:26.000 |
So that it's not going to mess up your main branch of code, 00:17:29.000 |
and it's not going to leak things and destroy existing environments. 00:17:32.000 |
It's going to work in a box until I say it's ready. 00:17:45.000 |
Dead simplest example: auto-generating PR descriptions. 00:17:51.000 |
I want to provide a good description to the people downstream. 00:17:56.000 |
I can just say, hey, Copilot, create a summary for this. 00:17:59.000 |
And it's going to give me a high level and then key file details. 00:18:07.000 |
So about six months ago, we added the ability for Copilot code review. 00:18:11.000 |
I can go into an existing pull request, no matter how it was created. 00:18:15.000 |
I can assign Copilot, set them as an assignee on that pull request the same way that I would set a human as an assignee. 00:18:23.000 |
And Copilot asynchronously is going to come along, take a look at that pull request, and it's going to start generating comments and code suggestions. 00:18:34.000 |
They'll show up, and this is our major philosophy, the same way anything a human would do. 00:18:38.000 |
We're still using that same skeleton of all the existing GitHub methodology. 00:18:42.000 |
So if another human comes along and they make a code suggestion on your PR, it shows up as something you can accept or reject. 00:18:50.000 |
It'll say, hey, I don't think that this particular function is as efficient as it could be. 00:18:57.000 |
And you get, as an operator, the choice of whether or not to do it. 00:19:00.000 |
But the benefit is it can do all of that while you're working on something else, and you just come back half an hour later and look at all its comments. 00:19:07.000 |
Kind of the same way you work asynchronously with another developer. 00:19:12.000 |
The other thing I would point you to is this new ability to say: all right, assign an issue directly to Copilot. 00:19:20.000 |
Now, this sounds dangerous, but again, everything happens in isolation on its own branch, and you want to be judicious. 00:19:26.000 |
You know, there's only so many giant, high-scoped issues that you'd necessarily want to assign to something totally autonomous. 00:19:34.000 |
But when you are doing something like, hey, there's not enough documentation. 00:19:40.000 |
Hey, I want to add this one function that does X, Y, Z. 00:19:43.000 |
Write a well-shaped issue the same way you would write a well-shaped prompt. 00:19:54.000 |
And then take that issue and try assigning it to Copilot. 00:20:05.000 |
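Here is a hedged sketch of what a well-shaped issue for Copilot might look like; the feature, paths, and criteria are all invented:

```markdown
Title: Add /api/items/{id}/history endpoint

## What
Add a GET endpoint returning the change history for a single item.

## Where
- Route: app/routes/items.py
- Model: app/models/item_history.py (already exists)

## Acceptance criteria
- Returns 200 with a JSON list ordered newest-first
- Returns 404 for unknown ids
- pytest coverage for both cases
- No new dependencies
```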
It's going to go off, work asynchronously on its own branch, and then 00:20:05.000 |
provide you with what it thinks is a completed PR that's ready to go. 00:20:10.000 |
And then you get to take a look at that and say either: this was terrible, start over. 00:20:23.000 |
Or leave review comments, and it'll make some more modifications on the branch and present you with that set of changes. 00:20:28.000 |
Or: this looks good. In which case, boom, run my standard CI/CD pipeline. 00:20:38.000 |
So I'll leave you with this one last thing, which is MCP servers, right? 00:20:45.000 |
One place to add them is in VS Code itself, where you can go in and add an MCP server. 00:20:50.000 |
Go to that URL, github.com/modelcontextprotocol/servers. 00:20:55.000 |
And pick ones that are applicable to your workflow. 00:21:00.000 |
An obvious one I'd point out is GitHub itself has an MCP server. 00:21:03.000 |
So now you can say things to your agent after you've added it. 00:21:06.000 |
Like, hey, okay, I love all the work you just did. 00:21:11.000 |
Go ahead and commit it with a sensible commit message. 00:21:21.000 |
You don't have to flip context and come back out to GitHub and work through the web UI. 00:21:25.000 |
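In VS Code, that configuration lives in a .vscode/mcp.json file. A minimal sketch, assuming the remote GitHub MCP server endpoint that was current when I last checked; verify the URL and schema against the repo above:

```json
{
  "servers": {
    "github": {
      "type": "http",
      "url": "https://api.githubcopilot.com/mcp/"
    }
  }
}
```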
The other thing is that this is also now available, for those of you using Copilot Enterprise, to use with the other agentic interactions I just talked about. 00:21:34.000 |
So you can actually configure your MCP at the repository level. 00:21:39.000 |
And then when you're doing something like assigning Copilot to an issue, it can take advantage of those MCP connections right there. 00:21:46.000 |
And it can reach out to other systems as it's trying to accomplish its work. 00:21:50.000 |
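The exact schema for that repository-level configuration is GitHub's to define and may change, so treat this as an assumption and check the docs; roughly, it's JSON naming the servers and the tools the agent may use, with an invented server here:

```json
{
  "mcpServers": {
    "internal-docs": {
      "type": "http",
      "url": "https://mcp.example.com/",
      "tools": ["search_docs"]
    }
  }
}
```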
I'll leave you with this last slide, which is, again, there on that URL: gh.io/fair/unlock. 00:21:57.000 |
Just a bunch of convenience links for training, cheat sheets, that sort of thing. 00:22:03.000 |
And I'll be out in the hallway if anyone wants to chat more later. 00:22:05.000 |