Your Coding Agent Just Got Cloned And Your Brain Isn't Ready - Rustin Banks, Google Jules

00:00:18.680 |
and really thrilled to be here and get to speak to you today. 00:00:31.600 |
It came in the mail on ten 5 1/4-inch floppy disks. 00:00:41.800 |
hosted out of my parents' closet and salvaged computers. 00:00:44.760 |
And I just think it's ironic that when I saw AI come out, 00:00:48.360 |
I recognized the text-based interfaces perfectly 00:01:07.460 |
And this used to be state-of-the-art only two years ago. 00:01:15.720 |
And Jules is an asynchronous coding agent meant to run in the background 00:01:19.800 |
and do all those tasks that you don't want to do in parallel in the background. 00:01:26.520 |
And we launched this just two weeks ago at I/O to everyone, everywhere, all at once, for free, 00:01:37.020 |
while Josh was up on the stage trying to demo other Google Labs products. 00:01:42.580 |
And so he called us and we said, "Oh, we've got to shut it down so that we can demo other products." 00:01:51.300 |
And the best part about it is to see these use cases where this is what we really want to solve. 00:01:58.040 |
We want to do the laundry, so to say, so that you can focus on the art of coding. 00:02:03.300 |
So the next time Firebase updates their SDK, Jules can do that for you. 00:02:07.520 |
Or if you just want to develop from your phone, Jules can do that for you. 00:02:11.280 |
So in the last two weeks, we've had 40,000 public commits. 00:02:14.820 |
And we're super excited about what we can bring to the open source world. 00:02:19.500 |
So, but as developers, we're meant to think serially. 00:02:24.020 |
We take a task from the queue, we work on it, we go on to the next one. 00:02:30.060 |
Today, we'll learn about how to maximize parallel agents. 00:02:33.880 |
I'll try a real-world demo and we'll go through a real-world use case. 00:02:38.240 |
And then I'll go through some best practices we've learned from watching people use Jules. 00:02:44.120 |
So for this parallel process really to work well, we need to get better with AI at the beginning 00:02:53.140 |
Meaning, if it's now on me to just write a bunch of tasks all day, that's not fun. 00:02:58.140 |
And if I'm reviewing PRs and handling merge messes at the end of the day, that's not going to work well either. 00:03:09.780 |
So, for example, AI can easily work through backlogs and bug reports to create tasks for you, with you. And then at the end of the SDLC, help is on the way: we can use critic agents and merging agents that bring everything together, so that this parallel workflow that we've envisioned can really come together and not drive us crazy. 00:03:36.780 |
Agents inside of our IDE are always going to be limited by our laptop. 00:03:40.780 |
And when you have these remote agents in the cloud, essentially agents as a service, they're infinitely scalable, they're always connected, and then you can develop from anywhere from any device. 00:03:51.780 |
So, we've seen two types of parallelism emerging. 00:03:54.600 |
This is the type that we expected, which is multitasking, oh, I have 10 different things on my backlog, let's do them all at once, and then we'll merge them together and test them. 00:04:06.780 |
Interestingly, you saw an example of the second type this morning with Solomon from Dagger showing how he wanted three different views of his website at the same time. 00:04:17.600 |
This was the emergent behavior we didn't expect, which is multiple variations. 00:04:21.780 |
Essentially, we see users taking a task, especially if it's a complex task, and saying, try it this way, try it that way, or give me this variation to look at, or multiple variations to look at. 00:04:37.780 |
And we can have the agents test and choose the best ones, or the user can test and choose. 00:04:43.780 |
So, for example, we see lots of people who are working on a front-end task in a React app, and they're saying, I'm adding drag and drop. 00:04:54.780 |
Maybe try it using this library, react-beautiful-dnd, or maybe use dnd kit, or maybe try it test-first. 00:05:04.780 |
And in this parallel asynchronous environment, you can just spin up multiple agents at the same time. 00:05:10.780 |
They can try it, they can easily come back together, choose the best one, and you're off to the races. 00:05:24.780 |
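The "spin up multiple agents at the same time" idea can be sketched as a simple fan-out. This is only an illustration, not how Jules works internally: `run_agent` is a hypothetical stand-in for dispatching a remote agent, and the branch names it returns are made up so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor


def run_agent(task: str, approach: str) -> str:
    """Hypothetical stand-in for sending one task to one remote agent.

    A real version would call an agent service and wait for its result.
    Here it just returns a made-up branch name for the attempt.
    """
    slug = approach.lower().replace(" ", "-")
    return f"variation/{slug}"


def try_variations(task: str, approaches: list[str]) -> dict[str, str]:
    """Run one agent per approach concurrently; map approach -> branch."""
    with ThreadPoolExecutor(max_workers=len(approaches)) as pool:
        # pool.map preserves the order of `approaches`.
        branches = pool.map(lambda a: run_agent(task, a), approaches)
        return dict(zip(approaches, branches))


if __name__ == "__main__":
    results = try_variations(
        "Add drag and drop to the session list",
        ["react-beautiful-dnd", "dnd kit", "test first"],
    )
    for approach, branch in results.items():
        print(f"{approach}: {branch}")
```

Each variation lands on its own branch, so comparing them and keeping one is a separate step.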
For the demo, I'm going to use the conference schedule website. 00:05:29.780 |
And swyx, for all his skills, has probably not spent a lot of time designing the schedule website, as you can see there. 00:05:41.780 |
Anytime there's a horizontal scroll bar, we know that's a problem. 00:05:45.780 |
But luckily, they knew that, and they said, we're just going to publish the JSON feed, and we'll let hackers hack. 00:05:51.780 |
Engineers do what we do, and let's build from it. 00:05:55.780 |
So, Pallove, who is here, built this amazing conference site where you can favorite things, you can bookmark things. 00:06:03.780 |
And this is what I use to keep track of my sessions for the conference. 00:06:08.780 |
And so, I messaged him, I said, hey, can I use this as an example for Jules? 00:06:17.780 |
Actually, I was sitting in my last session on my phone, and I fixed a bug using Jules. 00:06:24.780 |
So, this is how I would start something like this: I would go into Linear, and I would say, okay, first thing we need to do, we just heard Scott talk about it, is I want to add a way to know if this parallel agent is going to do a bunch of things. 00:06:40.780 |
So, first we're going to add some tests, and then I'm going to actually, I'm going to kick this one off while I'm thinking about it. 00:06:50.780 |
And then, using that idea of multiple variations, I'm going to say add it with Jest, and add it with Playwright at the same time. 00:06:59.780 |
And then we'll look at the test coverage, and we'll choose the one that has the best test coverage. 00:07:04.780 |
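Choosing the variation with the best test coverage could look roughly like the sketch below. It assumes each variation produced an Istanbul-style `json-summary` coverage report with a `total.lines.pct` field; that format, the function names, and the variant labels are assumptions for illustration, not something shown in the talk.

```python
import json


def line_coverage(summary_json: str) -> float:
    """Read the overall line-coverage percentage from a coverage summary.

    Assumes an Istanbul-style summary: {"total": {"lines": {"pct": ...}}}.
    """
    summary = json.loads(summary_json)
    return float(summary["total"]["lines"]["pct"])


def best_variant(summaries: dict[str, str]) -> str:
    """Return the name of the variant whose summary has the highest line coverage."""
    return max(summaries, key=lambda name: line_coverage(summaries[name]))
```

In practice you would probably also require that every test in the winning variation passes before comparing coverage numbers.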
So, once that's done, then I can go to that other mode of parallelism, and I say, I would like a link to add a session to my Google Calendar. 00:07:10.780 |
I would like an AI summary when I click on a description. 00:07:13.780 |
And these are all features, but what I'm really excited for is for AI to do the stuff that we never seem to get to, such as accessibility audits and security audits. 00:07:25.780 |
All those things that seem to go on the backlog, but are really important, and I'm super excited for AI to do that. 00:07:31.780 |
So, we're going to also have it do an accessibility audit and improve our Lighthouse scores at the same time. 00:07:36.780 |
So, this is mostly a front-end demo because, well, I'm mostly a front-end engineer, and it's a better visual representation, but we've seen all these applied to the back-end as well. 00:07:51.780 |
We told it to add tests in the Jest framework. 00:07:56.780 |
It connects to my GitHub, all my GitHub repos, and it's going to give me a plan. 00:08:03.780 |
I can see it's going to test the calendar, the search relay, the session. 00:08:18.780 |
And importantly, after it has these tests, it can run these tests so it can know when we add a new feature if it gets things right. 00:08:25.780 |
So, I'm going to fast forward a little bit here. 00:08:32.780 |
You can see all the things it's -- or all the components it's added to the tests. 00:08:39.780 |
So, now, next time that it goes to add something, it'll look at the readme and remind itself, oh, this is how I run the tests. 00:08:52.780 |
We got down to -- looks like about -- estimated test coverage looks like about 80%. 00:09:02.780 |
And then we can just choose the one we like the best. 00:09:10.780 |
So, that -- again, it's automatically integrated into GitHub. 00:09:15.780 |
And now, we can start saying, okay, now I want a calendar link. 00:09:33.780 |
Eventually, I could look at this in Jules' browser. 00:09:35.780 |
But I feel pretty confident about testing this knowing that all the tests pass. 00:09:40.780 |
Similarly, for the Gemini summaries, when I click on a description, I can get a Gemini summary. 00:09:47.780 |
I put this one in an emulator, or I emulated a mobile view, just so you can see I could have done this from my phone. 00:09:53.780 |
So, this is running an accessibility audit and fixing any issues, from my phone. 00:10:06.780 |
I can -- now we have this big merge we need to do. 00:10:10.780 |
And, to be honest, I ran out of time to finish the merge. 00:10:18.780 |
So, surely, Jules as a squid should help with the octopus merge. 00:10:22.780 |
But let's just check out our add-to-calendar button. 00:10:37.780 |
Let's add this to my calendar to make sure I know to come to my own talk. 00:10:44.780 |
I can now pull this back into the main branch. 00:10:48.780 |
And, now everybody at the conference has the ability to add sessions to their Google calendar. 00:10:54.780 |
Along with everything else that we saw there. 00:11:05.780 |
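For readers curious what an add-to-calendar link involves, here is a minimal sketch. It is not the code Jules generated in the demo. It assumes the widely used Google Calendar event-template URL (`calendar.google.com/calendar/render?action=TEMPLATE`), and the function name and parameters are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def google_calendar_link(
    title: str,
    start: datetime,
    end: datetime,
    details: str = "",
    location: str = "",
) -> str:
    """Build a link that opens a pre-filled Google Calendar event.

    `start` and `end` should be timezone-aware; naive datetimes would be
    interpreted as local time by astimezone().
    """
    fmt = "%Y%m%dT%H%M%SZ"  # compact UTC timestamp
    dates = "/".join(
        moment.astimezone(timezone.utc).strftime(fmt) for moment in (start, end)
    )
    params = {
        "action": "TEMPLATE",
        "text": title,
        "dates": dates,
        "details": details,
        "location": location,
    }
    # Skip empty optional fields so the URL stays short.
    query = urlencode({key: value for key, value in params.items() if value})
    return f"https://calendar.google.com/calendar/render?{query}"
```

A session from the conference JSON feed would supply the title, times, and description; the exact field names in that feed are not given in the talk.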
And, managing the parallel process in the back end. 00:11:12.780 |
So, in theory -- in summary, the secret to working in parallel is a clear definition of success. 00:11:25.780 |
How am I going to easily verify that this works? 00:11:36.780 |
And, then a robust merge and test framework at the end to put everything back together. 00:11:48.780 |
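One possible shape for that merge-and-test step is sketched below: merge the parallel branches one at a time and keep only the ones that leave the tests green. The `merge`, `run_tests`, and `undo_merge` hooks are hypothetical; in practice they might wrap git commands and a test runner.

```python
from typing import Callable


def merge_and_test(
    branches: list[str],
    merge: Callable[[str], bool],
    run_tests: Callable[[], bool],
    undo_merge: Callable[[str], None],
) -> tuple[list[str], list[str]]:
    """Merge branches in order, keeping only those that pass the tests.

    Returns (accepted, rejected). `merge` returns False on a conflict, in
    which case nothing is assumed to have been applied.
    """
    accepted: list[str] = []
    rejected: list[str] = []
    for branch in branches:
        if not merge(branch):
            rejected.append(branch)  # conflict: leave for manual review
            continue
        if run_tests():
            accepted.append(branch)
        else:
            undo_merge(branch)  # roll back so later branches start clean
            rejected.append(branch)
    return accepted, rejected
```

Because each branch is tested on top of the ones already accepted, a failure points at a single branch rather than at the whole batch.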
I tell it how it will know when it got it right. 00:11:53.780 |
And, then I'll -- at the end I'll append a simple broad approach. 00:11:57.780 |
And, then I'll change that last line maybe two or three times depending on the complexity of the task. 00:12:02.780 |
So, for example, if I need to log this number from this web page every day, I'll say today the number is x. 00:12:09.780 |
So, log the number to the console and don't stop until the number is x. 00:12:16.780 |
I give it a helpful context like this is the search query. 00:12:22.780 |
And, then I'll clone that task because I can. 00:12:32.780 |
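The prompt structure described here (a task, a success check, some context, and a final approach line that changes per clone) can be sketched as follows. The exact wording of the prompt lines and the function names are assumptions for illustration only.

```python
def build_prompt(
    task: str, success_check: str, context: str = "", approach: str = ""
) -> str:
    """Assemble a task prompt whose last line is the suggested approach."""
    lines = [task, f"You are done when: {success_check}"]
    if context:
        lines.append(f"Helpful context: {context}")
    if approach:
        lines.append(f"Approach: {approach}")
    return "\n".join(lines)


def clone_task(
    task: str, success_check: str, approaches: list[str], context: str = ""
) -> list[str]:
    """Clone one task into several prompts that differ only in the last line."""
    return [
        build_prompt(task, success_check, context, approach)
        for approach in approaches
    ]


if __name__ == "__main__":
    prompts = clone_task(
        task="Log this number from this web page every day.",
        success_check="the number logged to the console is x",
        approaches=["Fetch and parse the HTML.", "Use a headless browser."],
        context="This is the search query.",
    )
    print("\n\n".join(prompts))
```

Keeping the success check identical across clones means every variation can be verified the same way.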
We're used to working on a single thing at a time. 00:12:34.780 |
Easy verification makes it so now we can work on multiple things at the same time. 00:12:40.780 |
As we saw this morning, look at different variations. 00:12:44.780 |
With a parallel process, we have the ability now to try things that we would never have tried before. 00:12:53.780 |
The task creation and then the merge and test part. 00:12:57.780 |
Keep using MD files or links to documentation and getting-started documents. 00:13:04.780 |
And, then we tell people just throw everything in there. 00:13:07.780 |
Jules and other agents are pretty good at actually sorting out which context is important. 00:13:15.780 |
But, maybe that's just for the Gemini models.