back to index

Your Coding Agent Just Got Cloned And Your Brain Isn't Ready - Rustin Banks, Google Jules


Whisper Transcript | Transcript Only Page

00:00:14.680 | Hi, everyone.
00:00:15.440 | I'm Rustin.
00:00:16.200 | I'm a product manager with Google Labs
00:00:18.680 | and really thrilled to be here and get to speak to you today.
00:00:21.880 | This is really like a dream come true.
00:00:25.560 | So I'm an engineer at heart.
00:00:28.480 | This is my first compiler, Borland C++ 3.1.
00:00:31.600 | It came in the mail on 10 5 1/2 inch floppy disks.
00:00:35.200 | I ordered it from AOL classifieds.
00:00:37.520 | It was amazing.
00:00:39.140 | This is my bulletin board, yeah, that I
00:00:41.800 | hosted out of my parents' closet and salvaged computers.
00:00:44.760 | And I just think it's ironic that when I saw AI come out,
00:00:48.360 | I recognized the text-based interfaces perfectly
00:00:51.520 | from hosting bulletin boards.
00:00:53.880 | And then when I saw this, like many of you,
00:00:56.960 | I dedicated my career to AI coding.
00:01:00.880 | And this is ChatGPT 3.5.
00:01:03.680 | Isn't it crazy how slow this is?
00:01:07.460 | And this used to be state-of-the-art only two years ago.
00:01:10.520 | It's pretty amazing.
00:01:12.680 | Right now, I'm a product manager for Jules.
00:01:15.720 | And Jules is an asynchronous coding agent meant to run in the background
00:01:19.800 | and do all those tasks that you don't want to do in parallel in the background.
00:01:26.520 | And we launched this just two weeks ago at IO to everyone, everywhere, all at once, for free,
00:01:37.020 | while Josh was up on the stage trying to demo other Google Labs products.
00:01:42.580 | And so he called us and we said, "Oh, we've got to shut it down so that we can demo other products."
00:01:46.900 | And luckily, we got it up and going.
00:01:48.500 | But it was a super exciting launch.
00:01:51.300 | And the best part about it is to see these use cases where this is what we really want to solve.
00:01:58.040 | We want to do the laundry, so to say, so that you can focus on the art of coding.
00:02:03.300 | So the next time Firebase updates their SDK, Jules can do that for you.
00:02:07.520 | Or if you just want to develop from your phone, Jules can do that for you.
00:02:11.280 | So in the last two weeks, we've had 40,000 public commits.
00:02:14.820 | And we're super excited what we can bring to the open source world.
00:02:19.500 | So, but as developers, we're meant to think serially.
00:02:24.020 | We take a task from the queue, we work on it, we go on to the next one.
00:02:27.700 | That's our default workflow.
00:02:30.060 | Today, we'll learn about how to maximize parallel agents.
00:02:33.880 | I'll try a real-world demo and we'll go through a real-world use case.
00:02:38.240 | And then I'll go through some best practices we've learned from watching people use Jules.
00:02:44.120 | So for this parallel process really to work well, we need to get better with AI at the beginning
00:02:50.800 | and the end of the workflow.
00:02:53.140 | Meaning, if it's on me to now I just have to write a bunch of tasks all day, that's not fun.
00:02:58.140 | And if I'm reviewing PRs and handling merge messes at the end of the day, that's not going to work well either.
00:03:03.960 | So, luckily, help is on the way.
00:03:09.780 | So, for example, AI can easily work through backlogs, bug reports to create tasks for you, with you, and then at the end of the SDLC, help is on the way where we can use critic agents, merging agents, that can bring everything together and make it so that this parallel workflow that we've envisioned can really come together and not drive us crazy.
00:03:30.960 | So, that's crazy.
00:03:33.960 | Remote agents are uniquely suited for this.
00:03:36.780 | Agents inside of our IDE are always going to be limited by our laptop.
00:03:40.780 | And when you have these remote agents in the cloud, essentially agents as a service, they're infinitely scalable, they're always connected, and then you can develop from anywhere from any device.
00:03:51.780 | So, you've seen two types of parallelism emerging?
00:03:54.600 | This is the type that we expected, which is multitasking, oh, I have 10 different things on my backlog, let's do them all at once, and then we'll merge them together and test them.
00:04:06.780 | Interestingly, you saw an example of the second type this morning with Solomon from Dagger showing how he wanted three different views of his website at the same time.
00:04:17.600 | This was the emergent behavior we didn't expect, which is multiple variations.
00:04:21.780 | Essentially, we see users taking a task, especially if it's a complex task, and saying, try it this way, try it that way, or give me this variation to look at, or multiple variations to look at.
00:04:35.780 | And then you can test and choose.
00:04:37.780 | And we can have the agents test and choose the best ones, or the user can test and choose.
00:04:43.780 | So, for example, we see lots of people who are working on a front end task, test, and they're in a React app, and they're saying, I'm adding drag and drop.
00:04:54.780 | Maybe try it using this library, the React beautiful drag and drop, or maybe use DnD kit, or maybe try it using the test first.
00:05:04.780 | And in this parallel asynchronous environment, you can just spin up multiple agents at the same time.
00:05:10.780 | They can try it, they can easily come back together, choose the best one, and you're off to the races.
00:05:16.780 | Okay, demo time.
00:05:20.780 | So, exit out of this.
00:05:24.780 | For demo, I'm going to use the conference schedule website.
00:05:29.780 | And SWIX, for all his skills, as you can see, has probably not spent a lot of time designing the schedule website, as you can see there.
00:05:41.780 | Anytime there's a horizontal scroll bar, we know that's a problem.
00:05:45.780 | But luckily, they knew that, and they said, we're just going to publish the JSON feed, and we'll let hackers hack.
00:05:51.780 | Engineers do what we do, and let's build from it.
00:05:55.780 | So, Pallove, who is here, built this amazing conference site where you can favorite things, you can bookmark things.
00:06:03.780 | And this is what I use to keep track of my sessions for the conference.
00:06:08.780 | And so, I messaged him, I said, hey, can I use, phone this and use this as an example for Jules?
00:06:15.780 | And Pallove said, oh yeah, sure.
00:06:17.780 | Actually, I was sitting in my last session on my phone, and I fixed a bug using Jules.
00:06:22.780 | So, I thought that was perfect.
00:06:24.780 | So, this is how I would start something like this, is I would go into linear, and I would say, okay, first thing we need to do, we just heard Scott talk about it, is I want to add a way to know if this parallel agent is going to do a bunch of things.
00:06:37.780 | at the same time that it's getting it right.
00:06:40.780 | So, first we're going to add some tests, and then I'm going to actually, I'm going to kick this one off while I'm thinking about it.
00:06:50.780 | And then, using that idea of multiple variations, I'm going to say add it with Jest, and add it with Playwright at the same time.
00:06:59.780 | And then we'll look at the test coverage, and we'll choose the one that has the best test coverage.
00:07:04.780 | So, once that's done, then I can go to that other mode of parallelism, and I say, I would like a link to add a session to my Google Calendar.
00:07:10.780 | I would like an AI summary when I click on a description.
00:07:13.780 | And these are all features, but what I'm really excited for is for AI to do the stuff that we never seem to get to, such as accessibility audits and security audits.
00:07:25.780 | All those things that seem to go on the backlog, but are really important, and I'm super excited for AI to do that.
00:07:31.780 | So, we're going to also have it do an accessibility audit and improve our Lighthouse scores at the same time.
00:07:36.780 | So, this is mostly a front-end demo because, well, I'm mostly a front-end engineer, and it's a better visual representation, but we've seen all these applied to the back-end as well.
00:07:49.780 | Okay.
00:07:50.780 | So, here's Jules.
00:07:51.780 | We told it to add tests and ingest framework.
00:07:56.780 | It connects to my GitHub, all my GitHub repos, and it's going to give me a plan.
00:08:02.780 | That looks about right.
00:08:03.780 | I can see it's going to test the calendar, the search relay, the session.
00:08:06.780 | That sounds great.
00:08:07.780 | I can approve the plan.
00:08:09.780 | So, Jules now has its own VM in the cloud.
00:08:13.780 | It's cloned my whole code base.
00:08:15.780 | It can run all the commands that I can run.
00:08:18.780 | And importantly, after it has these tests, it can run these tests so it can know when we add a new feature if it gets things right.
00:08:25.780 | So, I'm going to fast forward a little bit here.
00:08:28.780 | And so, this is adding just tests.
00:08:32.780 | You can see all the things it's -- or all the components it's added to the tests.
00:08:38.780 | It's added to the readme.
00:08:39.780 | So, now, next time that it goes to add something, it'll look at the readme and remind itself, oh, this is how I run the tests.
00:08:46.780 | Let's see how it did on test coverage.
00:08:51.780 | Okay.
00:08:52.780 | We got down to -- looks like about -- estimated test coverage looks like about 80%.
00:09:00.780 | So, that's pretty good.
00:09:01.780 | We can compare that with Playwright.
00:09:02.780 | And then we can just choose the one we like the best.
00:09:06.780 | We merge that into main.
00:09:08.780 | And now, we're off to the races.
00:09:10.780 | So, that -- again, it's automatically integrated into GitHub.
00:09:13.780 | We merge that into main.
00:09:15.780 | And now, we can start saying, okay, now I want a calendar link.
00:09:19.780 | So, I want a calendar button that can go in.
00:09:22.780 | And Jules will work on that.
00:09:25.780 | And then, sure enough, it ran the test.
00:09:26.780 | The test didn't pass the first time.
00:09:28.780 | It makes some changes.
00:09:30.780 | Now, the tests pass.
00:09:31.780 | And I can review this code.
00:09:33.780 | Eventually, I could look at this in Jules' browser.
00:09:35.780 | But I feel pretty confident about testing this knowing that all the tests pass.
00:09:40.780 | Similarly, for the Gemini summaries, when I click on a description, I can get a Gemini summary.
00:09:47.780 | I put this one in an emulator, or I emulated a mobile view, just so you can see I could have done this from my phone.
00:09:53.780 | So, this is making accessibility audit, fixing any issues from my phone.
00:09:58.780 | Never mind the console errors.
00:10:01.780 | Jules is going to fix those.
00:10:04.780 | And then, I can go back.
00:10:06.780 | I can -- now we have this big merge we need to do.
00:10:10.780 | And, to be honest, I ran out of time to finish the merge.
00:10:13.780 | And, Jules should help me with this merge.
00:10:16.780 | And, it's called an octopus merge.
00:10:18.780 | So, surely, Jules as a squid should help with the octopus merge.
00:10:22.780 | But, let's just pull our checkout, our add to calendar button.
00:10:26.780 | Go back to this.
00:10:29.780 | Local host.
00:10:31.780 | Refresh.
00:10:32.780 | And, now I have a calendar button.
00:10:35.780 | Let's test it.
00:10:36.780 | Okay.
00:10:37.780 | Let's add this to my calendar to make sure I know to come to my own talk.
00:10:41.780 | And, there it's on my calendar.
00:10:44.780 | I can now pull this back into the main branch.
00:10:48.780 | And, now everybody at the conference has the ability to add sessions to their Google calendar.
00:10:54.780 | Along with everything else that we saw there.
00:10:56.780 | A full test suite.
00:10:57.780 | All the accessibility audits.
00:11:00.780 | A lighthouse scores improvement.
00:11:02.780 | And, that took me all about an hour.
00:11:05.780 | And, managing the parallel process in the back end.
00:11:09.780 | Okay.
00:11:12.780 | So, in theory -- in summary, the secret to working in parallel is a clear definition of success.
00:11:20.780 | Because, nobody wants to review PRs all day.
00:11:23.780 | So, think before you get started.
00:11:25.780 | How am I going to easily verify that this works?
00:11:27.780 | Again, Scott hit on this as well.
00:11:29.780 | Create this agreement with the agent.
00:11:31.780 | Tell it.
00:11:32.780 | Don't stop until you see this.
00:11:34.780 | Or, don't stop until this works.
00:11:36.780 | And, then a robust merge and test framework at the end to put everything back together.
00:11:42.780 | And, help is coming.
00:11:44.780 | This is how I prompt for Jules.
00:11:46.780 | I give it a brief overview of the task.
00:11:48.780 | I tell it when it will know what it got right.
00:11:51.780 | Any helpful context.
00:11:53.780 | And, then I'll -- at the end I'll append a simple broad approach.
00:11:57.780 | And, then I'll change that last line maybe two or three times depending on the complexity of the task.
00:12:02.780 | So, for example, if I need to log this number from this web page every day, I'll say today the number is x.
00:12:09.780 | So, log the number to the console and don't stop until the number is x.
00:12:13.780 | That was a simple test that I wrote in.
00:12:15.780 | It'll keep going.
00:12:16.780 | I give it a helpful context like this is the search query.
00:12:20.780 | And, then I'll say use puppeteer.
00:12:22.780 | And, then I'll clone that task because I can.
00:12:25.780 | It's in the cloud.
00:12:26.780 | And, I'll say use playwright.
00:12:29.780 | So, again, have an abundance mindset.
00:12:32.780 | We're used to working on a single thing at a time.
00:12:34.780 | Easy verification makes it so now we can work on multiple things at the same time.
00:12:39.780 | Try lots of things.
00:12:40.780 | As we saw this morning, look at different variations.
00:12:44.780 | We can, with a parallel process, we can, we have the ability now to try things that we would never have tried before.
00:12:51.780 | Let AI help with those bookends.
00:12:53.780 | The task creation and then the merge and test part.
00:12:56.780 | And, context.
00:12:57.780 | Keep using MD files or links to documentation to getting started documents.
00:13:02.780 | The more context, the better.
00:13:04.780 | And, then we tell people just throw everything in there.
00:13:07.780 | Jules and other agents are pretty good at actually sorting out which context is important.
00:13:12.780 | So, more context is better at this point.
00:13:15.780 | But, maybe that's just for the Gemini models.
00:13:18.780 | Which I should have mentioned.
00:13:19.780 | Jules is powered by Gemini 2.5 Pro.
00:13:21.780 | Quick shout out.
00:13:23.780 | Thank you team Jules.
00:13:24.780 | Couldn't have done any of this without you.
00:13:26.780 | If you have any questions, you can DM me.
00:13:28.780 | I'm Rustin Banks.
00:13:29.780 | Rustin B on X.
00:13:31.780 | Thanks, everybody.
00:13:33.780 | Thank you.
00:13:34.780 | Thank you.
00:13:34.780 | Thank you.
00:13:37.840 | We'll see you next time.