back to index

A year of Gemini progress + what comes next — Logan Kilpatrick, Google DeepMind


Whisper Transcript | Transcript Only Page

00:00:02.640 | - Awesome.
00:00:17.760 | Thank you, Ben.
00:00:18.540 | Excited for the AI Education Summit.
00:00:20.440 | Should be fun.
00:00:21.860 | My name's Logan.
00:00:22.700 | I do developer stuff at DeepMind,
00:00:24.380 | and I'm excited to talk about Gemini stuff.
00:00:26.520 | Yeah, hopefully folks know what Gemini is,
00:00:29.880 | so no introduction needed.
00:00:31.900 | I'll talk about three things really quickly.
00:00:34.120 | We'll do some fun announcement stuff.
00:00:36.160 | We'll talk about sort of recapping
00:00:37.780 | a year of progress in Gemini,
00:00:39.000 | and then we'll talk about what's coming next
00:00:40.900 | across the model side, across the Gemini app side,
00:00:44.100 | and also across, of course, the developer platform.
00:00:47.020 | So, the fun stuff, which is,
00:00:51.580 | we announced a new Gemini model today.
00:00:53.740 | So, we haven't officially announced it,
00:00:56.180 | but we'll post live on the tweet.
00:00:59.760 | New Gemini model, this is hopefully the final update
00:01:04.820 | to 2.5 Pro.
00:01:05.720 | I think folks have given us tons of feedback
00:01:07.920 | about the changes, and I think my slide has an animation,
00:01:12.820 | which is hiding all this stuff.
00:01:14.140 | But Gemini 2.5 Pro is awesome.
00:01:16.900 | It's super powerful.
00:01:18.580 | A bunch of increases across benchmarks people care about.
00:01:22.720 | It's Soda on Ader, and it's Soda on HLE, and some other benchmarks.
00:01:27.780 | I think it closes the gap on a bunch of the stuff that folks gave us feedback on
00:01:32.840 | from the previous versions of the model, so hopefully it has great performance across the board.
00:01:37.900 | It also, I think, is sort of setting the stage for the future of Gemini.
00:01:43.960 | I think 2.5 Pro for us internally, and I think in the perception from the developer ecosystem,
00:01:49.060 | was the turning point, which was super exciting.
00:01:50.960 | So it's awesome to see the momentum.
00:01:53.020 | We've got a bunch of other great models coming as well.
00:01:55.020 | So 2.5 Pro, hopefully the final version.
00:01:58.020 | Send us feedback if things don't work, and we'll continue to push the rock up the hill.
00:02:03.020 | You can go to AI.dev if you want to try it out.
00:02:05.020 | It's also available on the Gemini app and all that other stuff.
00:02:07.020 | And if you need anything, email us, and we'll make it happen.
00:02:10.020 | All right, new model launched.
00:02:13.920 | Let's talk about a year of Gemini progress.
00:02:15.920 | I think this has been the craziest thing.
00:02:17.380 | So I don't know if folks tuned in to Google I/O, but Sundar showed this slide on stage,
00:02:22.280 | which I think was a great reminder for me of just how much--
00:02:27.120 | like, it feels like 10 years of Gemini stuff packed into the last 12 months,
00:02:31.820 | which has been awesome.
00:02:33.520 | And it's actually interesting to see as well, just to sort of opine on one of the points.
00:02:38.120 | like all of these different research bets across DeepMind coming together
00:02:42.320 | to like build this incredible mainline Gemini model.
00:02:44.880 | And I think this is actually like I have a conversation with people all the time,
00:02:47.520 | but like what is-- what's the DeepMind strategy?
00:02:50.020 | What's the advantage for us building models, all that good stuff?
00:02:52.780 | And I think the interesting thing to me is just this breadth of research happening
00:02:56.460 | across like science and Gemini and all these other areas,
00:03:00.620 | like robotics and things like that.
00:03:02.720 | And all that actually ends up upstreaming into the mainline models,
00:03:05.880 | which is super exciting.
00:03:07.520 | So you see like the alpha proof and alpha geometry and a bunch of stuff
00:03:12.020 | that we did with custom models in those areas,
00:03:15.380 | actually improving the performance of our models for those domains.
00:03:18.980 | And Jack will talk about that in a little bit,
00:03:21.680 | which I'm super excited about.
00:03:24.520 | The other thing is just like not just the pace of innovation,
00:03:27.160 | but the pace of adoption.
00:03:28.680 | So I think Sundar also showed this slide,
00:03:31.120 | which was a 50x increase in the amount of AI inference that's being processed through Google servers from one year ago to last month.
00:03:41.380 | And I think that is just remarkable to see the amount of increase in demand for Gemini models,
00:03:49.620 | also from the external developer ecosystem.
00:03:52.120 | So it's been wonderful to see that happen.
00:03:54.380 | I think the other question, and I think this is like talked about a little bit,
00:04:01.080 | which is sort of what got us to this point.
00:04:04.080 | I think one of the critical pieces and like it's, you know, not super fun,
00:04:08.580 | but is worth thinking about for folks who are building companies here is like an organizational thing, truthfully.
00:04:14.880 | Like I think you bring together Google historically had lots of different teams doing lots of different AI research.
00:04:19.580 | And in late 2023, early 2023, Google brought a bunch of those teams together and sort of charted this new direction for the DeepMind team to not only just like do theoretical foundational research,
00:04:30.380 | but also to like build models and deliver them to the rest of Google and also the external world.
00:04:34.540 | And then we took the second step of that journey later earlier this year, which was actually bringing the product teams into DeepMind.
00:04:41.920 | So now DeepMind creates the models, does the research, but then also builds products and delivers those to the world.
00:04:47.580 | Then we have the Gemini app, which is our consumer product, and then we have the developer side of that with the Gemini API.
00:04:52.080 | And this has been like personally for me super fun to get to collaborate with our research team and like help actually be on the frontier with them and bring new models and capabilities to the world.
00:05:02.280 | I think this is like the collaboration that works, works incredibly well.
00:05:05.580 | Yeah, and we ship lots of stuff.
00:05:08.280 | I think this is the, this is the most fun part is there's so much stuff, so much innovation happening inside of Google.
00:05:14.280 | It's, it's incredible to get to bring that to the world and bring that to developers.
00:05:18.280 | And I think we're actually very early in that journey and as we'll, we'll see in a couple of minutes.
00:05:23.780 | So in summary, the formula is simple.
00:05:28.780 | Bring the best people together.
00:05:30.780 | Find info advantages and ship.
00:05:33.780 | I don't know if folks have played around with VO or not, but it's also been just incredible to see the reception to VO.
00:05:40.280 | It's burning all the TPUs down, which has been incredible to see lots of demand, lots of interest on the VO front.
00:05:47.280 | So hopefully folks get a chance to play around.
00:05:50.280 | It's available in the Gemini app right now.
00:05:52.280 | All right, so let's talk about what next.
00:05:54.780 | This is the fun stuff.
00:05:55.780 | So I think the, the sort of Gemini app piece is interesting just because people talk about it a lot and it's, it's a fun product and it's cool to think about.
00:06:04.780 | And also sort of, I think for folks building stuff, it's interesting to hear like what our strategy is from the app perspective.
00:06:10.780 | But the Gemini app is trying to be this universal assistant.
00:06:13.280 | And I think what that means in practice is if you, I'm sure people don't think about this all the time, but I think a lot about like what Google's products do and sort of how we show up in the world.
00:06:23.280 | And one of the interesting observations I had was that if you think about what was the thing that like brought people, individuals through all of Google's products historically.
00:06:33.280 | Like the thing that comes to mind is like, like your Google account, I guess, which like wasn't like super stateful.
00:06:38.080 | You would sort of sign into lots of different Google products with your Google account, but that didn't really do anything other than just like get you signed into that individual product.
00:06:46.580 | I think now we're seeing with Gemini that it's actually this thread that unifies all of Google.
00:06:50.780 | And I think the future for Google is going to look a lot like Gemini is this sort of, you know, thread that brings all of our stuff together, which is really interesting.
00:06:58.280 | And then hitting on all the trends, which I'm sure folks are also excited about building, I think the one that I'm most excited about is proactivity.
00:07:04.880 | I think most AI products today are still very like, you have to go and do all the work as the user.
00:07:10.380 | And I think this proactive next step of AI systems and models coming into play is going to be is going to be awesome to see.
00:07:18.880 | Yeah, and the team is moving super fast.
00:07:21.380 | If you have complaints, please do not tag me on Twitter, please tag Josh.
00:07:25.380 | He will make it happen.
00:07:26.380 | Josh is incredible.
00:07:27.380 | The Gemini app team is amazing.
00:07:29.380 | He's pushing the team super hard.
00:07:32.880 | So it's incredible to see all the progress.
00:07:35.880 | But he is the person who can make stuff happen on the Gemini app side, not me.
00:07:38.880 | So please tag him.
00:07:40.880 | From a model perspective, like again, there's so much.
00:07:44.880 | When Gemini was originally created, it was built to be a single multimodal model to do audio, image, video, et cetera.
00:07:52.880 | We've made a lot of progress on that at I/O this year.
00:07:55.380 | We announced native audio capabilities in Gemini.
00:07:58.380 | There's TTS.
00:07:59.380 | There's audio.
00:08:00.380 | You can talk to the model.
00:08:01.380 | It sounds super natural, which is awesome.
00:08:04.380 | It's powering the Astra experience.
00:08:05.880 | It's powering Gemini Live.
00:08:07.380 | So I think we're going to get towards that omnimodal model, which is awesome.
00:08:11.380 | We have VO, which is soda across a bunch of stuff.
00:08:14.880 | So hopefully we'll get video into the mainline Gemini model.
00:08:17.880 | If folks saw some of our early experiments with diffusion, which means you can get like crazy levels of tokens per second.
00:08:24.880 | Really interesting.
00:08:25.880 | That's like definitely a research exploration area, and it's not mainline yet.
00:08:31.880 | So it'll be cool to see that come.
00:08:34.880 | The agentic by default thread, I think, is something that I've been thinking a lot about recently, which is like historically, for me as a developer, I've thought about models just as this thing that gives me tokens in and out.
00:08:46.880 | And then there was lots of scaffolding in the ecosystem to allow me to build those models.
00:08:50.880 | I think it's becoming very clear to me that the models are becoming more systematic themselves, like they're doing more and more.
00:08:57.880 | And I think the reasoning step is this like really interesting place in which a lot of that's going to happen.
00:09:02.880 | And Jack's going to talk about the scaling up of reasoning.
00:09:05.880 | But I do think it'll be interesting to see like how much of the scaffolding work that's happened in the past ends up just like being a part of that reasoning step and like what that means for people who are building products and stuff like that.
00:09:17.880 | So it'll be interesting to see.
00:09:19.880 | We'll also have more small models soon, which I'm excited about, and big models.
00:09:22.880 | People want large models, which I know.
00:09:25.880 | So I'm excited about that.
00:09:27.880 | And then the last one is continuing to push the frontier on infinite context.
00:09:30.880 | I think the current model paradigm doesn't work for infinite context.
00:09:34.880 | I think it's just like impossible to scale up.
00:09:37.880 | Attention doesn't work that way.
00:09:38.880 | So I think there'll be some new innovations to hopefully help let people continue to scale up the amount of context that they're bringing in.
00:09:45.880 | And Tulsi is the person who drives all of our model stuff.
00:09:50.880 | So if you have stuff, you want to talk about Gemini models, you have ideas for things that don't work well.
00:09:56.880 | She is the person running the show on the Gemini model product side.
00:09:59.880 | And then developer stuff.
00:10:03.880 | So we have lots of things coming, which I'm excited about.
00:10:07.880 | I think I'll highlight maybe three that I think people are super excited about.
00:10:12.880 | Embeddings, I think we have, which is, you know, feels like early AI stuff, but I think it's still super important.
00:10:18.880 | Embeddings power most people's applications using RAG.
00:10:21.880 | We have a Gemini embeddings model, which is state of the art.
00:10:24.880 | So excited to be rolling that out to developers more broadly in the next couple of weeks.
00:10:28.880 | The deep research API I'm super interested in.
00:10:31.880 | There's so many interesting products that are built around this sort of research tasks.
00:10:36.880 | And people love the consumer products.
00:10:37.880 | So we're finding ways to bring a bunch of that together into a, like, bespoke deep research API, which will be awesome.
00:10:44.880 | And then VO3 and Imagine 4 in the API as well.
00:10:47.880 | So hopefully we'll see that very, very, very soon.
00:10:50.880 | And as we work to scale and make that possible from a developer platform side, I'll make one other quick comment,
00:10:57.880 | which is the AI Studio product positioning, which I also think is interesting.
00:11:03.880 | Like, AI Studio, just to be very clear, is being built as a developer platform.
00:11:08.880 | So we'll sort of move away from this, like, kind of consumer-y feel and move much more towards being a developer platform,
00:11:14.880 | which I'm personally very excited about because I think that's what developers want from us.
00:11:17.880 | So it'll be awesome to see that actually come to life with, like, many new iterations of our developer experience
00:11:23.880 | with agents built in and hopefully things like Jules and some of our developer coding agents natively in that experience,
00:11:30.880 | which will be awesome to see.
00:11:33.880 | Yeah.
00:11:34.880 | And that's what I have.
00:11:35.880 | I appreciate all the people who send lots of great feedback about Gemini stuff.
00:11:39.880 | So we'll keep pushing the rock up the hill, and I'll be around.
00:11:42.880 | So if you have more feedback, come find me, and we'll keep making Gemini great for everyone.
00:11:45.880 | So thanks, and I appreciate it.
00:11:46.880 | Thanks, and I appreciate it.
00:11:47.880 | Thank you.
00:11:48.880 | Thank you.
00:11:48.880 | Thank you.
00:11:49.880 | I appreciate it.
00:11:49.880 | Thank you.
00:11:50.880 | Thank you.
00:11:51.880 | Thank you.
00:11:51.880 | Thank you.
00:11:52.880 | We'll see you next time.