
Designing AI-Intensive Applications - swyx


Chapters

0:00 Conference Welcome and Overview
0:42 Conference Logistics and Growth
1:47 Audience Preferences and Survey
2:22 Innovations in AI Engineering (MCP and Chatbots)
2:58 Evolution of AI Engineering (Past Talks)
3:50 Simplicity in AI Engineering
4:17 AI Engineering as a Developing Field
5:23 Seeking the "Standard Model" in AI Engineering
6:02 Candidate Standard Models in AI Engineering
9:26 Human Input vs. AI Output (AI News Example)
11:05 SPADE Model for AI-Intensive Applications
12:29 Call to Action for Conference Attendees

Whisper Transcript

00:00:05.000 | - Okay.
00:00:18.240 | Hi everyone, welcome to the conference, how are you doing?
00:00:20.960 | Excellent.
00:00:23.760 | Usually I open these conferences with a small little talk
00:00:26.680 | to introduce what's going on
00:00:28.600 | and then give you a little update
00:00:30.000 | on where the state of AI engineering is
00:00:31.680 | and how we put together the conference for you.
00:00:34.940 | This is one of those combined talks.
00:00:37.540 | I'm trying to answer every single question you have
00:00:39.480 | about the conference, about AI news,
00:00:42.080 | about where this is all going,
00:00:44.400 | and we'll just dive right in.
00:00:47.060 | Okay, so, 3,000 of you, all of you registered last minute.
00:00:51.680 | Thank you for that stress.
00:00:53.060 | I actually can't quantify this.
00:00:55.480 | I call this the Gini coefficient for the AI organizer stress.
00:01:01.480 | This is compared to last year.
00:01:02.820 | It is, please just buy tickets earlier.
00:01:05.180 | Like, I mean, you know you're going to come, just do it.
00:01:07.200 | Okay.
00:01:09.420 | We also like to use this conference
00:01:11.820 | as a way to track the evolution of AI engineering.
00:01:14.760 | That's, those are the tracks for last year.
00:01:16.320 | We've just doubled every single track for you.
00:01:19.620 | So, basically, it's basically, you know, like double the value
00:01:22.820 | for whatever you get here, and I think, like, you know,
00:01:26.420 | I think this is as much concurrency as we want to do.
00:01:29.080 | Like, I know, I hear that people have decision fatigue
00:01:31.740 | and all that totally, but also we try to cover all of AI,
00:01:34.540 | so deal with it.
00:01:35.740 | We also pride ourselves in doing well by being more responsive
00:01:43.120 | than other conferences like NeurIPS and being more technical
00:01:45.700 | than other conferences like TED or whatever, what have you.
00:01:49.000 | So, we asked you what you wanted to hear about.
00:01:51.180 | These are the surveys.
00:01:52.380 | We tried all sorts of things.
00:01:53.460 | We tried computer using agents.
00:01:55.400 | We tried AI and crypto.
00:01:58.120 | It's always a fun one.
00:01:59.320 | And, but you guys told us what you wanted, and we put it in there.
00:02:03.120 | For more data, we would actually like you to finish
00:02:06.800 | out our survey; the survey's not done.
00:02:08.300 | So, if you want to head to that URL, we will present the results
00:02:11.800 | in full tomorrow.
00:02:12.760 | We would love all of you to fill it out so we can get a representative
00:02:15.600 | sample of what you want, and that'll inform us next year.
00:02:19.000 | Okay.
00:02:21.480 | You know, I think the other thing about AI engineering is that we also
00:02:24.520 | have been innovating as engineers, right?
00:02:26.100 | We're the first conference to have an MCP.
00:02:28.140 | We're the first conference to have an MCP talk accepted by MCP.
00:02:32.600 | Shout out to Sam Julien from Writer for working with us on the official
00:02:39.100 | chatbot, and Quinn and John from Daily for working with us on the official
00:02:43.060 | VoiceBot, as well as Elizabeth Treichen from Vapi.
00:02:45.940 | I need to give her a shout out because she originally helped us prototype
00:02:49.540 | the VoiceBot as well.
00:02:51.280 | So, we're trying to constantly improve the experience.
00:02:53.380 | The other thing I think I want to emphasize as well is, like,
00:02:57.940 | these are the talks that I give.
00:02:59.460 | In 2023, the very first AIE, I talked about the three types of AI engineer.
00:03:06.560 | In 2024, I talked about how AI engineering was becoming more multidisciplinary,
00:03:11.760 | and that's why we started the World's Fair with multiple tracks.
00:03:14.940 | In 2025, in New York, we talked about the evolution and the focus on agent engineering.
00:03:20.180 | So, where are we now in sort of June of 2025?
00:03:24.320 | That's where we're going to focus on.
00:03:25.640 | I think we've come a long way, regardless.
00:03:27.400 | Like, you know, people used to make fun of AI engineering, and I anticipated this.
00:03:30.900 | We used to be low status.
00:03:32.500 | People just derided GPT wrappers.
00:03:34.500 | And look at all the GPT wrappers now.
00:03:36.500 | All of you are rich.
00:03:38.500 | So, we're going to hear from some of these folks in the room, and thank you for sponsoring,
00:03:46.600 | as well.
00:03:47.600 | But, you know, I think the other thing that's also super interesting is that, like, you should--
00:03:51.600 | the consistent lesson that we hear is to not overcomplicate things.
00:03:54.600 | From Anthropic on the Latent Space podcast, we hear from Erik Schluntz about how they beat
00:04:00.600 | SWE-bench with just a very simple scaffold.
00:04:03.200 | Same about deep research from Greg Brockman, who you're going to hear later on in the sort
00:04:07.700 | of closing keynotes.
00:04:08.700 | As well as AMP code.
00:04:09.700 | Where's the AMP folks here?
00:04:13.700 | I think they're probably back in the other room.
00:04:14.700 | But also, you know, there's a sort of emperor has no clothes.
00:04:17.700 | Like, there's-- it's still a very early field, and I think the AI engineers in the room, like,
00:04:21.700 | should be very encouraged by that.
00:04:22.700 | Like, there's still a lot of alpha to mine.
00:04:28.200 | If you watch back all the way to the start of this conference, we actually compared this
00:04:31.500 | moment a lot to the time when sort of physics was in full bloom.
00:04:35.500 | Like, this is the Solvay conference in 1927 when Einstein, Marie Curie, and all the other
00:04:39.600 | household names in physics were all gathered together, and that's what we're trying to do
00:04:43.100 | for this conference.
00:04:44.100 | We've gathered the entire-- the best sort of AI engineers in the world and researchers
00:04:49.400 | and all that to build and push the frontier forward.
00:04:54.700 | The thesis is that there's-- this is the time.
00:04:56.500 | This is the right time to do it.
00:04:58.000 | I said that two and a half years ago, still true today.
00:05:01.600 | But I think, like, there's a very specific time when, like, basically, what people did
00:05:05.800 | in that time of the formation of an industry is that they set out all the basic ideas that
00:05:10.300 | then lasted for the rest of that industry.
00:05:12.700 | So this is the standard model in physics, and there was a very specific period in time
00:05:16.100 | from, like, the '40s to the '70s where they figured it all out, and the next 50 years
00:05:20.400 | we haven't really changed the standard model.
00:05:22.400 | So the question that I want to phrase here is, what is the standard model in AI engineering?
00:05:27.700 | Right?
00:05:28.700 | We have standard models in the rest of engineering, right?
00:05:30.400 | Everyone knows ETL.
00:05:32.200 | Everyone knows MVC.
00:05:33.400 | Everyone knows CRUD.
00:05:34.400 | Everyone knows MapReduce.
00:05:36.400 | And I've used those things in, like, building AI applications.
00:05:39.700 | And, like, it's pretty much like, yes, RAG is there, but I heard RAG is dead.
00:05:43.700 | I don't know.
00:05:44.700 | You guys can tell me.
00:05:46.500 | One day it's, like, long context killed RAG.
00:05:48.500 | The other day, fine-tuning kills RAG.
00:05:50.300 | I don't know.
00:05:51.000 | But I don't think-- I definitely don't think it's the full answer.
00:05:53.500 | So what other standard models might emerge to help us guide our thinking?
00:05:57.700 | And that's really what I want to push you guys to.
00:05:59.800 | So there are a few candidate standard models in AI engineering.
00:06:02.700 | I'll pick out a few of these.
00:06:03.800 | I don't have time to talk about all of them.
00:06:05.600 | But definitely listen to the DSPy talk from Omar later, tomorrow.
00:06:10.600 | So we're going to cover a few of these.
00:06:13.200 | So first is the LLM OS.
00:06:15.100 | This is one of the earliest standard models, basically,
00:06:18.400 | from Karpathy in 2023.
00:06:20.560 | I have updated it for 2025 for multimodality,
00:06:24.160 | for the standard set of tools that have come out,
00:06:27.340 | as well as MCP, which has become the default protocol
00:06:31.900 | for connecting with the outside world.
00:06:35.100 | Second one would be the LLM SDLC, Software Development Lifecycle.
00:06:39.600 | I have two versions of this, one with the intersecting concerns
00:06:42.800 | of all the tooling that you buy.
00:06:44.000 | By the way, this is all on the Latent Space blog, if you want.
00:06:46.400 | And I'll tweet out the slides, so--
00:06:49.300 | and it's live stream, so whatever.
00:06:52.140 | But I think, for me, the most interesting insight
00:06:54.300 | and the aha moment when I was talking to Ankur of Braintrust,
00:06:57.700 | who is going to be keynoting tomorrow, is that, you know,
00:07:01.100 | the early parts of the SDLC are increasingly commodity, right?
00:07:04.900 | LLMs kind of free, you know, monitoring kind of free,
00:07:09.800 | and RAG kind of free.
00:07:11.300 | Obviously, there's-- it's just free tier for all of them,
00:07:13.300 | and you only get-- start paying.
00:07:14.800 | But like, when you start to make real money from your customers,
00:07:18.200 | it's when you start to do evals, and you start to add in security,
00:07:20.900 | orchestration, and do real work.
00:07:22.300 | That is real hard engineering work.
00:07:24.300 | And I think that's-- those are the tracks that we've added this year.
00:07:27.300 | And I'm very proud to, you know, I guess, push AI engineering along
00:07:30.800 | from demos into production, which is what everyone always wants.
00:07:34.400 | Another form of standard model is building effective agents.
00:07:37.400 | Our last conference, we had Barry, one of the co-authors
00:07:40.500 | of Building Effective Agents from Anthropic,
00:07:42.000 | give an extremely popular talk about this.
00:07:44.600 | I think that this is now at least the received wisdom
00:07:48.100 | for how to build an agent.
00:07:49.700 | And I think, like, that is one definition.
00:07:52.900 | OpenAI has a different definition.
00:07:54.500 | And I think we're just going to continue to iterate.
00:07:56.700 | I think Dominic yesterday released another improvement
00:07:59.500 | on the Agents SDK, which builds upon the Swarm concept
00:08:02.500 | that OpenAI is pushing.
00:08:03.400 | The way that I approach sort of the agent standard model
00:08:09.200 | has been very different.
00:08:10.200 | So you can refer to my talk from the previous conference on that.
00:08:13.100 | I'm basically trying to do a descriptive top-down model
00:08:18.700 | of what people use, the words people use to describe agents,
00:08:22.500 | like intent, you know, control flow, memory, planning, and tool use.
00:08:28.900 | So there's all these, like, really, really interesting things.
00:08:31.600 | But I think that the thing that really got me is, like,
00:08:34.800 | I don't actually use all of that to build AI News.
00:08:36.900 | By the way, who here reads AI News?
00:08:38.200 | I don't know if there's, like-- yeah?
00:08:39.500 | Oh, my god, like, there's half of you.
00:08:41.200 | Thanks.
00:08:42.300 | It's a really good tool I built for myself.
00:08:44.800 | And, you know, hopefully now over 70,000 people
00:08:48.000 | are reading along as well.
00:08:49.500 | And the thing that really got me was Soumith.
00:08:52.700 | At the last conference, you know, he's the lead of PyTorch.
00:08:55.300 | And he says he reads AI News, he loves it, but it is not an agent.
00:08:58.000 | And I was like, what do you mean it's not an agent?
00:08:59.500 | I call it an agent, and you should call it an agent.
00:09:01.900 | But he's right.
00:09:03.000 | It's actually-- it's actually--
00:09:06.100 | I'm going to talk a little bit about that.
00:09:07.400 | But, like, why does it still deliver value,
00:09:10.100 | even though it's, like, a workflow?
00:09:11.400 | And, like, you know, is that still interesting to people?
00:09:13.300 | Right?
00:09:13.800 | Like, why do we not brand every single track here--
00:09:16.800 | voice agents, you know, like, workflow agents,
00:09:21.300 | computer use agents?
00:09:22.200 | Like, why is every single track in this conference not an agent?
00:09:25.100 | Well, I think, basically, we want to deliver value
00:09:28.800 | instead of arguable terminology.
00:09:30.600 | So the assertion that I have is that it's really about human input
00:09:34.400 | versus valuable AI output.
00:09:37.100 | And you can sort of make a mental model of this
00:09:39.400 | and track the ratio of this, and that's more interesting
00:09:41.700 | than arguing about definitions of workflow versus agents.
00:09:44.900 | So, for example, in the Copilot era,
00:09:47.000 | you had sort of, like, a debounced input of, like,
00:09:50.000 | every few characters that you type that maybe you'll do an autocomplete.
00:09:52.800 | In ChatGPT, every few queries that you type,
00:09:55.300 | you'll maybe output a responding query.
00:09:58.100 | It starts to get more interesting with the reasoning models
00:10:00.300 | with, like, a 1 to 10 ratio.
00:10:01.900 | And then, obviously, with, like, the new agents,
00:10:04.100 | now it's, like, more sort of deep research, NotebookLM.
00:10:06.600 | By the way, Raiza Martin, also speaking on the product management
00:10:09.800 | track, she's incredible on talking about the story of NotebookLM.
00:10:15.600 | The other really interesting angle,
00:10:17.400 | if you want to take this mental model to stretch it,
00:10:20.100 | is the 0 to 1, the ambient agents.
00:10:22.000 | With no human input, what kind of interesting AI output can you get?
00:10:26.200 | So, to me, that's a more useful discussion
00:10:29.000 | about input versus output than what is a workflow?
00:10:31.200 | What is an agent?
00:10:31.800 | How agentic is your thing versus not?
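
One way to make the input-versus-output framing concrete is to write down, for each style of product, a rough count of human inputs and AI outputs and compare the ratios. The sketch below is illustrative only. `InteractionProfile`, `describe`, and the 1:1 chat figure are assumptions made for this example; the talk itself only gives "1 to 10" for reasoning models and "0 to 1" for ambient agents, and none of the numbers are measurements.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional


@dataclass(frozen=True)
class InteractionProfile:
    """Rough count of human inputs vs. AI outputs for one style of product."""
    name: str
    human_inputs: int
    ai_outputs: int

    def outputs_per_input(self) -> Optional[Fraction]:
        """AI outputs per human input, or None when there is no human input."""
        if self.human_inputs == 0:
            return None  # ambient case: output arrives without any prompt
        return Fraction(self.ai_outputs, self.human_inputs)


# Illustrative numbers only (assumptions, not data from the talk).
PROFILES = [
    InteractionProfile("chat (one query, one response)", 1, 1),
    InteractionProfile("reasoning model", 1, 10),
    InteractionProfile("ambient agent", 0, 1),
]


def describe(profile: InteractionProfile) -> str:
    """Render one profile as a human-readable ratio line."""
    ratio = profile.outputs_per_input()
    if ratio is None:
        return f"{profile.name}: {profile.ai_outputs} output(s) with no human input"
    return f"{profile.name}: {ratio} AI output(s) per human input"


if __name__ == "__main__":
    for p in PROFILES:
        print(describe(p))
```

The zero-input case is handled separately rather than as a division, which mirrors the point in the talk that ambient agents are a different regime and not just a larger ratio.
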
00:10:35.800 | Talking about AI News, so, you know,
00:10:37.800 | it is like a bunch of scripts in a trench coat.
00:10:42.000 | And I realized I've written it three times.
00:10:43.600 | I've written it for the Discord scrape.
00:10:45.600 | I've written it for the Reddit scrape.
00:10:46.700 | I've written it for the Twitter scrape.
00:10:48.100 | And basically, it's just-- it's always the same process.
00:10:50.100 | You scrape it, you plan, you recursively summarize,
00:10:52.700 | you format, and you evaluate.
00:10:54.300 | And yeah, that's the three kids in a trench coat.
00:10:58.300 | And that's really what it is.
00:10:59.800 | I run it every day, and we improve it a little bit,
00:11:01.900 | but then I'm also running this conference.
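
The "recursively summarize" step in that process can be sketched as a loop that summarizes groups of texts, then summarizes the summaries, until one remains. This is a minimal sketch and not the actual AI News code. The names `recursive_summarize`, `chunk`, and `fan_in` are hypothetical, and `fake_summarize` is a deterministic stand-in for what would be an LLM call in a real pipeline.

```python
from typing import Callable, List

Summarizer = Callable[[str], str]


def chunk(items: List[str], size: int) -> List[List[str]]:
    """Split items into consecutive groups of at most `size`."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def recursive_summarize(texts: List[str], summarize: Summarizer, fan_in: int = 4) -> str:
    """Summarize groups of texts, then summarize the summaries, until one remains."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so each round shrinks the list")
    if not texts:
        return ""
    level = list(texts)
    while True:
        # One round: each group of up to `fan_in` texts becomes one summary.
        level = [summarize("\n".join(group)) for group in chunk(level, fan_in)]
        if len(level) == 1:
            return level[0]


def fake_summarize(text: str) -> str:
    """Stand-in for an LLM call (assumption): keep the first word of each line."""
    return " ".join(line.split()[0] for line in text.splitlines() if line.strip())


if __name__ == "__main__":
    posts = ["alpha one", "beta two", "gamma three", "delta four", "epsilon five"]
    print(recursive_summarize(posts, fake_summarize, fan_in=2))
```

With five inputs and `fan_in=2` this takes three rounds (5 to 3 to 2 to 1 summaries), so the number of summarizer calls grows with the input size while each individual call stays small.
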
00:11:04.300 | So if you generalize it, that actually starts to become
00:11:07.300 | an interesting model for building AI-intensive applications,
00:11:11.300 | where you start to make thousands of AI calls
00:11:14.400 | to serve a particular purpose.
00:11:17.100 | So you sync, you plan, and you sort of parallel process.
00:11:20.600 | You analyze and sort of reduce that down to--
00:11:23.400 | from many to one.
00:11:24.900 | And then you deliver the contents to the user,
00:11:29.300 | and then you evaluate.
00:11:30.400 | And to me, like, that conveniently forms an acronym,
00:11:32.900 | S-P-A-D-E, which is really nice.
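
The five SPADE stages described above (sync, plan, analyze in parallel and reduce many to one, deliver, evaluate) can be sketched as five small functions chained together. Everything here is an assumption made for illustration: the function names, the batch size, the thread pool, and `fake_model_call`, which stands in for a real LLM call by returning the first line of its prompt. It is a sketch of the shape of the pipeline, not the speaker's implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

ModelCall = Callable[[str], str]


def sync(sources: Dict[str, List[str]]) -> List[str]:
    """S: gather raw items from every source into one list."""
    return [f"{name}: {item}" for name, items in sources.items() for item in items]


def plan(items: List[str], batch_size: int) -> List[List[str]]:
    """P: split the work into independent batches."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def analyze(batches: List[List[str]], model_call: ModelCall, max_workers: int = 4) -> str:
    """A: process batches in parallel, then reduce the partial results to one."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input batches.
        partials = list(pool.map(lambda batch: model_call("\n".join(batch)), batches))
    return model_call("\n".join(partials))


def deliver(summary: str) -> str:
    """D: format the result for the reader."""
    return f"Daily digest\n============\n{summary}"


def evaluate(report: str, must_mention: List[str]) -> Dict[str, bool]:
    """E: toy check that expected topics made it into the report."""
    return {topic: topic in report for topic in must_mention}


def fake_model_call(prompt: str) -> str:
    """Stand-in for an LLM call (assumption): return the first line, truncated."""
    lines = prompt.splitlines()
    return lines[0][:40] if lines else ""


def run_spade(
    sources: Dict[str, List[str]],
    model_call: ModelCall,
    must_mention: List[str],
    batch_size: int = 2,
) -> Tuple[str, Dict[str, bool]]:
    """Run the five stages in order and return the report with its evaluation."""
    batches = plan(sync(sources), batch_size)
    report = deliver(analyze(batches, model_call))
    return report, evaluate(report, must_mention)


if __name__ == "__main__":
    demo_sources = {
        "discord": ["new eval tool", "agent demo"],
        "reddit": ["RAG debate"],
        "twitter": ["MCP update"],
    }
    report, scores = run_spade(demo_sources, fake_model_call, ["eval", "MCP"])
    print(report)
    print(scores)
```

Because the stand-in model is deliberately lossy, the evaluation step reports that "MCP" was dropped from the digest, which is the kind of failure the final stage exists to catch.
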
00:11:36.500 | There's also sort of interesting AI engineering elements
00:11:39.100 | that fit in there.
00:11:40.300 | So you can process all these into a knowledge graph.
00:11:42.600 | You can turn these into, like, structured outputs.
00:11:45.800 | And you can generate code as well.
00:11:47.200 | So for example, you know, ChatGPT with Canvas,
00:11:50.800 | or Claude with Artifacts, is a way of just delivering the output
00:11:55.300 | as a code artifact instead of just text output.
00:11:57.900 | And I think it's like a really interesting way
00:11:59.500 | to think about this.
00:12:00.100 | So this is my mental model so far.
00:12:02.600 | I wish I had the space to go into it.
00:12:04.200 | But ask me later.
00:12:05.200 | This is what I'm developing right now.
00:12:06.500 | But I think what I would really emphasize is, you know,
00:12:09.900 | I think, like, there's all sorts of interesting ways
00:12:11.800 | to think about what the standard model is,
00:12:14.100 | and whether it's useful for you in taking your application
00:12:17.600 | to the next step of, like, how do I add more intelligence
00:12:20.300 | to this in a way that's useful and not annoying?
00:12:22.800 | And for me, this is it.
00:12:24.500 | OK, so I've thrown a bunch of standard models in here.
00:12:27.700 | But that's just my current hypothesis.
00:12:29.800 | I want you at this conference, in all your conversations
00:12:32.300 | with each other and with the speakers,
00:12:33.800 | to think about what the new standard model for AI
00:12:35.800 | engineering is.
00:12:36.600 | What can everyone use to improve their applications?
00:12:39.100 | And I guess, ultimately, build products
00:12:41.300 | that people want to use, which is what Laurie mentioned
00:12:44.100 | at the start.
00:12:44.600 | So I'm really excited about this conference.
00:12:47.500 | It's such an honor and a joy to get it together for you guys.
00:12:51.200 | And I hope you enjoy the rest of the conference.
00:12:52.900 | Thank you so much.