
AX is the only Experience that Matters - Ivan Burazin, Daytona



00:00:15.440 | The number of agents in the world, I believe,
00:00:18.360 | won't just match the number of humans,
00:00:20.120 | but they will basically be the number of humans
00:00:22.640 | to the power of n.
00:00:23.960 | And what is the power of n?
00:00:25.160 | I have really no idea.
00:00:27.160 | But we can see already today that we
00:00:29.320 | are using multiple of them, and more are spinning up more.
00:00:32.080 | And so it will be a very, very large number.
00:00:35.860 | And this isn't just hypothetical.
00:00:38.920 | I mean, we don't have a lot of data around this.
00:00:41.000 | I actually researched a lot and tried to find what we have.
00:00:43.960 | But some key elements that we do have
00:00:46.480 | is that the first one may be not that important,
00:00:49.300 | but 25% of YC startups say that AI writes 95% of their code.
00:00:54.340 | But the second part is actually quite interesting,
00:00:56.460 | whereas 37% of the latest YC batch are actually building agents
00:01:01.000 | as their products.
00:01:02.000 | So they're not building co-pilots.
00:01:03.000 | They're not building autocomplete.
00:01:04.000 | They're not building legacy SaaS companies.
00:01:06.960 | Like, agents are the new product.
00:01:09.940 | And I think that's actually quite interesting and aligns
00:01:12.600 | with what we're looking at.
00:01:14.000 | But the most uncomfortable truth for people building tools
00:01:17.600 | for agents, which we are and I know a lot of people are as well,
00:01:20.780 | is that most of the tools that are built today
00:01:25.000 | break the moment you remove a human from the loop.
00:01:30.280 | As we're not entering a world where agents are assisting
00:01:33.420 | developers, agents will definitely be the developers.
00:01:36.320 | I mean, they will be the number one people using that.
00:01:39.320 | So basically, if you're not building dev tools--
00:01:47.860 | if you're building dev tools for humans,
00:01:49.240 | you're basically building for the past.
00:01:51.300 | And what I think about when we build our things,
00:01:54.500 | and I think we all should do, is like,
00:01:56.360 | what does it mean to actually build tools for the future?
00:01:59.720 | And I think it means building for agents.
00:02:03.620 | And the term for building for agents is agent experience.
00:02:07.680 | And a good friend of mine, Matt, which I guess a lot of you
00:02:09.800 | know here, who's the co-founder of Netlify,
00:02:12.520 | coined the term agent experience.
00:02:14.420 | And it's supposed to be or continues
00:02:17.260 | the evolution of, like, a user experience, a customer
00:02:19.160 | experience, and mostly a developer experience.
00:02:23.060 | But the definition that I sort of found that works the best
00:02:26.760 | or defines it the best was the one from Sean
00:02:29.200 | that works at Netlify, so not Matt necessarily,
00:02:32.560 | which says, how easily can agents access?
00:02:36.760 | How easily can they understand?
00:02:38.380 | And how easily can they operate within digital environments
00:02:41.440 | to achieve the goal that the user defined?
00:02:45.460 | And I think that's a really good definition.
00:02:49.000 | And looking at this, I wanted to see, like,
00:02:51.560 | who is actually implementing agent experience?
00:02:53.680 | Who is looking at this definition?
00:02:54.920 | Who is building their tools like this?
00:02:57.160 | And I found some companies--
00:02:58.680 | there's more than the ones I put in here, of course.
00:03:00.540 | So like, if you guys are building something like this,
00:03:02.040 | and I didn't put you, I'm very, very sorry.
00:03:03.740 | Like, I just wanted to put a few that I've talked to and looked at.
00:03:07.860 | And so I think seamless authentication is a really, really important one.
00:03:13.520 | And if your agent-- if you give an agent a task to do and it has to log in, for the most part,
00:03:18.580 | it will break.
00:03:19.760 | And moreover, you don't want to give your passwords to the agent.
00:03:22.020 | That is definitely not a good idea.
00:03:24.700 | And so a company like Arcade, which randomly, if he's around-- I met the founder yesterday.
00:03:29.760 | I had not met him before that he was ready to talk.
00:03:33.900 | What they solve is if your agent tries to authenticate into, like, Delta or Booking.com or wherever
00:03:39.480 | it is, it can fall back to the user, the user can log in, and the agent can go off to its job.
00:03:45.620 | Moreover, you can also add credentials inside of Arcade
00:03:49.480 | beforehand, so that your agent can go and do what it needs to get done.
00:03:53.380 | So authenticate once, agent goes off, does their job.
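A minimal sketch of the fallback flow described here, assuming a hypothetical in-memory credential vault. None of these names come from Arcade: `CredentialVault`, `ask_user_to_log_in`, `run_task`, and the `booking.example` service are invented for illustration. The point is only that the human logs in once and the agent receives a token, never a password.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CredentialVault:
    """Hypothetical per-service token store; not a real Arcade API."""
    tokens: Dict[str, str] = field(default_factory=dict)

    def get_token(self, service: str, ask_user_to_log_in: Callable[[str], str]) -> str:
        # Reuse the stored token if the user already authenticated once.
        if service not in self.tokens:
            # Fall back to the human: they log in themselves and only the
            # resulting token is kept, so the agent never sees a password.
            self.tokens[service] = ask_user_to_log_in(service)
        return self.tokens[service]


def run_task(vault: CredentialVault, service: str,
             ask_user_to_log_in: Callable[[str], str]) -> str:
    """Stand-in for an agent task that needs an authenticated call."""
    token = vault.get_token(service, ask_user_to_log_in)
    return f"called {service} with {token}"


login_prompts: List[str] = []


def fake_login(service: str) -> str:
    # Simulates the user completing a login; records that they were asked.
    login_prompts.append(service)
    return f"token-for-{service}"


vault = CredentialVault()
first = run_task(vault, "booking.example", fake_login)
second = run_task(vault, "booking.example", fake_login)
```

In this sketch the user is prompted on the first task only; the second task reuses the stored token without involving a human.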
00:03:57.540 | A second one is-- this is the easiest one, which I'm sure pretty much all of you do here--
00:04:02.620 | is agent-readable docs.
00:04:04.300 | And Stripe does a great job of this.
00:04:05.860 | Basically, any doc URL, just append a .md, and you'll get a clean markdown file, no fluff.
00:04:12.680 | The agent can consume it as easy as-- very, very easily.
00:04:15.620 | But moreover, the llms.txt standard-- I'm sure all of you have seen it.
00:04:20.280 | It's actually on the website itself.
00:04:21.460 | You can see it on the conference website.
00:04:23.700 | So if you don't have an llms.txt in your docs, you definitely should.
00:04:28.080 | It makes it super easy for the agent to go and read what you all are doing.
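As a rough illustration of the two conventions mentioned here, the helpers below build the URLs an agent might try. The function names and the `docs.example.com` host are assumptions, and whether a given site actually serves a `.md` variant or an `llms.txt` at its root has to be checked per site.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(doc_url: str) -> str:
    """Append .md to the path of a docs URL, keeping query and fragment."""
    parts = urlsplit(doc_url)
    path = parts.path.rstrip("/")
    if not path.endswith(".md"):
        path += ".md"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def llms_txt_url(site_url: str) -> str:
    """Return the /llms.txt location at the root of the given site."""
    parts = urlsplit(site_url)
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


md = markdown_url("https://docs.example.com/payments/quickstart")
index = llms_txt_url("https://docs.example.com/guide/")
```

An agent could fetch `index` first to discover what documentation exists, then request the markdown variant of the pages it needs.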
00:04:34.980 | The third thing we found was API-first design.
00:04:37.700 | And I think this is actually the most critical.
00:04:39.620 | Basically, if an agent can't see the functionality, it's really hard for it to do.
00:04:44.640 | And I strongly believe that the best way for an agent to interact with any tool is through
00:04:50.160 | a machine-native interface.
00:04:52.340 | And I put API here, and I won't talk about MCP.
00:04:55.500 | I'm sure a lot of people talk about MCP.
00:04:57.460 | Will MCP be here or not here?
00:04:59.060 | I don't know.
00:05:00.400 | But basically, an API is the underlying tech that you have to have, have to be exposed,
00:05:05.240 | so the agent can access what they want the most efficiently.
00:05:11.100 | And so Neon and Netlify and Supabase and these companies do a really good job of this.
00:05:16.460 | As far as I know, all the key things that an agent needs to use their services can be accessed
00:05:21.200 | via an API.
00:05:23.580 | And now, all these things are really great, and I think this is all the right direction.
00:05:28.120 | But is there more?
00:05:29.240 | Is that it?
00:05:30.240 | Is it just an API?
00:05:31.240 | Is it just, like, make your docs more readable?
00:05:35.220 | And so what I wanted to think about is the first definition of Sean's.
00:05:38.300 | And I think it's missing one key part for all of us who are building tools to think about.
00:05:43.040 | And that's the word, actually, autonomously.
00:05:46.340 | So how easily can an agent autonomously access, autonomously understand, autonomously operate
00:05:51.040 | within this environment?
00:05:52.500 | I think this word is actually quite important, and it makes you sort of think about what you're
00:05:57.240 | doing in a different way.
00:05:59.960 | Because if you give your agent a task, and it can do a lot, but it always has to fall back
00:06:05.780 | to you to achieve its task, I don't think that is the future.
00:06:09.900 | And I think we should all go back to the drawing board and think about this.
00:06:14.280 | Basically, what happens if there are no humans around to click the buttons, to debug errors,
00:06:19.940 | to whatever it may be?
00:06:21.060 | I think that's where our work starts as tool builders.
00:06:26.340 | And so if you're not actually solving that, then I actually think you are just building for
00:06:30.080 | the past.
00:06:31.080 | And just as a note, swyx basically coerced me into doing this talk, because he said that
00:06:39.100 | agent experience doesn't exist, it's just a wrapper around developer experience.
00:06:42.540 | And so I think if you're not thinking about how an agent can autonomously do its task, then
00:06:47.780 | you basically are doing that.
00:06:48.980 | And I don't think everyone's doing that.
00:06:51.220 | And just really quickly, I sort of skipped off who I am and why the hell I'm talking about
00:06:54.480 | these things, and why I might know something, might not.
00:06:57.480 | Well, we'll see at the end of it.
00:06:59.480 | So first company in the early 2000s, so I'm dated, very old person, started by building data centers
00:07:06.620 | and server rooms, HP servers, IBMs, Hypervisor, VMware, all these things.
00:07:11.160 | So actually screwing in servers and running cables and installing Windows servers via CDs.
00:07:16.920 | Yeah.
00:07:17.920 | Exactly.
00:07:18.920 | Some people remember CDs.
00:07:21.360 | After that, sort of sold that, created the very first browser-based IDE in 2009.
00:07:25.600 | So Replit, like whatever, 15 years ago.
00:07:28.600 | That was super early, so we had to create our own IDEs, our own orchestrators, our own isolation.
00:07:34.360 | More recently, led the developer experience at a company called Infobip.
00:07:39.920 | It's a multi-billion-dollar company that competes with Twilio, which you haven't really heard
00:07:44.040 | of probably.
00:07:45.040 | But basically, it's a communications platform as a service.
00:07:47.540 | So one API for sending emails, SMS, voice, and all those nice things.
00:07:52.960 | And as of late, working at Daytona with a bunch of people that I've worked with in my
00:07:58.160 | older companies.
00:07:59.360 | And I have to say, if anyone is a founder or founding a company, if you can work with
00:08:04.480 | people that you've worked with historically, you should do that.
00:08:08.000 | It makes it so much more fun.
00:08:10.860 | So yeah.
00:08:11.860 | Daytona basically is a secure and elastic infrastructure purposely built for running AI-generated code.
00:08:18.580 | Basically that means we created an agent-native runtime.
00:08:21.440 | What agent-native means, I'll hopefully explain a bit.
00:08:25.000 | Or more simply, like sandboxes.
00:08:27.840 | So we, as Daytona, give agents a computing environment that they can use to run code,
00:08:33.400 | do data analysis, reinforcement learning, computer use, or more recently, I've seen people use
00:08:38.560 | it for agents to play video games, like Counter-Strike and whatnot.
00:08:43.020 | So people are doing funny things with these things.
00:08:46.580 | And so Daytona is basically what a laptop is to a human.
00:08:49.820 | That is what a Daytona runtime is for an agent, sort of.
00:08:54.300 | And so when we were starting to build this new company or this new product, we looked at what
00:08:59.540 | it means to build something for agents.
00:09:00.860 | And we took the principles of what agent experience is.
00:09:03.920 | And these are the first things that sort of we built through that.
00:09:07.060 | And so one is speed.
00:09:08.860 | Basically, if you have a tool for an agent and your agent is in interactive mode, so think
00:09:14.420 | of like Claude or ChatGPT, if you're the user, you don't want to waste time for tools to turn on.
So we created something that spins up in like 27 milliseconds.
Obviously, API first, so can an agent, via an API, turn on a machine, turn it off, you know, clone
00:09:28.980 | it, delete it, whatnot.
00:09:31.660 | After that, we thought about what more things can it do?
00:09:33.760 | So like, it's really fast, the agent can spin it up, but what happens when it gets inside?
00:09:38.100 | And so we thought it wasn't ideal for an agent to parse an output from a terminal.
00:09:42.080 | So we preloaded all of them with headless tools.
00:09:45.080 | So a file explorer, Git client, LSP, terminal, and all these things.
00:09:49.280 | And so this sort of aligns with the original definition is like we're helping agents do
00:09:52.820 | things faster.
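To make the contrast with parsing terminal output concrete, here is a small sketch of a tool call that returns structured fields instead of raw text. It uses only the Python standard library; `CommandResult` and `run_structured` are hypothetical names and say nothing about how Daytona's own headless tools are implemented.

```python
import json
import subprocess
import sys
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class CommandResult:
    """Fields an agent can read directly, without scraping a terminal."""
    exit_code: int
    stdout: str
    stderr: str


def run_structured(args: List[str]) -> CommandResult:
    # Capture both streams as text and keep the exit code as its own field.
    completed = subprocess.run(args, capture_output=True, text=True)
    return CommandResult(completed.returncode, completed.stdout, completed.stderr)


result = run_structured([sys.executable, "-c", "print('hello')"])
payload = json.dumps(asdict(result))  # what an API response body might carry
```

The agent can branch on `exit_code` and read `stderr` separately, rather than guessing from interleaved terminal text whether a command succeeded.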
00:09:54.660 | And we thought like that was it until we started getting like really interesting users and customers
00:10:01.080 | that said like, oh shit, like our agent can't do this.
00:10:04.600 | And so we have to put in a human in that.
00:10:07.220 | And so I'm gonna walk you through some of these new primitives, maybe a bigger word for that,
00:10:13.080 | maybe not, or features that we found with new customers building agents on top of us.
00:10:20.460 | And these, just to be clear, I don't think these features are something that everyone can replicate,
00:10:24.580 | but it's more of like how we thought about solving these problems and how we've seen these problems
00:10:29.020 | and maybe inspires you to think about how you're building your tools for agents.
00:10:34.880 | So the first thing is a declarative image builder.
00:10:39.000 | So Daytona is a sandbox that uses any Docker image off the shelf.
00:10:43.640 | So as a template, then the agent can, you know, spin up a sandbox with that template.
00:10:48.380 | Obviously, if an agent needs to add a dependency, it can use, you know, the API terminal and just
00:10:55.400 | like pip install whatever it wants, great.
00:10:57.800 | But if an agent has to now install like 20 new things and do it over and over again, that's
00:11:01.600 | just like a waste of time and resources. And the way you would solve that right now is like a human
00:11:06.320 | goes in to Daytona, you know, creates the new Docker container, or goes to the laptop, creates a new Docker container,
pushes it to our container registry, and the agent can use that. But that, like, takes time and effort.
The other option is like an agent tries to build a Docker image on its own and then pushes it to the container
00:11:20.720 | registry, which is brittle, takes time, breaks, and probably won't do very well. So how do you think about that?
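The declarative image builder that the talk turns to at this point can be pictured roughly as follows: the agent states a base image, packages, and commands as data, and a builder turns that into build instructions. This is a sketch under assumptions, not Daytona's API; `ImageSpec`, its fields, and the example image tag are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageSpec:
    """Hypothetical declarative description of an environment."""
    base_image: str
    pip_packages: List[str] = field(default_factory=list)
    run_commands: List[str] = field(default_factory=list)

    def to_dockerfile(self) -> str:
        # Render the declaration into plain Dockerfile instructions.
        lines = [f"FROM {self.base_image}"]
        if self.pip_packages:
            lines.append("RUN pip install " + " ".join(self.pip_packages))
        lines.extend(f"RUN {command}" for command in self.run_commands)
        return "\n".join(lines) + "\n"


spec = ImageSpec(
    base_image="python:3.12-slim",
    pip_packages=["numpy", "pandas"],
    run_commands=["mkdir -p /workspace"],
)
dockerfile = spec.to_dockerfile()
```

Because the agent only emits data, there is no Docker toolchain for it to drive and nothing for a human to push by hand; the build itself would happen on the service side.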
00:11:28.240 | Well, we created something like a declarative image builder where an agent can say,
Oh, I need a net new one. This is my base image. These are the things I want installed. Run these
00:11:37.440 | commands. We build it on the fly and open up a sandbox. Basically, the agent can now at any moment
00:11:43.360 | in time say, I need something net new. I can make it on my own, and I can launch it on my own. Solves it
end to end for itself. The second thing is something we humans take for granted: if you're programming on your
00:11:57.840 | laptop, your environments are usually probably Docker containers on your machine. And if you have this
00:12:02.960 | large data set, you can really easily share it with all the environments that you have on your machine.
00:12:09.120 | But if an agent, and we found some users that really need this, they need like 100 gigabyte data sets in
00:12:13.760 | every machine that they have, there is no local laptop context. Every environment is isolated on its own.
So how do you make it more efficient, so that an agent doesn't have to download something from, or upload it to,
S3 every single goddamn time? And so basically, we created something called Daytona Volumes.
00:12:33.120 | The agent can invoke one of these, any size it wants, uploads it once, and maps it as a network drive,
00:12:40.960 | or mounts it as a network drive on every single machine. The last thing I'm just going to go through,
and then we'll finish off on this feature stuff, is quite different from the others. But I put it in
00:12:52.160 | here because I feel it's very much unique to agents. And that is the ability to execute things in parallel.
So unlike an agent, we as humans can work on one machine, maybe two if we're super dialed in.
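The parallel pattern the talk describes at this point can be sketched with a thread pool: run every candidate at once, then keep the best result. Forking real sandboxes is replaced here by a plain scoring function, and `try_in_parallel` and `score` are hypothetical names used only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def try_in_parallel(candidates: List[int],
                    attempt: Callable[[int], float]) -> Tuple[int, float]:
    """Run one attempt per candidate concurrently and return the best one."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        # pool.map preserves input order, so scores[i] belongs to candidates[i].
        scores = list(pool.map(attempt, candidates))
    best_index = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best_index], scores[best_index]


def score(x: int) -> float:
    # Stand-in for "run the experiment in a forked sandbox and grade it".
    return -float((x - 7) ** 2)


best_candidate, best_score = try_in_parallel(list(range(10)), score)
```

Instead of trying one option, rolling back, and trying the next, all ten attempts run together and only the comparison at the end is sequential.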
00:13:04.640 | An agent can try things in parallel all at once. So instead of going, try this outcome, it sucks, go back,
00:13:16.000 | try it again, it sucks, go back. It can basically take a machine, fork it five times, 10 times, 100,000 times,
00:13:22.240 | go through all of these, and then come to an output. And so again,
these are just things that we have seen as we've worked with users,
00:13:34.560 | what their agents need to get their job done. And so if you ask, like, what else is there in
00:13:39.520 | agent experience? The short answer is, like, I have no idea. The reason I say that is because
00:13:47.840 | people that are building agents are building them right now. And you're only now, or we are only now,
finding out what they actually need to get the task done. And so what I want to instill in you all
00:14:01.120 | is, if you haven't thought about, or if you have a tool that at some point needs a human in the loop,
00:14:07.920 | then you probably haven't solved or thought about how to solve this. So the beginning of the talk is
called, basically, agent experience is the only experience that matters. And I say this not because
00:14:18.240 | we as humans are disappearing, but agents by far will be the largest user out there. And less and
less will we have people behind the screen reading the logs, clicking on buttons, typing into terminals.
00:14:29.120 | And lastly, just one thing I'll leave you with is, when we started Daytona, basically, we wanted the best
00:14:35.360 | developer experience that we could have. And we invested a lot in, like, the terminal. And there was a talk
earlier here today about someone that has an amazing, you know, terminal UI. And I think that's great.
00:14:44.960 | We started with that as well. But as a small company, you sort of focus on what you need,
00:14:49.120 | and on what's most important. And right now, what's most important is, yeah, great to have a CLI.
00:14:55.360 | But can your agent do your task end to end? Because basically, if your agent can't use your product
in the future, absolutely no one will. So with that, I thank you for your time. If anyone has any
00:15:10.720 | interest in this, you can take a look here, use our GitHub, it's open source. Also, we have a booth
00:15:14.960 | downstairs in Expo Hall. Thank you so much.