
AX is the only Experience that Matters - Ivan Burazin, Daytona


Transcript

The number of agents in the world, I believe, won't just match the number of humans, but they will basically be the number of humans to the power of n. And what is the power of n? I have really no idea. But we can see already today that we are using multiple of them, and more and more are spinning up.

And so it will be a very, very large number. And this isn't just hypothetical. I mean, we don't have a lot of data around this. I actually researched a lot and tried to find what we have. But some key elements that we do have is that the first one may be not that important, but 25% of YC startups say that AI writes 95% of their code.

But the second part is actually quite interesting, which is that 37% of the latest YC batch are actually building agents as their products. So they're not building co-pilots. They're not building autocomplete. They're not building legacy SaaS companies. Like, agents are the new product. And I think that's actually quite interesting and aligns with what we're looking at.

But the most uncomfortable truth for people building tools for agents, which we are and I know a lot of people are as well, is that most of the tools that are built today break the moment you remove a human from the loop. As we're not entering a world where agents are assisting developers, agents will definitely be the developers.

I mean, they will be the number one people using that. So basically, if you're not building dev tools-- if you're building dev tools for humans, you're basically building for the past. And what I think about when we build our things, and I think we all should do, is like, what does it mean to actually build tools for the future?

And I think it means building for agents. And the term for building for agents is agent experience. And a good friend of mine, Matt, who I guess a lot of you here know, who's the co-founder of Netlify, coined the term agent experience. And it's supposed to continue the evolution of, like, user experience, customer experience, and mostly developer experience.

But the definition that I sort of found that works the best or defines it the best was the one from Sean that works at Netlify, so not Matt necessarily, which says, how easily can agents access? How easily can they understand? And how easily can they operate within digital environments to achieve the goal that the user defined?

And I think that's a really good definition. And looking at this, I wanted to see, like, who is actually implementing agent experience? Who is looking at this definition? Who is building their tools like this? And I found some companies-- there's more than the ones I put in here, of course.

So like, if you guys are building something like this, and I didn't put you, I'm very, very sorry. Like, I just wanted to put a few that I've talked to and looked at. And so I think seamless authentication is a really, really important one. And if your agent-- if you give an agent a task to do and it has to log in, for the most part, it will break.

And moreover, you don't want to give your passwords to the agent. That is definitely not a good idea. And so a company like Arcade, which, randomly, if he's around-- I met the founder yesterday. I had not met him before that, but he was ready to talk. What they solve is if your agent tries to authenticate into, like, Delta or Booking.com or wherever it is, it can fall back to the user, the user can log in, and the agent can go off and do its job.

Moreover, you can also add credentials inside of Arcade before that, so your agent can go and do what needs to be done. So authenticate once, agent goes off, does its job. A second one is-- this is the easiest one, which I'm sure pretty much all of you do here-- is agent-readable docs.

And Stripe does a great job of this. Basically, any doc URL, just append a .md, and you'll get a clean markdown file, no fluff. The agent can consume it as easy as-- very, very easily. But moreover, the llms.txt standard-- I'm sure all of you have seen it. It's actually on the website itself.

You can get it for the conference website. So if you don't have an llms.txt in your docs, you definitely should. It makes it super easy for the agent to go and read what you all are doing. The third thing we found was API-first design. And I think this is actually the most critical.
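To make the agent-readable docs point above concrete, here is a minimal Python sketch of how an agent might derive the Markdown URL for a docs page and the location of a site's llms.txt. The function names and the docs.example.com URLs are assumptions made for illustration; the sketch only builds URLs and assumes the site follows the "append .md" convention and serves llms.txt at its root.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(doc_url: str) -> str:
    """Return the URL for the Markdown version of a docs page.

    Assumes the site follows the "append .md" convention; a root URL with
    no page path is returned unchanged.
    """
    parts = urlsplit(doc_url)
    path = parts.path.rstrip("/")
    if not path:
        return doc_url  # nothing to append .md to
    if not path.endswith(".md"):
        path += ".md"
    # Keep the query string and fragment intact.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def llms_txt_url(site_url: str) -> str:
    """Return the conventional root location of llms.txt for a site."""
    parts = urlsplit(site_url)
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))
```

An agent would then fetch these URLs with whatever HTTP client it already has, and fall back to the HTML page if the Markdown variant is not served.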

Basically, if an agent can't see the functionality, it's really hard for it to do. And I strongly believe that the best way for an agent to interact with any tool is through a machine-native interface. And I put API here, and I won't talk about MCP. I'm sure a lot of people talk about MCP.

Will MCP be here or not here? I don't know. But basically, an API is the underlying tech that you have to have, that has to be exposed, so the agent can access what they want the most efficiently. And so Neon and Netlify and Supabase and these companies do a really good job of this.

As far as I know, all the key things that an agent needs to use their services can be accessed via an API. And now, all these things are really great, and I think this is all the right direction. But is there more? Is that it? Is it just an API?

Is it just, like, make your docs more readable? And so what I wanted to think about is the first definition of Sean's. And I think there's missing one key part for all of us who are building tools to think about. And that's the word, actually, autonomously. So how easily can an agent autonomously access, autonomously understand, autonomously operate within this environment?

I think this word is actually quite important, and it makes you sort of think about what you're doing in a different way. Because if you give your agent a task, and it can do a lot, but it always has to fall back to you to achieve its task, I don't think that is the future.

And I think we should all go back to the drawing board and think about this. Basically, what happens if there are no humans around to click the buttons, to debug errors, to whatever it may be? I think that's where our work starts as tool builders. And so if you're not actually solving that, then I actually think you are just building for the past.

And just as a note, swyx basically coerced me into doing this talk, because he said that agent experience doesn't exist, it's just a wrapper around developer experience. And so I think if you're not thinking about how an agent can autonomously do its task, then you basically are doing that. And I don't think everyone's doing that.

And just really quickly, I sort of skipped over who I am and why the hell I'm talking about these things, and why I might know something, or might not. Well, we'll see at the end of it. So, first company in the early 2000s, so I'm dated, very old person, started by building data centers and server rooms, HP servers, IBMs, hypervisors, VMware, all these things.

So actually screwing in servers and running cables and installing Windows servers via CDs. Yeah. Exactly. Some people remember CDs. After that, sort of sold that, created the very first browser-based IDE in 2009. So like Replit, whatever, 15 years ago. That was super early, so we had to create our own IDEs, our own orchestrators, our own isolation.

More recently, led the developer experience at a company called Infobip. It's a multi-billion-dollar company that competes with Twilio, which you haven't really heard of probably. But basically, it's a communications platform as a service. So one API for sending emails, SMS, voice, and all those nice things. And as of late, working at Daytona with a bunch of people that I've worked with in my older companies.

And I have to say, if anyone is a founder or founding a company, if you can work with people that you've worked with historically, you should do that. It makes it so much more fun. So yeah. Daytona basically is a secure and elastic infrastructure purposely built for running AI-generated code.

Basically that means we created an agent-native runtime. What agent-native means, I'll hopefully explain a bit. Or more simply, like sandboxes. So we, as Daytona, give agents a computing environment that they can use to run code, do data analysis, reinforcement learning, computer use, or more recently, I've seen people use it for agents to play video games, like Counter-Strike and whatnot.

So people are doing funny things with these things. And so Daytona is basically what a laptop is to a human. That is what a Daytona runtime is for an agent, sort of. And so when we were starting to build this new company or this new product, we looked at what it means to build something for agents.

And we took the principles of what agent experience is. And these are the first things that sort of we built through that. And so one is speed. Basically, if you have a tool for an agent and your agent is in interactive mode, so think of like Claude or ChatGPT, if you're the user, you don't want to waste time for tools to turn on.

So we created something that spins up in like 27 milliseconds. Obviously, API first, so an agent can, via an API, turn on a machine, turn it off, you know, clone it, delete it, whatnot. After that, we thought about what more things can it do? So like, it's really fast, the agent can spin it up, but what happens when it gets inside?

And so we thought it wasn't ideal for an agent to parse an output from a terminal. So we preloaded all of them with headless tools. So a file explorer, Git client, LSP, terminal, and all these things. And so this sort of aligns with the original definition, in that we're helping agents do things faster.

And we thought like that was it until we started getting like really interesting users and customers that said like, oh shit, like our agent can't do this. And so we have to put in a human in that. And so I'm gonna walk you through some of these new primitives, maybe a bigger word for that, maybe not, or features that we found with new customers building agents on top of us.

And these, just to be clear, I don't think these features are something that everyone can replicate, but it's more of like how we thought about solving these problems and how we've seen these problems and maybe inspires you to think about how you're building your tools for agents. So the first thing is a declarative image builder.

So Daytona is a sandbox that uses any Docker image off the shelf. So as a template, then the agent can, you know, spin up a sandbox with that template. Obviously, if an agent needs to add a dependency, it can use, you know, the API terminal and just like pip install whatever it wants, great.

But if an agent has to now install like 20 new things and do it over and over again, that's just like a waste of time and resources. And the way you would solve that right now is like a human goes into Daytona, you know, creates the new Docker container, or goes to the laptop, creates a new Docker container, pushes to our container registry, and the agent can use that.

But that, like, takes time and effort. The other option is like an agent tries to build a Docker image on its own and then pushes it to the container registry, which is brittle, takes time, breaks, and probably won't go very well. So how do you think about that? Well, we created something like a declarative image builder where an agent can say, Oh, I need a net new one.

This is my base image. These are the things I want installed. Run these commands. We build it on the fly and open up a sandbox. Basically, the agent can now at any moment in time say, I need something net new. I can make it on my own, and I can launch it on my own.
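The declarative idea described above (a base image, things to install, commands to run) can be sketched in a few lines of Python. This is not Daytona's API: `ImageSpec` and `render_dockerfile` are hypothetical names, and the sketch only renders Dockerfile text from a declarative description to show the shape of what an agent would hand over instead of building and pushing an image itself.

```python
from dataclasses import dataclass, field


@dataclass
class ImageSpec:
    """Hypothetical declarative description of a sandbox image."""

    base_image: str
    pip_packages: list[str] = field(default_factory=list)
    run_commands: list[str] = field(default_factory=list)


def render_dockerfile(spec: ImageSpec) -> str:
    """Turn the declarative spec into Dockerfile text (illustrative only)."""
    lines = [f"FROM {spec.base_image}"]
    if spec.pip_packages:
        # Install all requested packages in a single layer.
        lines.append("RUN pip install --no-cache-dir " + " ".join(spec.pip_packages))
    # Each extra command becomes its own RUN instruction, in order.
    lines.extend(f"RUN {cmd}" for cmd in spec.run_commands)
    return "\n".join(lines) + "\n"


spec = ImageSpec(
    base_image="python:3.12-slim",
    pip_packages=["numpy", "pandas"],
    run_commands=["mkdir -p /workspace"],
)
dockerfile_text = render_dockerfile(spec)
```

The point of the design is that the agent only states what it wants; building, caching, and launching the result is the platform's job.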

Solves it end to end for itself. The second thing is what we humans take for granted is if you're programming on your laptop, your environments are usually probably Docker containers on your machine. And if you have this large data set, you can really easily share it with all the environments that you have on your machine.

But if an agent, and we found some users that really need this, they need like 100 gigabyte data sets in every machine that they have, there is no local laptop context. Every environment is isolated on its own. So how do you make it more efficient that an agent doesn't have to download something from or upload from S3 every single goddamn time?

And so basically, we created something called Daytona Volumes. The agent can invoke one of these, any size it wants, uploads it once, and maps it as a network drive, or mounts it as a network drive on every single machine. The last thing I'm just going to go through, and then we'll finish off on these feature stuff, is quite different from the others.

But I put it in here because I feel it's very much unique to agents. And that is the ability to execute things in parallel. So an agent, unlike a human, we as humans can work on one machine, maybe two if we're super dialed in. An agent can try things in parallel all at once.

So instead of going, try this outcome, it sucks, go back, try it again, it sucks, go back. It can basically take a machine, fork it five times, 10 times, 100,000 times, go through all of these, and then come to an output. And so again, these are just things that we have seen as we've worked with users, and that they need their agents, what their agents need to get their job done.
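The "fork it and try everything at once" pattern described above can be approximated locally with a thread pool. In this minimal Python sketch, `try_in_parallel` is a hypothetical helper and `evaluate` stands in for whatever would run one candidate inside its own forked sandbox and score the result; no real sandbox API is used.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Optional, Tuple


def try_in_parallel(
    candidates: Iterable[str],
    evaluate: Callable[[str], float],
    max_workers: int = 8,
) -> Optional[Tuple[str, float]]:
    """Evaluate every candidate concurrently and return the best (candidate, score).

    Returns None when there are no candidates. Ties go to the earliest candidate.
    """
    items = list(candidates)
    if not items:
        return None
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so scores[i] belongs to items[i].
        scores = list(pool.map(evaluate, items))
    best = max(range(len(items)), key=scores.__getitem__)
    return items[best], scores[best]
```

Instead of try, fail, go back, try again, the agent submits all of its candidate approaches at once and keeps only the best outcome.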

And so if you ask, like, what else is there in agent experience? The short answer is, like, I have no idea. The reason I say that is because people that are building agents are building them right now. And you're only now, or we are only now, finding out what they actually need to get the task done.

And so what I want to instill in you all is, if you have a tool that at some point needs a human in the loop, then you probably haven't solved or thought about how to solve this. So the talk is called, basically, agent experience is the only experience that matters.

And I say this not because we as humans are disappearing, but agents by far will be the largest user out there. And less and less will we have people behind the screen reading the logs, clicking on buttons, typing into terminals. And lastly, just one thing I'll leave you with is, when we started Daytona, basically, we wanted the best developer experience that we could have.

And we invested a lot in, like, the terminal. And there was a talk earlier here today by someone that has an amazing, you know, terminal UI. And I think that's great. We started with that as well. But as a small company, you sort of focus on what you need, and on what's most important.

And right now, what's most important is, yeah, great to have a CLI. But can your agent do your task end to end? Because basically, if your agent can't use your product in the future, absolutely no one will. So with that, I thank you for your time. If anyone has any interest in this, you can take a look here, use our GitHub, it's open source.

Also, we have a booth downstairs in Expo Hall. Thank you so much.