[Full Workshop] How to add secure code interpreting in your AI app: Vasek Mlejnsky

00:00:27.160 |
and you can code along: how to add AI code interpreting 00:00:34.780 |
So what we will be building will be very similar 00:00:47.020 |
to Anthropic's AI Artifacts app that they just released. 00:00:56.120 |
The way it looks is that you have a chat on the left, 00:01:04.280 |
And what this preview does is that Claude can write code. 00:01:10.500 |
In Claude's case, it's supported just for HTML, JS, 00:01:18.780 |
And then it actually renders and runs the code right next to the chat. 00:01:24.020 |
We will build something very similar, with a few caveats. 00:01:31.400 |
The first is that we will be able to run AI-generated Python code, 00:01:41.140 |
I will also show you, in the end, how to run other-- 00:01:51.420 |
And we will be using E2B, which is a company that I co-founded, 00:01:58.340 |
for secure code interpreting, and you can code along while we are building it. 00:02:06.420 |
So here's a quick demo of what we are going to build. 00:02:11.660 |
On the left, we will be chatting with Sonnet 3.5, and every time it generates code, 00:02:18.260 |
it's actually implemented through function calling. 00:02:21.820 |
We will run that code in a secure sandbox that can run pretty much anything 00:02:25.980 |
that a Linux machine or Ubuntu machine can run. 00:02:30.160 |
And you can spawn many, many of these sandboxes for secure code interpreting. 00:02:40.240 |
The workshop is made for coding along, but of course you can just watch. 00:02:45.920 |
But if you are interested in actually following me, 00:02:48.920 |
you can go to our repository and clone the cookbook. 00:03:06.100 |
One of them is the E2B API key, which you can get via our documentation. 00:03:14.780 |
So E2B.dev/docs, then you go to API key, you sign up, and we give you your API key. 00:03:29.460 |
In case you don't have that, or you don't want to get it, I created one for this workshop. 00:03:36.460 |
And so if you go to -- and you are free to -- well, I kind of trust you that you will not misuse it. 00:03:46.460 |
So if you go to dub.sh/e2b-workshop, you can get the API key for free. 00:03:56.940 |
So I know everyone had lunch, so you are probably kind of tired. 00:04:11.640 |
So I will give you a few quick minutes -- or a few seconds, or tens of seconds -- to set 00:04:22.840 |
But just to get a better idea, how many of you want to follow along? 00:04:33.340 |
Oh, that's probably because, like, we have images there that you can generate with -- oh, 00:04:52.160 |
By the way, if anyone has any questions, feel free to ask during the workshop. 00:04:56.940 |
If anything isn't clear, I will repeat the question and try to give you an answer. 00:05:05.940 |
Is the code interpreter running on our machines, or -- 00:05:24.940 |
So the question is, is the code interpreter running on your machine, or somewhere else? 00:05:29.940 |
The code interpreter isn't running on your machine. 00:05:32.940 |
It's using a tool called E2B, which I'm a co-founder of. 00:05:41.940 |
We're basically building an open source runtime for AI agents that you can -- and very soon 00:05:50.940 |
will be able to -- self-host on AWS, GCP, Azure, and other big cloud providers. 00:06:00.940 |
What's the whole problem and challenge with code interpreting around security? 00:06:06.940 |
I was thinking that maybe first we implement the app so you can kind of get an idea of how 00:06:13.940 |
it works. And then during Q&A, or after that, I can get into the security and the challenges 00:06:25.940 |
So how does the repo cloning look? 00:06:47.940 |
Let me -- I can probably very quickly just create a repo without the other examples. 00:06:58.940 |
And I will just put the example that we are interested in there, which should be much smaller. 00:07:22.940 |
So if you go to GitHub -- e2b-dev/e2b-cookbook -- you go to examples. 00:07:39.940 |
And we will be building an open source version of Anthropic's new Artifacts UI that 00:08:34.620 |
Let me just quickly set up the new repository. 00:08:56.940 |
If you go to this URL, I will show it on a big screen. 00:09:09.880 |
If you go to this URL, which is my personal GitHub, and go to /workshop-repo, you should 00:09:41.740 |
You should be able to clone it now. 00:10:17.740 |
So if you want to code along, please go to this GitHub repository and clone it. 00:10:55.680 |
After you clone it, just install the dependencies, and you will also need those API keys. 00:11:07.680 |
So create a file .env.local, which will need the E2B API key and the Anthropic API key. 00:11:24.680 |
So if you go to E2B.dev, click get started, and go to API key. 00:11:32.680 |
If you quickly sign up, you can get your API key. 00:11:36.680 |
If you just want to follow along -- if you just want to watch -- you don't need that API key. 00:11:47.680 |
And you will also need Anthropic's API key. 00:11:51.680 |
And in case you don't want to spend your money on that, I created an Anthropic API key just for this workshop. 00:12:02.680 |
So if you go to this link, it should open a 1Password share page. 00:12:11.680 |
And you can get your Anthropic API key there. 00:12:20.680 |
So once you have your API key set up, install dependencies, and then just run npm run dev. 00:12:36.680 |
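For reference, here is roughly what that file should look like. This is a minimal sketch: the variable names below are the defaults the two SDKs read, but double-check the repo's README.

```
# .env.local -- assumed variable names; both SDKs read these by default
E2B_API_KEY=your-e2b-api-key
ANTHROPIC_API_KEY=your-anthropic-api-key
```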
So what we just did is start a Next.js app, which is like a starting point. 00:12:43.680 |
And you can now follow and code along. 00:12:47.680 |
And I will show you how to add the code interpreting there. 00:13:01.680 |
It just shows an input -- this is a scaffold of the app where you can ask something. 00:13:06.680 |
So let's say we want to generate a random chart. 00:13:10.680 |
And actually now it won't work because we need to implement everything. 00:13:13.680 |
So I just hit enter, and you should see a 405 in your terminal. 00:13:27.680 |
So now that we have everything set up -- in case I'm too fast, just scream at me. 00:13:41.680 |
I want to quickly show you around the code, the repository, and what we will be doing. 00:13:46.680 |
So there are two important parts of this project. 00:13:49.680 |
First, we are going to use Vercel's AI SDK. 00:13:56.680 |
In the JavaScript world, this is essentially an easy and nice way to connect to different 00:14:04.680 |
models and stream their output from your backend to your frontend. 00:14:11.680 |
It's already installed, from when you ran npm install. 00:14:17.680 |
In case you want to check out the docs after the workshop, or during, go to this link. 00:14:24.680 |
And the second thing that we will be using is the code interpreter SDK. 00:14:30.680 |
So that's something that we built at E2B. 00:14:34.680 |
And essentially what it does -- it's open source. 00:14:37.680 |
And what it does is let you create your own custom code interpreter, 00:14:42.680 |
where you can predefine the whole environment. 00:14:53.680 |
When you are using our cloud, it runs on our cloud. 00:14:59.680 |
And specifically for this SDK, inside the sandbox there's a Jupyter server running, 00:15:09.680 |
to which you can send Python, JavaScript, R, and Java code, 00:15:18.680 |
and you get back standard output and error output. 00:15:25.680 |
So the reason we are using a Jupyter server there is because that's what we've noticed has been one of the most frequent use cases from our users and customers. 00:15:39.680 |
And generally, at least during the time of GPT-4 and 4 Turbo, the code that you usually got from the LLM was Python -- data science, Jupyter notebook code. 00:15:58.680 |
And it's really great for visualization, which is a big use case for what we are seeing with users. 00:16:10.680 |
This whole code interpreter SDK is built on a general sandbox, a small VM. 00:16:15.680 |
We start those VMs pretty quickly -- in around a few hundred milliseconds. 00:16:19.680 |
Every time you create an instance of code interpreter, we actually create that sandbox. 00:16:24.680 |
So this is kind of like a wrapper around our sandbox SDK. 00:16:28.680 |
It's also open source, which is here. 00:16:33.680 |
And so if you want to build something pretty custom, you can do that. 00:16:36.680 |
So code interpreter is one specific use case of how you can use the sandbox. 00:16:41.680 |
You can look at the sandbox as a general Ubuntu machine that's made with security in mind for running AI-generated code. 00:16:52.680 |
So, as you noticed, when we were running this app, I wrote something, hit enter, and nothing happened. 00:17:02.680 |
So first we need to implement an endpoint to call the LLM. 00:17:14.680 |
And for that, we are using Vercel's serverless function. 00:17:18.680 |
And there's a predefined file called route.ts. 00:17:22.680 |
And there, commented out, is a function that will handle our POST requests, which we need to implement. 00:17:31.680 |
And here we will be using Vercel's AI SDK. 00:17:40.680 |
And now we will do just a simple implementation of messaging with Claude. 00:17:48.680 |
First, we need to parse those messages coming from the request. 00:18:00.680 |
And then we actually use the AI SDK -- Vercel's AI SDK. 00:18:11.680 |
And from that, we will be using the stream text method. 00:18:15.680 |
And so the model we are using is Anthropic's -- the Sonnet one. 00:18:37.680 |
And so I already predefined a prompt in a file. 00:18:48.680 |
It's a simple prompt that tells the LLM, Sonnet, that it's a skilled Python developer that can 00:19:11.680 |
That should be pretty much it for the basic implementation. 00:19:14.680 |
Now we just need to return the stream from our POST endpoint. 00:19:42.680 |
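A minimal sketch of what this basic endpoint can look like, assuming the AI SDK v3-era APIs (streamText, StreamingTextResponse) and an assumed path for the predefined prompt file:

```ts
// app/api/chat/route.ts -- basic chat endpoint (sketch)
import { anthropic } from '@ai-sdk/anthropic';
import { streamText, StreamingTextResponse } from 'ai';

import { prompt as systemPrompt } from '@/lib/prompt'; // predefined prompt file (path assumed)

export async function POST(req: Request) {
  // Parse the chat messages coming from the frontend
  const { messages } = await req.json();

  // Stream a completion from Claude 3.5 Sonnet
  const result = await streamText({
    model: anthropic('claude-3-5-sonnet-20240620'),
    system: systemPrompt,
    messages,
  });

  // Return the stream from our POST endpoint
  return new StreamingTextResponse(result.toAIStream());
}
```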
So if you are more curious, you can just jump into the documentation there. 00:19:59.680 |
First, our message is there, and we get a response from Claude. 00:20:12.680 |
So this is just the backend part. 00:20:26.680 |
The prompt is -- sorry, I will show it here -- in a file called 00:20:40.680 |
And it basically just says that it's a skilled Python developer that's capable of running 00:20:53.680 |
It will work for different use cases as well. 00:20:57.680 |
Now I just wanted to show something where, at the end, we get a chart that you can render 00:21:08.680 |
One more thing that I wanted to mention: in case you are not following along, 00:21:18.680 |
in the original repository -- the examples repo -- there are branches called workshop-1, workshop-2, 00:21:27.680 |
workshop-3, workshop-4, and workshop-5. 00:21:30.680 |
And those are like the stages that we are implementing here. 00:21:32.680 |
So in case you are wondering what is happening, it should be pretty much the same. 00:21:37.680 |
And if you just want to get to the final point, you can check out workshop-final. 00:22:09.680 |
So while we have the main backend code on the left, I also wanted to quickly show you how it looks on the frontend. 00:22:31.680 |
It's a basic Next.js 13 app. 00:22:38.680 |
So there is a layout file that just defines a basic layout of the whole app. 00:22:48.680 |
But there's one main file, called page, which is our main page. 00:22:54.680 |
And there, we are using one hook from the Vercel AI SDK, which is called useChat. 00:23:01.680 |
And this is how we are streaming the messages from the server and sending the messages to the server. 00:23:09.680 |
So the useChat here is already implemented; you don't need to write it. 00:23:14.680 |
It's hitting the endpoint /api/chat, which is what we implemented on the left. 00:23:22.680 |
And it takes care of all the streaming and sending the messages to the backend. 00:23:30.680 |
And we just then pass the messages to our chat component, which does those chat bubbles. 00:23:35.680 |
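To make the wiring concrete, here is a sketch of that page, assuming a Chat component that renders the bubbles (the component name and path follow the workshop scaffold, so they may differ):

```tsx
// app/page.tsx -- frontend scaffold (sketch)
'use client';

import { useChat } from 'ai/react';

import { Chat } from '@/components/chat'; // chat-bubbles component (path assumed)

export default function Home() {
  // useChat streams messages to and from our /api/chat endpoint
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: '/api/chat',
  });

  return (
    <form onSubmit={handleSubmit}>
      <Chat messages={messages} />
      <input value={input} onChange={handleInputChange} placeholder="Ask something..." />
    </form>
  );
}
```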
Now, the goal was just to get communication with Claude -- with Sonnet -- working. 00:23:56.680 |
So we'll do it through function calling, or tool usage. 00:23:59.680 |
If anyone needs it, there's a channel now for the workshop. 00:24:14.680 |
So, how many of you have managed to get to this stage? 00:24:39.680 |
So the name of the Slack channel is workshop secure code in your AI, right? 00:24:49.680 |
So the repository we are working with is this repository. 00:25:09.680 |
I just created it quickly now because the original cookbook repository wasn't working. 00:25:25.680 |
And, yeah, that's the original repo I wanted to use, but it's too big and it 00:25:35.680 |
took too much time for everyone to download. 00:25:38.680 |
But if you want to check out the final code for this first step, you can go to-- 00:25:47.680 |
And that should be pretty much the implementation of this step. 00:26:14.680 |
So now we are just able to chat with Sonnet. 00:26:20.680 |
There really isn't any code execution going on. 00:26:23.680 |
And the way we get to the code execution is, in this case, through tool usage. 00:26:31.680 |
The easiest thing out of the box is to create a tool called something like run Python, 00:26:41.680 |
and then inside the tool we will actually implement secure code execution inside 00:26:49.680 |
You could have other tools, like run JavaScript, or run a bash command, or create a file inside 00:26:57.680 |
Or you could do it in a completely different way. 00:27:00.680 |
We have seen users just ditching tool usage because it was a pain with the streaming, 00:27:06.680 |
and they didn't like how you manage the whole tools. 00:27:09.680 |
They just asked the model to -- sorry, to generate everything 00:27:17.680 |
So it really depends on your use case and what you are looking for. 00:27:20.680 |
But for this demo, a tool is the easiest one. 00:27:23.680 |
So now we are getting to part two. 00:27:29.680 |
If you want to eventually get the final code, you can go to workshop-2. 00:27:34.680 |
And the first thing that we want to do is define a new tool on our 00:27:48.680 |
For that, we have a helpful method called tool. 00:28:04.680 |
We will use the tool method from Vercel's AI SDK, 00:28:11.680 |
and just describe to the LLM what it should do. 00:28:17.680 |
In the JavaScript world, you would usually be using something like Zod for that. 00:28:27.680 |
In Python, you might know Instructor, or something like that. 00:28:32.680 |
So first we will provide a simple description: it runs Python code. 00:28:40.680 |
And then we want to define the parameters for the tool. 00:28:47.680 |
And we really have one main parameter, which is code. 00:28:53.680 |
And for that we will be using -- oh, sorry. 00:29:06.680 |
We are basically describing the schema to the LLM and forcing it to always 00:29:14.680 |
return this type of structure. 00:29:22.680 |
First we defined code, which should be a string. 00:29:26.680 |
And we can also pass a description, which is: the code to run. 00:29:32.680 |
This is the main thing we want to pass. 00:29:36.680 |
And just for -- oh, I'm missing a colon here. 00:29:41.680 |
Just for the better UI that we can then build, we will also pass two more 00:29:48.680 |
parameters -- or ask the LLM to give us two more parameters. 00:29:52.680 |
First is title, which is like a short title that describes the code. 00:30:04.680 |
And we just want to show it on the frontend. 00:30:11.680 |
The second is description, which is pretty much similar, but just a little bit longer -- 00:30:20.680 |
a description of what the code does. 00:30:26.680 |
So just by doing this, we are telling -- or Vercel's AI SDK is doing it for us -- 00:30:33.680 |
it's telling the LLM that it has a tool called run Python that it can call. 00:30:47.680 |
Or let me ask differently, is something not clear? 00:31:13.680 |
So that's the SDK that we are using -- it's literally called just AI SDK. 00:31:20.680 |
You can learn more about it if you go to sdk.vercel.ai/docs. 00:31:26.680 |
It's basically a very easy way, in JavaScript, to stream output from LLMs, and 00:31:43.680 |
especially if you are building Next.js apps. 00:31:52.680 |
So inside that tool that we are using, run Python, we need to implement one more function, 00:32:02.680 |
and this function is then automatically called when the LLM decides to call our run Python 00:32:13.680 |
And the parameters that we just defined here will get passed to the execute 00:32:24.680 |
So what we will get here -- mainly what we care about here, or 00:32:30.680 |
really the only thing we care about -- is the code. 00:32:33.680 |
And granted, nothing is really going to happen yet, but inside the body of 00:32:42.680 |
this function, we will be calling the code interpreter SDK that I showed you previously, 00:32:48.680 |
and the code interpreter sandbox, where we execute the code and get the results from 00:33:11.680 |
So there's one more new concept that we are going to introduce, which is the sandbox. 00:33:18.680 |
We have a predefined file called sandbox. 00:33:30.680 |
You will see there's an import that we are using, which is CodeInterpreter. 00:33:36.680 |
And we are importing CodeInterpreter from the code interpreter SDK, which is this SDK 00:33:48.680 |
So what this SDK does is run AI-generated code -- in this case, Python code -- 00:33:56.680 |
in a secure environment, and it will actually be running inside the sandbox, 00:34:05.680 |
Soon you will be able to run it on your own cloud. 00:34:11.680 |
We have two methods here that we need to implement. 00:34:16.680 |
First, we need to acquire the sandbox -- or create a sandbox. 00:34:24.680 |
And once we have a sandbox, we can then run the Python code inside the sandbox, 00:34:33.680 |
And there we have predefined methods on the code interpreter sandbox for that. 00:34:42.680 |
So once we implement this, I can go into more details about the security 00:34:49.680 |
and the problems with actually running it on anything other than a Linux machine. 00:34:55.680 |
But yeah, if you have any questions, feel free to ask after that. 00:35:22.680 |
Oh -- I see, I see the confusion. 00:35:27.680 |
We are not referring to the -- you mean this function. 00:35:34.680 |
So, on line 27, where my cursor is, it's a key of the object. 00:35:41.680 |
We will very soon be calling this run Python function that we are importing from the 00:35:54.680 |
So the question was about confusion around the naming. 00:35:55.680 |
Here we have a key -- or field -- called run Python. 00:36:10.680 |
And we are also importing a function called run Python, on line 15 here. 00:36:15.680 |
And line 27 is not actually referring to the function 00:36:35.680 |
The function is defined 00:36:42.680 |
inside the file sandbox that we have on the right. 00:36:47.680 |
And this is where we will be using the code interpreter SDK. 00:36:57.680 |
So first we need to actually get the sandbox. 00:37:05.680 |
And we can do that very simply, and that would 00:37:18.680 |
So we import CodeInterpreter here, on line seven, from the code interpreter SDK. 00:37:28.680 |
And this actually starts a full, small VM inside our cloud that 00:37:35.680 |
It currently takes about 800 to 900 milliseconds to start. 00:37:42.680 |
And you can do it many, many times at once. 00:37:46.680 |
So usually what we see is that every user inside your app -- every user session -- would be a separate 00:38:00.680 |
Actually, sorry, we need to call CodeInterpreter.create. 00:38:09.680 |
That was Python syntax, what I just wrote. 00:38:11.680 |
And this creates what I just said. 00:38:16.680 |
And there's one problem with this: now, every time I would 00:38:23.680 |
call this POST endpoint on my backend, and I would just 00:38:31.680 |
call the run Python function and then create a new sandbox, there would be no concept of context. 00:38:36.680 |
So it would create a new sandbox every time. 00:38:39.680 |
And you would just be running the new code in a completely new sandbox every time. 00:38:43.680 |
You wouldn't have any reference to past code snippets generated by the LLM. 00:38:49.680 |
So for that, you can actually have a sandbox running and reconnect to it a 00:38:57.680 |
little bit later, by calling reconnect. 00:39:02.680 |
So what we will change a little bit in our create-or-connect function is that we first check 00:39:07.680 |
if a sandbox exists for a given user ID -- which would be something like a user session 00:39:14.680 |
that you would implement once you have authentication. 00:39:17.680 |
If it exists, we connect to the sandbox. 00:39:24.680 |
If it doesn't exist, we create a new sandbox. 00:39:44.680 |
And once we have all the sandboxes, we can check if a sandbox has an attached user 00:39:55.680 |
So what you can do when you are creating a new sandbox is add metadata to 00:40:01.680 |
And if a sandbox with our user ID inside its metadata exists, we want to connect to it, 00:40:08.680 |
so we keep our old session and have the whole context there. 00:40:12.680 |
So first we want to find the sandbox, if it exists. 00:40:21.680 |
We call find on allSandboxes, which is a list of information about all our 00:40:31.680 |
And then we access the metadata. 00:40:39.680 |
And that metadata -- which is an optional object, or dictionary -- 00:40:45.680 |
should have a userID key equal to the parameter that we will be passing 00:40:57.680 |
And we want to await this whole call. 00:41:16.680 |
And once we have our sandbox info, we check that it actually exists. 00:41:26.680 |
And if it doesn't exist, that's the moment we want to create a new sandbox. 00:41:33.680 |
So here we will actually be returning a new sandbox. 00:41:43.680 |
And as I said, you can add metadata to the sandbox. 00:41:48.680 |
So we will just use this metadata to make a note that 00:42:03.680 |
But in your application, you would have many, many users. 00:42:09.680 |
So that's a case when the sandbox didn't exist and we need to create one. 00:42:14.680 |
And this line will create a new sandbox in the cloud. 00:42:18.680 |
And in case we have a sandbox with the user ID inside its metadata, we want to just connect 00:42:28.680 |
And for that, we have a function called reconnect on the CodeInterpreter object. 00:42:42.680 |
And that we can get from the sandbox we found on line 11. 00:42:56.680 |
So what this does is: at first, we check all our 00:43:05.680 |
running sandboxes, because we probably have a separate sandbox for every user, or every user session. 00:43:13.680 |
Then, once we have all the running sandboxes, we check the metadata of each sandbox and 00:43:21.680 |
check whether the sandbox has the same user ID as the one we are calling this for. 00:43:30.680 |
If such a sandbox doesn't exist, we will just create a new one, 00:43:40.680 |
and return this whole promise, and the object from the promise. 00:43:51.680 |
And if there is a sandbox for this user, we will just reconnect to the sandbox. 00:44:12.680 |
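Here is a sketch of that create-or-connect logic. The method names follow the e2b code-interpreter JS SDK of the time (list, create, reconnect) and may differ in newer versions:

```ts
// lib/sandbox.ts -- create a sandbox per user session, or reconnect to it
import { CodeInterpreter } from '@e2b/code-interpreter';

export async function createOrConnect(userID: string): Promise<CodeInterpreter> {
  // List all running sandboxes and find one whose metadata carries our user ID
  const allSandboxes = await CodeInterpreter.list();
  const sandboxInfo = allSandboxes.find((sbx) => sbx.metadata?.userID === userID);

  if (!sandboxInfo) {
    // No sandbox for this user yet: start a fresh VM and tag it with the user ID
    return CodeInterpreter.create({ metadata: { userID } });
  }

  // A sandbox for this user exists: reconnect to keep the session's context
  return CodeInterpreter.reconnect(sandboxInfo.sandboxID);
}
```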
Cool. So now we can finally get to implementing the run Python 00:44:31.280 |
method. So once we have our code -- oh, sorry, this doesn't need to be here. Once we have 00:44:39.180 |
our code for creating or connecting to our existing sandbox, we can just call it. So: create 00:44:49.840 |
or connect to the sandbox. Inside this run Python function, we pass our user ID. And once 00:45:02.800 |
we have the sandbox, we can finally run the code. So let's also add a console log to make 00:45:09.640 |
sure that we know what kind of code we are running. And I mentioned that inside the sandbox, 00:45:22.000 |
or the code interpreter sandbox in this case, we have a Jupyter server running. And we give 00:45:27.800 |
you programmatic access to the Jupyter server. So you can call notebook on a sandbox instance, 00:45:35.760 |
and call execCell. And by default, this method will execute Python code. It can 00:45:43.460 |
be AI-generated, human-written, predefined, whatever you want. And we just pass our code there, and 00:45:57.420 |
we get a result from it. We just need to await that, and return the result. We probably want to log it as well. 00:46:10.460 |
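A sketch of the whole runPython helper, using the notebook.execCell method described above (the exact field names on the returned execution object may vary by SDK version):

```ts
// lib/sandbox.ts (continued) -- run AI-generated Python in the sandbox's Jupyter server
export async function runPython(userID: string, code: string) {
  const sandbox = await createOrConnect(userID);

  console.log('Running code:', code);

  // Execute the code as a Jupyter cell (Python by default)
  const execution = await sandbox.notebook.execCell(code);
  console.log(execution);

  // Hand the parsed outputs back to the tool's execute function
  return {
    stdout: execution.logs.stdout,
    stderr: execution.logs.stderr,
    runtimeError: execution.error, // parsed traceback, if the code crashed
    cellResults: execution.results, // rich outputs: PNG charts, HTML, JSON, ...
  };
}
```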
Yes, yes. So it's really just a legit Jupyter notebook. 00:46:29.720 |
We actually have a PR where you can connect to that notebook that's hosted inside the sandbox. 00:46:38.060 |
It's probably at the moment the easiest way how to implement 00:46:42.580 |
like a consistent-- sorry, implement persistence 00:46:48.180 |
when you have multiple different code snippets where 00:46:53.540 |
one code snippet can reference a variable from a different 00:47:02.600 |
Usually, from what we have seen, that's what you just want to keep. 00:47:06.340 |
But it really depends always on your specific use case. 00:47:09.100 |
Anyone struggling with the run Python implementation? 00:47:20.280 |
So this execCell captures standard out. 00:47:24.420 |
But you also mentioned standard error from these. 00:47:29.200 |
The result is -- so the question is, if execCell returns -- 00:47:34.780 |
it seems like it just returns standard output, right? 00:47:38.380 |
So what it returns is our custom object, an execution. 00:47:42.720 |
And there, you have access to standard output, error output, 00:47:47.280 |
any runtime errors that are nicely parsed with a traceback 00:47:53.580 |
that you can feed back into the LLM to fix itself. 00:47:56.480 |
And also, any rich output, like PDFs, charts, PNG, JPEG files. 00:48:03.940 |
And we'll just be sending this to the frontend, where we can use it 00:48:08.460 |
So this run Python function -- it's the function that we are importing 00:48:20.540 |
on the left, in our endpoint, when we are calling the LLM, where we 00:48:27.960 |
And we can finally call the run Python function 00:48:38.220 |
So there's really nothing too special about it. 00:48:46.300 |
One thing that we need to do, though, is that we need to use that user ID. 00:48:50.680 |
So we also need to get user IDs from somewhere. 00:48:56.320 |
We will be just sending this from the front end. 00:49:00.560 |
So let's say in a real application, we probably have, like, 00:49:08.120 |
In this case, we'll just send it from the front end. 00:49:13.620 |
And that user ID, we pass through that run Python method. 00:49:20.120 |
And once we have results, we want to return the results from the execute function. 00:49:27.120 |
So when we call return here and we are returning an object, that's what we will 00:49:40.140 |
So that's, like, the result of the function call, our tool call. 00:49:47.700 |
So just to make sure I don't forget anything-- yeah. 00:49:56.200 |
So is that part clear -- how we are importing the run Python function to our POST-- 00:50:06.240 |
And we just want to call that with the AI-generated code, and then get the results 00:50:14.260 |
So now we are getting to the more interesting part. 00:50:17.100 |
And once we implement this, we will actually have it running inside our app. 00:50:29.840 |
And we can get different types of output from the results. 00:50:39.280 |
And the result object from our execCell, which we are just returning here, contains logs. 00:50:59.760 |
And it also contains two more things. One is any runtime error. 00:51:05.200 |
So that's an error that you would get when the AI generated wrong code, or code that doesn't 00:51:13.380 |
work. We can catch that, and we will return it in an error field, which is nicely structured, 00:51:27.480 |
And the last part is that we get something -- I have a little bit of unfortunate naming here -- called 00:51:36.840 |
cell results, essentially. So let's name it cellResults. 00:51:43.440 |
And those are evaluated notebook cells inside the sandbox. 00:51:53.900 |
And those can be charts, images, PDF files, HTML, JSON. 00:52:00.940 |
So it's usually the last line of a cell in a Jupyter notebook. 00:52:09.680 |
And also, any time you call something like display data -- like when you want to display a chart or anything 00:52:15.200 |
like that -- that will be in the cell results, which is an array of results. 00:52:22.720 |
And based on the type of a result, you can then parse it. 00:52:29.540 |
And all these four objects, we just return from the execute function, which will get 00:52:47.720 |
Can you speak to, like, Python version packages and any of that stuff? 00:52:52.540 |
So the question is about Python version packages and everything, probably, inside a sandbox and 00:53:00.860 |
For the Code Interpreter SDK-- so I mentioned that the Code Interpreter SDK is wrapping a sandbox. 00:53:09.500 |
And that sandbox can be completely predefined by you. 00:53:13.480 |
In our case, when we created the Code Interpreter SDK, we installed a bunch of packages that you 00:53:17.900 |
would probably usually use in a data science or AI data analysis use case -- 00:53:24.420 |
NumPy, Pandas, and Seaborn are installed there. 00:53:30.180 |
And so that's, like, something that gets you going out of the box. 00:53:33.660 |
And there's also Python 3.10 installed in the environment. 00:53:38.000 |
You can also create a completely custom sandbox if you want to. 00:53:41.960 |
If you just give us a Dockerfile -- and inside the Dockerfile, you can install whatever packages 00:53:48.800 |
you want -- you will just be using a slightly lower-level API that we have. 00:53:56.300 |
It's just that you need to set up everything yourself. 00:53:58.300 |
So usually you start with the Code Interpreter SDK, and then once you find out, hey, I don't 00:54:01.880 |
need these packages, or I need different ones, you can install whatever you want. 00:54:07.680 |
I was curious about your choice of the Jupyter framework. 00:54:12.800 |
You mentioned your reasons -- is there any other rationale? 00:54:20.800 |
So here we really went just with an optimized -- sorry, the question was how we went about choosing 00:54:34.800 |
the Jupyter framework inside the sandbox, and if there's any more to it than what I mentioned. 00:54:41.640 |
So the main decision was how we should set up the sandbox so that, out of the box, it works for you when you plug it into an LLM, and you don't get many errors. 00:54:54.480 |
Because that's what users come to us with, especially if they are a little less experienced with LLMs, and they are wondering why it's not working. 00:55:03.480 |
So we were really optimizing for what AI-generated code will most likely look like. 00:55:10.320 |
And the answer is: it's going to be Python code, and it's going to be Jupyter notebook code, with preinstalled packages. 00:55:18.320 |
It's essentially what GPT-4 would expect -- 00:55:22.320 |
and that means what their code interpreter tool would have installed. 00:55:30.160 |
Are you able to expose the Jupyter environment as a notebook to use that? 00:55:34.880 |
So the question is whether you are able to expose the Jupyter notebook inside a sandbox. 00:55:44.720 |
That's not documented anywhere, but you actually can set up exactly this, because we have a few customers that want to implement something like this. 00:56:02.560 |
The answer is just text us on Discord or just send us an email and we will give you a guide. 00:56:08.560 |
We have a few experimental PRs like that -- for example, we have a sandbox with a fully working graphical interface, where you can start apps, even games. So everything is Linux, but you can pretty much start anything you want there. 00:56:23.400 |
A lot of people have been using it for evals, when you want the LLM to control something more graphical. 00:56:32.400 |
Or if you want to build something that's more like a tandem, human-in-the-loop setup, you can do that as well. 00:56:37.400 |
So, just to sum it up: we implemented the POST request and the execute -- sorry, the run Python tool. 00:56:50.240 |
And finally, now, we should be able to run the code from the frontend. 00:56:54.640 |
So one last thing we need to do is go to our page component, page.tsx. 00:57:05.200 |
And there, we need to create a dummy user ID. 00:57:18.040 |
This is because we are now passing it from the frontend -- in your real app, you would actually have a user ID. 00:57:24.040 |
And in the useChat hook that's coming from the AI SDK, we just pass the user ID to the body. 00:57:36.880 |
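In code, that's a small change to the hook call (a sketch; the dummy ID is a stand-in for a real session ID):

```tsx
// app/page.tsx -- send the user ID along with every chat request
const userID = 'dummy-user-id'; // in a real app, this comes from your auth/session

const { messages, input, handleInputChange, handleSubmit } = useChat({
  api: '/api/chat',
  body: { userID }, // available in the POST handler via await req.json()
});
```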
And let's add a console log here for the messages, so we can check if everything is working correctly. 00:58:00.720 |
I open my Chrome DevTools, and I ask something simple, like: print hello world. 00:58:24.400 |
And if I check the messages -- the reason there are so many messages is because it's constantly streaming, so it's updating the messages object. 00:58:32.400 |
The first message is my prompt, print hello world, 00:58:43.520 |
and the second message is from the assistant, which has the answer, 00:58:49.240 |
and there's also something interesting called toolInvocations, which is an array of all the tool invocations that the assistant, or LLM, decided to do. 00:58:59.000 |
And the only one there is the run Python tool that we defined. 00:59:05.720 |
It has two important fields, args and result. 00:59:11.480 |
So args -- that's what we described when we were defining that run Python tool. 00:59:23.560 |
And the result -- that's what we returned from that execute method inside the tool definition. 00:59:31.640 |
And there we have cellResults as an empty array, because cell results don't capture any standard output. 00:59:38.680 |
It just evaluates the Python code, and there was nothing to evaluate -- a print call evaluates to nothing. 00:59:45.640 |
There's no error output, but in the standard output, we have hello world. 00:59:54.440 |
So now let's ask it for something more complex, like: calculate pi using the Monte Carlo method and visualize it. 01:00:13.560 |
Now it's writing the code, which unfortunately isn't streamed at the moment, because tool streaming isn't implemented in the AI SDK yet. 01:00:23.880 |
But if we check it out, we have a tool invocation here, which is our run Python. 01:00:34.520 |
It's doing 1,000 or 10,000 -- no, 100,000 iterations. 01:00:42.280 |
So it will most likely take a little bit of time to finish. 01:00:47.080 |
And, yeah, we got two new messages, because we got a result from the run Python call. 01:00:55.880 |
And if we check it again, there's a tool invocation. 01:01:04.840 |
But the interesting part is that in cellResults, we now have 01:01:09.880 |
one result, which has png and text fields. 01:01:20.200 |
And that's the visualization of the Monte Carlo simulation. 01:01:24.440 |
And the text is just like a human-friendly description of the chart. 01:01:28.760 |
And what we will work on in the next step is displaying this on the frontend, on the right side, as an artifact. 01:01:45.800 |
If you check out the cellResults, you see the PNG. 01:01:53.240 |
The reason we don't see it anywhere here inside the app is because we haven't implemented the frontend yet. 01:02:01.400 |
But we already have results from the code interpreter sandbox in our messages variable. 01:02:10.840 |
So the question is, where did I get the messages from? 01:02:18.200 |
This line, the useChat line, is returning all the messages from the LLM, including tool calls. 01:02:43.720 |
So I would say we are pretty near the end now. 01:02:52.200 |
And for that, we need to parse the tool invocation from the message. 01:02:59.560 |
We need to get the latest message with a tool invocation. 01:03:02.680 |
So, for the sake of simplicity, I will just do something a little bit naive here. 01:03:09.480 |
I will just care about the latest message we get from the LLM, 01:03:15.480 |
and from that latest message, I will just parse the tool invocations that are there. 01:03:22.680 |
So first we get the latest message that has a tool invocation. 01:03:32.360 |
This is a little bit annoying, but first we need to reverse the array 01:03:43.960 |
in JavaScript, and then we search it for a message that has a toolInvocations field, 01:03:52.840 |
and where that array isn't empty. 01:04:06.360 |
This latest message with a tool invocation might or might not exist. 01:04:12.760 |
It will not exist in the first few seconds after we give the LLM a prompt 01:04:24.360 |
And now we just want to extract the tool invocation field from our message 01:04:39.000 |
with the tool invocation -- and again, it might or might not exist. 01:04:44.360 |
So the latest message with a tool invocation can be a message or undefined, and the tool invocation 01:04:56.760 |
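As a sketch, the naive parsing looks like this (spreading into a copy before reversing, so we don't mutate the messages array):

```tsx
// Newest message that carries a non-empty toolInvocations array, if any
const latestMessageWithToolInvocation = [...messages]
  .reverse()
  .find((m) => m.toolInvocations && m.toolInvocations.length > 0);

// The invocation itself; both values can be undefined early in a conversation
const latestToolInvocation = latestMessageWithToolInvocation?.toolInvocations?.[0];
```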
So far, we just added these two calls. 01:05:13.160 |
And we are working with messages that were sent from the back end. 01:05:19.800 |
And that's the part that we implemented, uh, up until now. 01:05:24.280 |
The reason we are doing this is that we will be sending the tool invocation to the 01:05:32.040 |
component that I will add in a sec. 01:05:36.120 |
And that component will be showing the chart, and any output from running the 01:06:06.920 |
Once we have a tool invocation, we want to display it. 01:06:13.240 |
And for that, I have a premade component called SideView, which exists in components 01:06:25.320 |
So we can just import it from there. 01:06:34.280 |
And we want to render it right next to the chat component. 01:06:46.520 |
So, this SideView takes the tool invocation. 01:06:49.720 |
That's why we did what we did with those messages here. 01:06:56.680 |
It's a component that already exists inside the project. 01:07:02.200 |
And I'm now rendering the SideView next to the chat component that we already have there. 01:07:22.440 |
And if you check out the app, when we ask 01:07:27.480 |
Claude to print something -- oops, yeah, it's working now. 01:07:41.720 |
It should display the SideView, but for some reason it's not showing now. 01:07:48.280 |
That might be because I forgot to implement something inside the SideView. 01:07:51.800 |
Do you need more time to add the SideView to this file? 01:08:06.520 |
The SideView has one thing that I forgot: 01:08:14.120 |
it also expects a prop called data. 01:08:19.000 |
And what that field does is give us something that shows that the tool -- 01:08:26.440 |
that the LLM is using a tool, and we are waiting for it. 01:08:29.640 |
And we want to know about it on the frontend. 01:08:31.400 |
So we want to send some additional data with the LLM's response from our server. 01:08:41.160 |
Um, and unless the, the, the, the idea is just simple. 01:08:55.560 |
And, uh, uh, I want to show you how you can, uh, know about what is happening with the tool 01:09:03.800 |
execution more than just like waiting for a new message. 01:09:06.520 |
So for that, uh, if we go to back to route file. 01:09:13.240 |
And there we have imported, uh, object called stream data. 01:09:19.720 |
Uh, stream data is a helper object from the AI SDK. 01:09:24.760 |
And that will help us to stream any, uh, arbitrary data back to the front end. 01:09:34.920 |
So here, below the parsing of the request, 01:09:43.960 |
we will just create a data object from StreamData. 01:10:01.400 |
And we can append objects into that stream. 01:10:04.440 |
So what we want to do is, when execute is called -- 01:10:08.760 |
when the function, the tool, gets called -- 01:10:13.880 |
we want to first say: hey, the run Python tool is running. 01:10:29.080 |
And once we get what we need, we also want to update the stream with 01:10:49.720 |
That's just how the SideView component is implemented -- 01:10:58.760 |
but the idea is that now you can use StreamData to send arbitrary data from the 01:11:05.320 |
backend to the frontend and stream it, which is the important part. 01:11:09.080 |
What's missing now is that we need to include that StreamData object in the response from our 01:11:19.720 |
And for that, we need to do just two things. 01:11:25.160 |
First, we need to close that StreamData stream, 01:11:29.640 |
because it's a stream, so it's currently open from when we created it. 01:11:32.520 |
And we also need to change the return here to include the data that we stream. 01:11:42.520 |
So we kind of need to take the two streams and combine them into a single 01:11:50.920 |
And for that, what we are going to do is: first, the result from the LLM call -- 01:12:00.840 |
this result from streamText -- can be converted to a stream. 01:12:16.040 |
And there, we can do one thing: we can pass it a callback called 01:12:25.800 |
onFinal, which runs once everything is done, and there we want to close our data stream. 01:12:42.600 |
Okay. So what this does is convert the result from the LLM that we 01:12:52.200 |
get from streamText to a stream. And once the LLM is finished 01:12:59.640 |
generating all the responses -- that's what this onFinal callback tells us -- 01:13:05.400 |
we want to close the data stream that we created a few moments ago. 01:13:11.080 |
And the last missing part is that now we want to return both the stream and the data. 01:13:17.960 |
So for that, there's an object called StreamingTextResponse from the AI SDK. It's already imported. 01:13:34.120 |
And when we call it, we can pass it three parameters. First is the 01:13:40.920 |
stream, which is the response from the AI. Second is the init data -- we don't want any. 01:13:47.320 |
And the last is any arbitrary data that we want to send along with the stream. 01:14:17.720 |
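Putting those pieces together, the endpoint ends up looking roughly like this. It's a sketch using the AI SDK v3-era StreamData and StreamingTextResponse APIs; the appended payload shape is just an example of what a frontend component might consume.

```ts
// app/api/chat/route.ts -- combine the LLM stream with a custom data stream
import { anthropic } from '@ai-sdk/anthropic';
import { StreamData, StreamingTextResponse, streamText } from 'ai';

export async function POST(req: Request) {
  const { messages, userID } = await req.json();

  // Side channel for arbitrary data (e.g. tool status) streamed to the frontend
  const data = new StreamData();

  const result = await streamText({
    model: anthropic('claude-3-5-sonnet-20240620'),
    messages,
    // tools: { runPython: ... } as defined earlier; inside its execute you can
    // call data.append({ tool: 'runPython', state: 'running' }) before the
    // sandbox call, and append a corresponding 'done' entry after it.
  });

  const stream = result.toAIStream({
    // Fires once the LLM has finished generating: close our data stream
    onFinal() {
      data.close();
    },
  });

  // Arguments: the LLM stream, init options (none), and the extra data stream
  return new StreamingTextResponse(stream, {}, data);
}
```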
Which checkpoint are we on? This should be stage workshop-4. 01:14:36.840 |
So actually, on page.tsx, what we need to do is get the data object from the 01:14:58.520 |
useChat hook. So that's what we are going to add now. 01:15:18.840 |
Sorry? The question is whether we have modified the SideView component. We haven't yet. 01:15:28.760 |
We are just rendering the SideView component here. But it needs additional information 01:15:35.480 |
about whether the AI-generated code is still running. And that's what we are adding now. 01:15:43.480 |
And once we add that, it should display. So one last part for this data 01:15:55.640 |
stream is that we need to extract the data that we are sending from the backend, 01:16:03.320 |
and get it on the frontend. So useChat, again from the AI SDK, is pretty handy here, because 01:16:12.120 |
it just returns another field called data, which is exactly what we are sending. 01:16:17.720 |
We can even print it. And all we need to do, once we get the data variable, 01:16:38.280 |
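On the frontend, that's one more field destructured from the same hook (a sketch; the SideView props follow the workshop scaffold):

```tsx
// app/page.tsx -- the streamed StreamData entries arrive in `data`
const { messages, data } = useChat({ api: '/api/chat', body: { userID } });

console.log(data); // e.g. [{ tool: 'runPython', state: 'running' }, ...]

// ...and we pass it down to the side view together with the tool invocation:
// <SideView data={data} toolInvocation={latestToolInvocation} />
```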
And now if you go to the app again and ask it anything that would result in generating code -- 01:16:52.840 |
Sonnet is a little bit slow -- 01:17:00.600 |
And now we have the SideView rendered. And it's currently just showing code. 01:17:08.440 |
But what we will add, as the last part of this workshop, is the preview. And that will be 01:17:14.280 |
rendering any charts returned from the code interpreter, and any standard output and 01:17:30.040 |
The question is what's special about Anthropic's Claude model here. 01:17:46.680 |
You could use any model you want. It's just that the new Sonnet model is really good at code 01:17:54.360 |
generation. And so it's usually capable of one-shotting relatively, I would say, 01:18:03.080 |
advanced examples. Not full projects, but it can give you interesting results. 01:18:21.720 |
You mentioned that all of your runtimes are Linux at the moment. And I wondered -- 01:18:26.120 |
I think there were some security reasons that that was the case, and I wonder if you could 01:18:31.800 |
Yeah. The question is that all the sandbox runtimes are Linux, and what are the reasons 01:18:40.040 |
for that -- probably whether those are security reasons. So yeah, that's correct: all those runtimes 01:18:45.720 |
are Linux. And there are two parts to the question and the answer. So, the sandbox 01:18:54.600 |
is a regular VM. It's not a container. And that VM is running on our server, on a host 01:19:02.760 |
machine. And so both the host machine's and the sandbox's operating systems are Linux. The reason for that 01:19:10.120 |
is that we are using something called Firecracker, which is an open source VMM from AWS. And they 01:19:17.560 |
are using it for running AWS Lambda. So it's really battle-tested for running untrusted code. 01:19:25.080 |
And Firecracker is very specific, in that it requires Linux with a specific kernel, for security 01:19:34.440 |
reasons. And inside Firecracker, inside the VM, you really can't run anything else at the moment 01:19:40.680 |
than Linux. Yeah, sir, go ahead. You mentioned the dependencies of the default 01:19:57.800 |
sandbox, right? So how hard would it be to read the imports and customize the sandbox? 01:20:06.200 |
It's basically just your custom Dockerfile. So the question is how hard it would be to 01:20:11.880 |
customize the sandbox with custom dependencies installed. You would just give 01:20:22.520 |
us your Dockerfile, and inside the Dockerfile you would have something like pip install 01:20:27.480 |
whatever you want. Or you can even have npm, if you want to build a little bit more on 01:20:33.480 |
top. We have users running -- sorry -- Fortran. So you can really run anything you 01:20:42.040 |
want; it depends on how much customizability you want. I can show it after the call -- 01:20:49.720 |
sorry, after the workshop. If you go to our documentation, 01:20:57.080 |
e2b.dev/docs, we have a guide there for customization, specifically for the code interpreter SDK. 01:21:09.240 |
So the question is whether we are using just Firecracker, or Firecracker and containers 01:21:25.080 |
together. I think the confusion comes from the Dockerfile I mentioned. Yeah. So we are using 01:21:32.680 |
just Firecracker. The containers and Dockerfile -- we are using those only as a simple way to 01:21:42.920 |
define your Firecracker environment, essentially. So what we do in the background, once you give us 01:21:49.320 |
your Dockerfile to customize the sandbox, is start it as a container, extract the filesystem, 01:21:57.400 |
and then convert that to a Firecracker VM. And then, when you start your sandbox, 01:22:04.360 |
you are essentially starting your custom container, programmatically. 01:22:10.440 |
What's the starting time with customization? 01:22:14.760 |
What's the timing? 01:22:19.240 |
Oh, yeah. So, how much time does it take to start the sandbox with customization? 01:22:24.040 |
It takes the same time as a pre-made sandbox. So, at the moment, it should be around 900 01:22:31.240 |
milliseconds every time you call that sandbox.create that I wrote there in the SDK. 01:22:38.360 |
And soon it should be around 400 milliseconds. 01:22:42.680 |
Okay. Now we are getting to the last part of this workshop. And in case you want to just 01:22:53.800 |
check out the final code, git checkout the last stage: workshop-5. 01:22:58.280 |
And we will go into the SideView. And what we want to do is implement this preview, 01:23:04.840 |
which is essentially the result of the code execution that we get from the 01:23:12.920 |
LLM. Let me close a few files first. 01:23:20.840 |
So we really care about the SideView component -- the only thing we care about for now. 01:23:28.280 |
And if we open the SideView component, it's scaffolded and predefined. The thing that's 01:23:40.440 |
missing there is that we have an ArtifactView component that also 01:23:49.000 |
already exists; it's currently commented out. So we need to uncomment that inside the SideView. 01:23:55.080 |
And let me put that on the left. And inside the 01:24:06.680 |
ArtifactView, which is on the right now -- if you go to the ArtifactView function, which is the component, 01:24:17.480 |
we have a bunch of to-dos. So let's focus now on one main to-do item, and that's: 01:24:25.880 |
render image. So let's just optimize for the use case where we want the LLM to render a chart, or 01:24:32.120 |
any image that we want. And we want to display that image here. 01:24:39.320 |
So what the ArtifactView gets through props propagation is the result of our code execution. 01:24:49.000 |
That's the result here -- that's what we send from the backend. That's this file. 01:24:59.080 |
And we want to access the cellResults, because this object has 01:25:06.840 |
all the PNG files, JPEG files, HTML, PDF, everything around that. And you can see that we are 01:25:17.880 |
already doing the parsing here, when we are extracting the cellResults, standard output, error output, 01:25:24.520 |
and runtime error from the result object. And if there are any cell results, we just expect 01:25:33.480 |
there will be a PNG. In a real app, you would probably need to check that a little bit more 01:25:39.000 |
thoroughly. And if we have a PNG file, we'll just render the image. And that's basically it. So 01:25:49.400 |
I will just add a little bit more styling here. We will use the Next.js Image component. 01:25:57.960 |
And because the PNG image is in base64, that's exactly what we are going to 01:26:10.040 |
use here. And the width and the height will be 600 by 400 -- it's really arbitrary for now. And 01:26:34.040 |
we just want to put this image in a div container. 01:26:58.760 |
And render our logs there. The logs output is something that's already prepared here in this file. 01:27:05.400 |
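A sketch of that rendering, assuming the PNG arrives as a base64 string on the first cell result (note that next/image may need the unoptimized prop to accept data: URLs):

```tsx
// components/artifact-view.tsx (excerpt, sketch) -- render the first PNG result
import Image from 'next/image';

function RenderedImage({ png }: { png: string }) {
  return (
    <div>
      <Image
        src={`data:image/png;base64,${png}`} // cell results carry base64 PNG data
        alt="Chart generated by the code interpreter"
        width={600} // arbitrary for now, as noted above
        height={400}
        unoptimized // data URLs bypass the Next.js image optimizer
      />
    </div>
  );
}
```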
So once we add this, we should see an image inside our app, 01:27:18.280 |
if we ask for something like that Monte Carlo simulation, or just: generate or create 01:27:26.440 |
Here it is. So this came straight from the sandbox -- from the 01:27:41.000 |
Jupyter server running inside the sandbox. The next step is one we won't be able to cover 01:27:49.240 |
in this workshop, but it would be how to make these charts interactive -- 01:27:54.120 |
because now it's just an image generated by the sandbox. 01:27:59.000 |
One of the ways to do that -- I'm going to go back to the code -- one of the ways to do that would be to 01:28:05.720 |
use the sandbox not for generating images, but to operate on top of CSV 01:28:13.960 |
files, on your dataset. So what you can do is upload files to the sandbox, or you can connect 01:28:21.320 |
a cloud storage to a sandbox where you already have a bunch of files. And you can 01:28:28.040 |
just let the AI, the LLM, know what these files look like -- like CSV files, what columns are there -- 01:28:36.360 |
and just ask it questions about this dataset. And Sonnet 3.5 is already very capable 01:28:44.120 |
of this. It will start generating code that's capable of extracting data from your CSV file; 01:28:53.000 |
you run that code inside the sandbox on top of your CSV file, send the data to the frontend, and on 01:28:58.840 |
the frontend you generate, or display, this data with something like Chart.js, Plotly, any library, 01:29:11.480 |
Does the code interpreter have table previews -- can you display previews? 01:29:16.040 |
So the question is if we can display table previews. Yes. 01:29:21.480 |
So anything you could display inside a Jupyter notebook, besides interactive widgets, 01:29:26.840 |
you will get in this result object, or cellResults object. 01:29:33.720 |
And it can be data frames, for example. 01:29:37.240 |
The question is whether the table preview 01:29:50.280 |
is an image, or just a description of the table. The answer is: 01:29:56.680 |
it depends. It depends on what you tell the LLM to do. So if you ask the LLM to use the right 01:30:04.760 |
libraries -- ones that, when evaluated, return, for example, HTML -- you will get HTML on the frontend, and 01:30:11.880 |
you can just display it. So yeah, it depends on what the AI-generated code looks like, 01:30:18.120 |
and what kind of libraries and dependencies you are using inside the code interpreter. 01:30:23.960 |
Yeah, question here. Do you want to say anything more about the security architecture? 01:30:30.280 |
I mean, you've given us -- I guess it's really simple, it's just executing in a sandbox -- 01:30:36.680 |
but do you want to say more about it? Yeah, so the question is 01:30:40.040 |
if I can go a little bit deeper into the security of the sandbox. So, 01:30:47.240 |
Firecracker really does the heavy lifting here. The way it works, and the way our infrastructure 01:30:54.760 |
works, is that when you start a sandbox, we have a small VM prepared for you, and that VM is a 01:31:07.560 |
Firecracker VM. The VM is designed in a way that when you try to get outside of the VM, 01:31:14.520 |
it just restarts. So we are actually giving you full root access; you can run any command you want. 01:31:20.840 |
And unless there's a network hole somewhere where you could just connect to a 01:31:28.600 |
third-party service, when you try something like an exit or a system reset or anything like that, 01:31:34.600 |
the machine will just restart, and we will just start a new sandbox for you. On top of that, 01:31:40.760 |
Firecracker is wrapped inside the jailer, which removes the ability to run certain syscalls, 01:31:52.040 |
which usually you shouldn't really care about; everything just works as you would expect. 01:31:58.760 |
And the way Firecracker works, for the high-level security, is that it's using 01:32:06.360 |
the Linux kernel's KVM for virtualization. So there's a little bit more overhead 01:32:13.960 |
than containers, because it's a full VM, not just a process. And so every user session 01:32:21.000 |
is a separate VM that is isolated and really doesn't know about other 01:32:29.160 |
VMs or sandboxes inside the network. So it's an out-of-the-box multi-tenant environment, 01:32:35.640 |
if you look at each 01:32:41.640 |
sandbox as one of the tenants. 01:32:46.040 |
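Since every session gets its own microVM, the isolation is easy to see end to end. Here is a minimal sketch, assuming the @e2b/code-interpreter JS SDK; `Sandbox.create()` and `runCode()` follow recent SDK versions, so check the docs for yours.

```ts
import { Sandbox } from '@e2b/code-interpreter'

// One sandbox per user session: each is a separate Firecracker microVM.
const [alice, bob] = await Promise.all([Sandbox.create(), Sandbox.create()])

// Write a file in Alice's VM. Full root access inside the VM is fine,
// because the blast radius is that single VM.
await alice.runCode('open("/home/user/secret.txt", "w").write("alice only")')

// Bob's VM has a completely separate filesystem and network identity.
const probe = await bob.runCode(
  'import os; print(os.path.exists("/home/user/secret.txt"))'
)
console.log(probe.logs.stdout) // ["False\n"] -- the VMs share nothing
```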
Yeah, question. Do you save the data? Do you log it in your system? 01:32:52.440 |
The question is if we save any data that you send to the code interpreter 01:32:58.760 |
sandbox. The answer is no. When you start the sandbox 01:33:07.960 |
and then kill it at some point, or it just closes by itself, everything is destroyed. 01:33:15.400 |
What we log is only that you did certain operations. For example, that you 01:33:22.280 |
created a file, but we don't save the content of the file. We just want to know what 01:33:28.280 |
is happening inside the VM. And what you can do for persistence is 01:33:36.680 |
connect your cloud storage to the VM and save the data there. 01:33:44.520 |
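As code, that ephemeral lifecycle looks roughly like this sketch; `kill()` is the current JS SDK's name for shutting a sandbox down, and older versions name it differently.

```ts
const sandbox = await Sandbox.create()
await sandbox.files.write('/home/user/scratch.txt', 'temporary work')

// ...chat turns, code execution, and so on...

await sandbox.kill() // the VM and everything in it, scratch.txt included, is destroyed
// For persistence, mount cloud storage (see the storage question later)
// or download the files you care about before killing the sandbox.
```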
Can you, sorry, repeat that a little bit louder? 01:33:57.800 |
Yeah, so the question is about the Dockerfiles I mentioned for customizing the sandboxes, 01:34:10.200 |
the VMs. Actually, I can show it to you here. If you go to 01:34:19.880 |
e2b.dev/docs, we have a way there to customize the code interpreter. 01:34:30.120 |
And the way we do it is through a Dockerfile. It's a regular Dockerfile where you can 01:34:38.440 |
use most of the usual Dockerfile features and keywords. It just needs to be Ubuntu-based. 01:34:44.360 |
Basically, what the Dockerfile is good for is that you can define the file 01:34:51.640 |
system of the VM, the environment of the VM. You can install any packages, 01:34:58.280 |
you can define any environment variables, and you can pre-save any files. We have 01:35:04.280 |
users who save a scaffolded Next.js app, so the LLM can render generated components into it, 01:35:13.240 |
and the Next.js app is just running inside the VM. So the 01:35:19.720 |
Dockerfile really is just a means of letting us know how you want the environment to look, 01:35:28.040 |
and then we convert the Dockerfile, actually that container, to a VM. Does that answer the question? 01:35:36.120 |
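As a concrete illustration, a custom template might look like the sketch below. The `e2b.Dockerfile` name and the `e2bdev/code-interpreter` base image follow the E2B docs at the time of writing; the extra packages and copied files are just placeholders.

```dockerfile
# e2b.Dockerfile -- sketch of a custom sandbox template
FROM e2bdev/code-interpreter:latest

# Install any system or Python packages the generated code may need.
RUN apt-get update && apt-get install -y ffmpeg
RUN pip install plotly duckdb

# Pre-save files, e.g. a default dataset or a scaffolded app.
COPY ./datasets /home/user/datasets
```

You would then build it with the E2B CLI (something like `e2b template build`) and start it from the SDK with the resulting template ID, for example `Sandbox.create('your-template-id')`; again, check the docs for the exact commands.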
Here's a question. Is the conversion from the Dockerfile to Firecracker a standard 01:35:43.120 |
process, or something custom that you built? 01:36:02.120 |
The question is, what does that conversion from a container to a sandbox look like, right, 01:36:11.120 |
that file system? It's actually much simpler than it sounds. What we really do is just, 01:36:17.120 |
you know how you have different types of file systems on your computer? I don't 01:36:26.120 |
think that many people touch it nowadays, but back in the day, when you were setting up your machine, 01:36:31.120 |
you could pick what your file system would look like. That's essentially what we do. We 01:36:38.120 |
run the container, and on our infrastructure there's literally a shell command that copies 01:36:46.120 |
the whole file system out of the container into an outside file. And we just convert that file 01:36:52.120 |
to the correct format, a format that's supported as a root file system for Firecracker. 01:36:59.120 |
So it really is just a bunch of files that we extract from the container. And it's very 01:37:08.120 |
similar to how the container gets created. So theoretically, if you wanted to, you know, 01:37:14.120 |
images are just a bunch of tar files and layers. You could extract those layers from the image, and you 01:37:21.120 |
most likely wouldn't even need to run the container; you could just take the image and extract the file system directly. 01:37:31.120 |
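The general shape of that conversion, as a generic sketch rather than E2B's actual script, is: export the container's filesystem and pack it into an ext4 image that Firecracker accepts as a root drive. The image and directory names here are placeholders.

```sh
# Export the container's filesystem to a plain directory.
mkdir -p ./rootfs
docker create --name rootfs-src my-custom-template
docker export rootfs-src | tar -x -C ./rootfs

# Pack it into an ext4 image Firecracker can boot from.
truncate -s 2G rootfs.ext4
mkfs.ext4 rootfs.ext4
sudo mount -o loop rootfs.ext4 /mnt
sudo cp -a ./rootfs/. /mnt
sudo umount /mnt
# rootfs.ext4 is now usable as the microVM's root filesystem.
```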
Yeah, it's in the code base. We are fully open source. We have a bunch of repositories here. There's a 01:37:42.120 |
repository called infra, which is a little bit less known. One disclosure is that right now it's not 01:37:50.120 |
very friendly to get running on your own; that's the next thing we are working on. And here, if you take a 01:37:58.120 |
look, especially into the scripts, you will find a script for the conversion of the Dockerfile to the 01:38:07.120 |
sandbox, the VM file system. There was one question first. 01:38:13.120 |
Yeah, how do you end up registering your own custom modules inside the sandbox? 01:38:19.120 |
Yeah, so the question is how do we end up registering custom modules inside the sandbox? 01:38:25.120 |
Very similarly to what I've talked about a few times: you can create a 01:38:33.120 |
Dockerfile that describes what the sandbox will look like. We convert the Dockerfile to 01:38:39.120 |
your custom sandbox, and then you can start that sandbox using the SDKs. Because the 01:38:49.120 |
Dockerfile build happens with your own Docker instance, we are not doing it on our cloud, so you can use 01:38:56.120 |
private packages or private images or whatever you really want. Yes, question. 01:39:02.120 |
So I was looking at this repo earlier, and it says, I think at the bottom, that GCP is kind of the only 01:39:08.120 |
platform that's supported currently. Do you have plans to support others in the future? 01:39:12.120 |
The question is if we have plans for supporting clouds other than GCP. Yes, a strong yes. AWS is actually 01:39:19.120 |
where we are migrating even our own cloud version, so we want to support AWS first. Then we will have a 01:39:29.120 |
proper tutorial for GCP. And the next one is supporting any Linux machine, so custom on-prem clouds. 01:39:39.120 |
And just as a follow-up. So is your business model eventually, you know, for firms that would otherwise try to build this 01:39:46.120 |
capability themselves today, since there aren't many options for this, to sell this as open source 01:39:51.120 |
but support it as well? Is that where the revenue would come from? 01:39:56.120 |
So the question is about the business model. Yes, two parts. It's pretty similar to pretty much any open-core business 01:40:08.120 |
model. Most features are open source; you can self-host it and manage it on your own, or you can use our cloud. 01:40:18.120 |
There will be a few features more targeted at enterprises. Especially around things like, we got questions 01:40:26.120 |
about how you load terabytes of data into a sandbox. That's not something that you encounter during your 01:40:34.120 |
weekend hacking. And then we are also planning a lot of work on 01:40:40.120 |
observability. Because as the LLMs are getting better, you will want to know what is happening inside the 01:40:47.120 |
sandbox. It will not be just running a simple Python script; it will be more like a workspace, 01:40:53.120 |
like a permanent workspace for your agent and your AI app. And you kind of want to know what is happening 01:41:01.120 |
inside the sandbox. Are any files getting created? Are any network requests happening? You want to have 01:41:09.120 |
programmatic access to this information, and you want to be able to stop the sandbox before something 01:41:16.120 |
happens if you don't like it. So those are some of the things that we are working on next, and 01:41:20.120 |
some of them might stay private or behind the license. 01:41:29.120 |
This summer. This summer. Yes, question there. 01:41:36.120 |
So the question is if we have any GPU sandboxes, correct? 01:41:48.120 |
Not at the moment. That's very intentional, because we don't want to go into 01:41:56.120 |
a whole GPU provider business. And even now, usually, if you are serious 01:42:04.120 |
about GPU work, you want to offload it to someone who does it really, really well, and there are 01:42:09.120 |
a lot of players in the space. So, yeah, the short answer is no. The long answer is that it makes sense 01:42:18.120 |
for the future, but it's not something that we are focused on right now. Yeah. Question there. 01:42:24.120 |
Yeah. Can you say a bit more about connecting these cloud data stores to the sandbox? 01:42:35.120 |
You mentioned that you can connect cloud data stores. 01:42:39.120 |
Oh, yeah. So the question is how you can connect something like S3 or Google Cloud Storage. 01:42:44.120 |
What kinds of stores can you connect? What kinds do we recommend? 01:42:48.120 |
Yeah. I will show you; we have a guide for this in our documentation. Currently, 01:42:57.120 |
you can connect anything that looks like S3 or has an S3-compatible API. Mainly, that's Google Cloud Storage, 01:43:07.120 |
Amazon S3, obviously, and Cloudflare's R2. 01:43:16.120 |
Right now it's a little bit more complicated to set up. In a future version of the SDK, it will be 01:43:21.120 |
just a simple call where you say, hey, I want to mount this endpoint. What we are doing 01:43:27.120 |
under the hood is using something called FUSE for that, 01:43:33.120 |
which allows you to connect your cloud storage to a file system. It then looks 01:43:41.120 |
like it's part of your file system, but actually, every time you are reading from 01:43:46.120 |
or writing to a file inside the cloud storage, you are making a network request. The nice 01:43:53.120 |
thing about that, especially if you are more of an enterprise customer and you 01:44:00.120 |
care about your users' data, is that the data doesn't really leave your storage. 01:44:12.120 |
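A rough sketch of what that mounting can look like today, driving an s3fs FUSE mount from the SDK. This assumes s3fs is installed in your sandbox template and that `commands.run` is available (recent JS SDK versions); the bucket name and credentials are placeholders, and the docs' storage guide has the exact steps.

```ts
const sandbox = await Sandbox.create()

// s3fs reads credentials from a password file with strict permissions.
await sandbox.commands.run(
  'echo "ACCESS_KEY:SECRET_KEY" > /home/user/.passwd-s3fs && chmod 600 /home/user/.passwd-s3fs'
)

// Mount the bucket into the sandbox's filesystem via FUSE.
await sandbox.commands.run('mkdir -p /home/user/bucket')
await sandbox.commands.run(
  's3fs my-bucket /home/user/bucket -o passwd_file=/home/user/.passwd-s3fs'
)

// Reads and writes under /home/user/bucket are now network requests
// against the bucket; the data itself never persists in the sandbox.
```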
So, I understand you are using VMs with Firecracker. Is there a reason you are not using containers? 01:44:22.120 |
So the question is if there is any reason, or what is the reason, that we are not using containers. 01:44:29.120 |
I have a separate presentation on that, but the short answer is security. When 01:44:39.120 |
customers come to us, 90% of the time they come from an in-house solution that's either a serverless function, 01:44:48.120 |
like a Lambda function, or they are managing a fleet of containers with something like Kubernetes. 01:44:57.120 |
You can sort of make containers secure; it just needs more work. And then you 01:45:06.120 |
want things like running Docker in Docker, and it's hard to run, for example, Docker 01:45:12.120 |
inside Docker. It's doable, but it's just more pain. So ergonomics and security 01:45:20.120 |
are part of the answer. We eventually figured out, hey, if you build the ability to 01:45:27.120 |
get a VM really, really fast, you just get a full computer for your LLM, which 01:45:36.120 |
is a nice win. And there's nothing special you need to do. It's just a Linux 01:45:41.120 |
machine. The second part of this answer is something we don't have implemented 01:45:48.120 |
yet, but we think will be super important, and that's snapshots. With Firecracker and a few 01:45:55.120 |
other VM projects out there, you can very easily make a snapshot of the whole VM at its current 01:46:02.120 |
state, not only the file system but also the memory. And you can do it pretty fast; it takes, 01:46:09.120 |
I think, around 80 milliseconds or something like that. We think it will be super important 01:46:14.120 |
in the future, especially once LLMs get more capable and cheaper. You will want to go into sort of a 01:46:21.120 |
tree-search problem where you can let many agents, versions of your AI app, explore the 01:46:28.120 |
whole space. If you imagine a tree, every node would be 01:46:34.120 |
a snapshot of the VM, and you can come back to it and load it again. So you bring 01:46:40.120 |
a little bit of determinism into a non-deterministic system, because you can just save it at any point 01:46:51.120 |
and come back to it. And you can also save it and prevent your agent from doing anything you don't like. 01:46:59.120 |
And just one last note: something like this is sort of possible with Docker containers. 01:47:09.120 |
It's just that the technology isn't really finished. It's half working, half not working. 01:47:26.120 |
If anyone has any last questions, now is the time. If not, thank you for following me and coding along. 01:47:35.120 |
And if you want to ask any question one-on-one, feel free to catch me in the hallway, 01:47:43.120 |
or send us a message on Discord, or just email me. It's on E2B.