
NVIDIA's NEW AI Workbench for AI Engineers


Chapters

0:00 NVIDIA AI Workbench
1:18 Installing AI Workbench
4:05 Sponsor Segment
5:46 AI Workbench Locations
6:54 Creating and Loading Projects
9:21 AI Workbench Projects
14:18 JupyterLab in AI Workbench
17:46 Using cuDF and Pandas
19:51 Finishing up with AI Workbench

Transcript

Today, we are going to be talking about NVIDIA's new AI Workbench. AI Workbench is a software toolkit designed to help AI engineers and data scientists build in GPU-enabled environments. What AI Workbench does well, in my opinion, is abstract away a lot of the more fiddly parts of data science and local AI engineering, letting us just start building something that can easily be reproduced by whoever we are developing with, and also letting us connect to more powerful remote GPU instances and very easily switch context between our local machine and one of those remote instances.

These are the areas where AI Workbench really seems to excel, from what I have seen of it so far. So in this video, we're going to introduce AI Workbench. I'll show you how to get set up with it, what features it comes with, where we should use it, and we'll walk through a little demo project from NVIDIA showing how to run GPU-accelerated data processing.

Now, installation with AI Workbench is pretty straightforward, but it does require a few things outside of the core AI Workbench installation. A lot of things are handled by AI Workbench, but not all, so let's just go through that process quickly. On Windows, the things that AI Workbench handles for us are, of course, AI Workbench itself, but also Windows Subsystem for Linux 2 (WSL 2), which Workbench installs for us.

Now, we do need to install Docker Desktop separately if we're using Docker with Workbench. You can also use Podman; in that case, you install Podman instead, but otherwise we do need Docker Desktop. If you're on Ubuntu, you would just be using Docker directly, of course. And then the other thing we do need to install, if on Windows, is the GPU drivers.

If you're on Ubuntu, you don't need to do this. So to get started, we will head on over to NVIDIA's AI Workbench page, and we can click this download button. During installation, we're going to go through a few steps. Just follow those. Use the recommended settings unless you have some reason not to.

And then they're also going to ask you whether you want to use Podman or Docker. I chose Docker; most people use Docker. Of course, if you prefer Podman, just go and use Podman; it doesn't really make a difference when you're actually using it. Basically, within AI Workbench, every project that you build will be built within a container instance.

So it's just a question of whether that container runs under Podman or Docker. Now, once we've finished that installation and installed Windows Subsystem for Linux, all that is left is to install the GPU drivers. When running AI and ML processes with GPU acceleration, our code and GPU do not interact directly.

Instead, they interact through several layers. We start with our code, whether that's Python, C, Rust, or whatever you're using. That code interacts with CUDA, which is NVIDIA's software layer for GPU computing (a parallel computing platform and programming model). CUDA translates your code into instructions for your GPU drivers, which then run on your GPU.

Now, we don't need to worry about CUDA; AI Workbench actually abstracts all of that away for us, which is nice, but we do need to install the GPU drivers. The recommended way of doing this depends on what GPU you have. If you're on a GeForce GPU, you should use GeForce Experience.

If you're on RTX, you should use RTX Experience. Now, I'm on RTX; I'm using this Dell mobile Precision workstation on Windows, which comes with an NVIDIA RTX 5000 Ada Generation GPU. These differ a little from the GPUs most of us are probably more familiar with, like the 3090s and 4090s, which are more consumer focused.

These are more focused on professional workloads, which is pretty cool, especially since it manages to fit into this laptop. I'll be honest, it's a fairly beefy laptop, and it's pretty heavy, but it's still a laptop; I can carry it around with me wherever I go. And it's just insanely useful to be able to run my AI tasks locally on a CUDA-enabled GPU.

I mean, that is just incredibly helpful. Now, it comes with 16 gigabytes of GPU memory, which is nice. You're not going to be running a huge 70-billion-parameter LLM with that, but you can run small LLMs, and really the way I would view this is more as a local prototyping instance.

If you need to do anything heavier, you can use AI Workbench to shift across to a remote instance, like an A100 or something a bit bigger and more powerful. So back to our installation: I'm going to go ahead and install RTX Experience, and that will handle the GPU driver installation for me.

Okay. So the first screen that we're going to see once we've gone through our installation is our locations. What we have here is our local machine, which is our local Workbench location. Ideally, we want to run on a machine that has an NVIDIA CUDA-enabled GPU installed.

But thanks to Workbench, we don't actually need that, because we can set up a remote location and just use it whenever we need a GPU. Most AI tasks do require a ton of compute, more compute than most of us have access to locally, and these scenarios are the primary use case for what Workbench does.

Workbench allows us to switch back and forth between a local dev instance and a remote GPU-powered instance. Now, if you do want a little more info on how to set that up, we do have a guide, so I will make sure that is linked in the comments and description of this video.

And I will cover that in a future video as well. So let's go through to our local instance. The first thing we're going to see is this: we can either start a new project or clone a project. Let's just have a look at starting a new project and see what it gives us when we click through.

Okay, so I'm just going to put some random stuff in here. It doesn't matter because I'm not actually going to create a new project. I just want to show you how it works when you create a new project. Basically I want to show you the template containers that they give you.

Okay, so we click through and we have these. These are just a few containers that we have. So we have a basic Python container here. This doesn't include CUDA, for example; it's very stripped down, but it does include JupyterLab. Then we have Python with CUDA 11.7, CUDA 12, and CUDA 12.2.

And here we have PyTorch and this also includes CUDA 12.2. So these are like templates essentially that we can use to begin building from. It's like a foundation that we begin building our app from or whatever it is we're building. Now I'm not going to do that. I'm just going to go to clone a project so I can sort of get started and show you how everything works a little quicker.

So we do need a URL for this. To get that, I'm going to go over to GitHub, to the NVIDIA organization homepage, and to find the Workbench examples they have, you can just type in "workbench" and they will pop up.

Okay, so you see that we have the Workbench examples. All of them are named "workbench-example", so you could type that if you want to filter the list down further. So you have all of these. I'm going to go over to the RAPIDS cuDF one. cuDF gives you CUDA-accelerated data frames, like pandas data frames, but faster thanks to CUDA.

So I'm going to go over here, copy the URL, and enter it in here. It will automatically create a default path to save everything into for me; I'm going to use that. Okay, and pretty quickly I have the project built locally. You can go and click Build if you need to.

That will build the project for you and basically set everything up, ready for you to start interacting with it. But before we start interacting with it, I just want to show you around the project page we find ourselves on right now. This is where we can view and manage our project files, manage the environment, and of course start running JupyterLab, VS Code, or whatever other apps we have installed.

All of this is set up to run within Docker because I set Docker as my preferred container runtime; it would do the same for Podman as well. If we come down here and click on "Build Ready", I just want to show you how this works with our Docker container.

So I'm just going to go up and find the name of our Docker image, which is this here: rapidsai/notebooks. Now, if I go over to Docker Desktop, which we installed earlier, and paste in "rapidsai notebooks", you'll see this pop up.

This is the base image that we're using in our project, and you can actually click "View on Hub" here and it will open in your browser on Docker Hub. And yeah, you can just see some information about the image here. If you come down here, we can see all of the RAPIDS libraries; RAPIDS is a suite of GPU-accelerated data science libraries that NVIDIA has been developing.

And if you come here, we see that there are two types of these RAPIDS images. To begin with, both are based on the NVIDIA CUDA image. On top of that, we have rapidsai/base, which just contains a RAPIDS environment ready for use.

And then the one we're actually using is rapidsai/notebooks. This extends the base image by adding a JupyterLab server, some example notebooks, and other dependencies. And that's why, when you come over here, we can just run JupyterLab: it has already been installed via our Docker image.

Okay, that's great. Now let's come over here. We can see a small description of what we were looking at before with our Docker image: it has CUDA 12 and Python 3.10, it contains Ubuntu 22.04, and it gives a high-level view of what our container is.

And if we scroll down a little bit, we can see we have these packages. It actually states here that the package managers supported are apt, conda, and pip, and if we scroll through these, we can see the packages installed by each of those package managers.

We can also add more here if we'd like. So let's say I want to install PyTorch. I'm not sure what the current version of PyTorch is, but let's just say 1.2.0. For PyTorch we would go with pip or conda, and we can add it to our container like so.

I don't need it, so I'm not going to add it. These packages get stored either within the container itself, or in our requirements.txt here for the pip packages and apt.txt for the apt packages, as sketched below.
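Just to give a feel for what ends up in those two files, here's a rough sketch; none of these entries come from the example project, they're purely illustrative:

```text
# requirements.txt -- pip packages, one requirement per line
torch==2.1.0
scikit-learn>=1.3

# apt.txt -- Ubuntu/Debian package names, one per line
git-lfs
unzip
```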

Let's scroll down a little bit more, and we have our variables and secrets. Variables are environment variables: whatever we add in here will be written into this variables.env file, which is tracked by Git. That means that if you push this to GitHub or GitLab or wherever else, any of the variables you put in here will be included, which you need to be careful with; you don't want to be putting secrets in there.

Obviously, you put those in secrets instead. It just means that whoever is cloning this, colleagues, friends, random people on the internet, will be able to use the same environment variables. Then the other thing is secrets. Secrets are not tracked by Git, so they're not going to end up wherever you push this project; they are stored within your local Workbench software.
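Either way, inside the running container these show up to your code as ordinary environment variables, so reading them from Python is the usual os.environ lookup. A minimal sketch, with hypothetical variable names that aren't defined by this project:

```python
import os

# A non-sensitive setting defined under Variables (lives in variables.env, tracked by Git).
data_dir = os.environ.get("DATA_DIR", "/project/data")

# A sensitive value defined under Secrets (stored locally by Workbench, never committed).
api_key = os.environ.get("MY_API_KEY")
if api_key is None:
    raise RuntimeError("MY_API_KEY is not set; add it under Secrets in AI Workbench.")
```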

Okay. A few other things to go through quickly. We have our mounts; that is your typical Docker container mount. Basically, it's a place on your actual PC where data, models, or whatever else can be stored, so that when you shut down your project and container, that information doesn't get lost.

We have applications. So we can also add VS Code to this, or you can include your own custom apps. I'm just going to use JupyterLab for now, and I'm actually going to turn it on because we're going to use it very soon. Then we can come down to hardware.

Under hardware, if you have multiple GPUs, you can enable the use of multiple GPUs here. Okay, so that's basically everything I wanted to go through there. I'm going to go over to my web browser; JupyterLab has just started up for me, so I'm going to go and open that.

All right, cool. So we're in JupyterLab. If it didn't start up automatically, you can open it by going to localhost:10000 or just clicking the Open JupyterLab button within Workbench. Now let's have a look at what we have running here. Running nvidia-smi, you can see that we have a CUDA version installed here, and that our GPU is being recognized.

So you just want to confirm that you do have a CUDA version listed here and that your GPU is recognized as well. And in the future, when you are running a process and want to check how much memory your GPU is using, you can just run nvidia-smi again to see what's in there.
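From a notebook cell, that check is just a shell call. A minimal sketch; the --query-gpu fields are standard nvidia-smi options, so pick whichever ones you care about:

```python
# In JupyterLab, a leading "!" runs a shell command from the cell.
!nvidia-smi

# Or query just the fields of interest, e.g. driver version and memory usage:
!nvidia-smi --query-gpu=name,driver_version,memory.used,memory.total --format=csv
```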

Cool. So let's see how cuDF works. I'm going to go into the cuDF pandas demo, and we'll just go through a little bit of this. So we have nvidia-smi, which is what I just showed you, just so we can see again that things are as they should be.

And we do want to confirm that cuDF is installed, so we just import cudf, and this should run without any problems. Yeah, perfect. So that looks good.
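If you want a slightly stronger sanity check than a bare import, a tiny cuDF DataFrame will confirm the GPU path actually works; a minimal sketch:

```python
import cudf

print(cudf.__version__)  # confirm the library imports and report its version

# Build a small DataFrame on the GPU and run a trivial operation on it.
gdf = cudf.DataFrame({"x": [1, 2, 3], "y": [10.0, 20.0, 30.0]})
print(gdf["y"].mean())  # computed on the GPU; prints 20.0
```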

What we need to do now is just download some data to play around with. We're going to be using the NYC Open Data portal, specifically the parking violations issued in 2022 dataset. Fascinating dataset, but I'm not too fussed about its contents; it's more a case of seeing what we get when we compare normal pandas, without GPU acceleration, against cuDF, which does have GPU acceleration. And the really nice thing about this is that we literally don't change any of our code to run it.

There's one line where we tell pandas to use the cuDF backend, and that's literally it. And it speeds things up quite a lot, as we'll see in a moment. So we'll just let that download run; I'll skip ahead. Okay, cool. So we have that dataset downloaded, and now we can import pandas.

This is not using GPU acceleration yet; I just want to run this first. In this example notebook from NVIDIA, they go through some example code and show you what is happening. That's fine, and I appreciate it, but I want to jump straight ahead to the bit where we're timing everything and see what we get there.

So we'll go through this. Right now, we're just loading the parquet file, and then we're displaying some things: we're doing a group by, looking at the head, sorting the index, resetting the index. Okay, so the total CPU time for that was 11.1 seconds.
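Roughly speaking, the timed cell looks something like this; the filename and column names are illustrative placeholders rather than copies of NVIDIA's notebook:

```python
# Cell 1: plain CPU pandas, no GPU acceleration involved yet.
import pandas as pd

# Cell 2 (prefixed with %%time in the notebook so it reports the elapsed time):
df = pd.read_parquet("parking_violations_2022.parquet")  # hypothetical filename
(
    df.groupby("Registration State")["Summons Number"]    # illustrative columns
    .count()
    .sort_index()
    .reset_index()
    .head()
)
```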

Let's do some more here; run those, okay, 1.57 seconds. And this one, still running... 3.89 seconds. Okay, so it's not slow, but it does take a little bit of time. Now let's try the GPU-accelerated pandas. This line here is just going to restart our kernel, because we need to restart it before loading cudf.pandas.

And then here we're loading cudf.pandas; that is basically going to replace the backend with cuDF, the GPU-accelerated pandas, when we then import pandas. Actually, I think in the last one, where we timed this, I don't think we imported pandas. No, we didn't.

So let me just move this out of this cell, create a new cell here, import pandas, and then rerun what we saw before. Last time it took 11-something seconds; now it took less than half a second. Right? So that's really not bad at all.
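The accelerated rerun is the same code; the only change is loading the cudf.pandas extension (after the kernel restart, before pandas is imported), which is the documented way to switch pandas onto the cuDF backend. A sketch mirroring the cell above:

```python
# Load the extension first, before pandas is imported in this kernel session.
%load_ext cudf.pandas

import pandas as pd  # now transparently backed by cuDF on the GPU where possible

# Same operations as before; no other code changes are needed.
df = pd.read_parquet("parking_violations_2022.parquet")  # same hypothetical filename
(
    df.groupby("Registration State")["Summons Number"]
    .count()
    .sort_index()
    .reset_index()
    .head()
)
```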

Yep. So before, it took 11.1 seconds; now it takes less than half a second. That's a pretty impressive speedup. Let's try the next one: 204 milliseconds. The last time we ran this, without the GPU, it was just over one and a half seconds, so a pretty big speedup again.

Now let's run this final one. This is taking 0.7 seconds, and last time we ran it, it was almost four seconds. So another big speedup, and we didn't really do all that much, to be honest. It's worth noting that we're using a relatively small dataset here.

With larger datasets, you're going to see much bigger differences; you're basically going to save a lot more time, which is quite nice. So if you do have a CUDA-enabled GPU lying around, or you're willing to go and set up a remote instance, it's worth it, depending on what you're doing.

So back to our project page, I'm going to go ahead and close this. Typically, if we were working on a project here, we'd go to Commit and commit everything we've done. I am not going to do that because I'm not actually working on this project, but I am going to come over here.

I'm going to shut down JupyterLab, and then I can just go ahead and close this. I'm actually going to delete the project because I'm not using it again. So once you are done, you can delete it and save some space on your computer.

And with that, I'm done. That is it for this introduction to AI Workbench. As you can see, it's very much a managed solution that, for me, feels best suited to data scientists and AI engineers who are less familiar with things like CUDA, Docker, and containerization, and who just want to get started working on a project and pretty easily switch across to remote GPU instances without needing to worry about setting all that up, because it can be kind of annoying, especially if it's relatively new to you.

So I think this is actually a pretty good solution for a lot of people. If you're a developer working with a development team on an AI project, it probably wouldn't be for you, unless you're doing some quick prototyping; but within the space of data science, and in some cases AI engineering, I think it's probably pretty useful.

Okay, so that is it for this video on AI Workbench. I hope all this has been useful and interesting. So thank you very much for watching and I will see you again in the next one, bye.