back to index

AWS Deep Learning setup


Chapters

0:0 Intro
0:45 Request service limit increase
2:10 Python installation
3:25 cygwin installation
4:50 security keys
5:10 create user
6:20 configure AWS
7:20 download script
7:55 run script
9:50 Nvidia SMI
11:20 Testing

Whisper Transcript | Transcript Only Page

00:00:00.000 | This video explains how you can get set up with a GPU-based server on Amazon Web Services.
00:00:08.000 | The server is already for you to start running deep learning models, and we will be using it in the Data Institute Deep Learning Certificate.
00:00:17.000 | In order to use AWS, of course, you will need to sign up if you haven't already.
00:00:23.000 | You can go to aws.amazon.com and click on "Create an AWS Account".
00:00:32.000 | And then just fill in the information as it's requested from you.
00:00:36.000 | Once you've done that, you will find yourself on a page that looks something like this.
00:00:42.000 | This is your Amazon Web Services console.
00:00:46.000 | Unfortunately, AWS does not allow you to create GPU-based servers without requesting special access.
00:00:55.000 | So the next thing you'll need to do is to go to this address, aws.amazon.com, contact us, EC2 request, and ask for a service limit increase.
00:01:08.000 | For region, choose US West Oregon, and for instance type, choose G2 to X large.
00:01:25.000 | And say that I want a new limit value of 1, because by default you get 0.
00:01:31.000 | And then in the use case description, you can explain what you're using this for.
00:01:37.000 | Choose your contact method and press submit, and they should get back to you within 24 hours.
00:01:43.000 | If you've had an AWS account for a while and have been paying your bills,
00:01:47.000 | you should find that you can skip this step because you already have access.
00:01:52.000 | Once you have access through AWS, we can get our instance up and running.
00:01:59.000 | Everything that we do will be done through the console,
00:02:03.000 | and specifically we'll be using the AWS command line interface, which is a Python-based interface.
00:02:11.000 | For all kinds of reasons, I very strongly suggest that you use the Python distribution called Anaconda,
00:02:17.000 | regardless of whether you're on Windows or Mac or Linux.
00:02:22.000 | In our course, I'm going to be assuming that you have Anaconda installed,
00:02:26.000 | and specifically that you have it installed under your user account.
00:02:30.000 | I don't suggest trying to change the root Python or the main Python if you have one as a Mac or a Linux user.
00:02:39.000 | Installing Anaconda is very simple.
00:02:41.000 | You just follow the instructions for each of the different operating systems,
00:02:46.000 | and choose the Python 2.7 version and the 64-bit installer.
00:02:52.000 | Once you've got that set up, the only other thing you need to do if you're using Windows
00:02:58.000 | is to make sure that you have a Bash shell installed.
00:03:02.000 | Specifically, I suggest that you use Sequin.
00:03:06.000 | The reason that you need to do this is that some of the scripts we will be using are written in Bash,
00:03:12.000 | rather than PowerShell, which is the default in Windows.
00:03:17.000 | So you'll need to install this.
00:03:19.000 | Installing it is very simple. You just go to the Sequin homepage,
00:03:22.000 | click on Setup x86-64, and it will pop up a screen that looks like this.
00:03:33.000 | I suggest you change Category to 4, and just make sure that you have WGet selected.
00:03:40.000 | So if you go to WGet, and just to make sure that it doesn't say "Skip" or "I'd install".
00:03:47.000 | Once you've done that, you can click Next, and finish off that wizard, and that will install Sequin.
00:03:55.000 | Once you've done that, you can easily run Sequin by just typing Sequin in your Start Menu.
00:04:02.000 | If you have Mac or Linux, you'll already of course have a terminal with Bash installed,
00:04:09.000 | and so you can just go straight to there now.
00:04:11.000 | So here we are at our console. We're running Bash, and in order to install the AWS command line tools,
00:04:20.000 | you simply type pip install awscli.
00:04:27.000 | In my case, you can see it's already installed, so it's ready to use.
00:04:32.000 | The AWS command line interface creates a program called AWS,
00:04:41.000 | and in order to use AWS, you first of all need to configure it with your security keys.
00:04:47.000 | So the next thing we're going to do is to set up our security keys.
00:04:52.000 | So go to the AWS console. You can always get there by simply typing console.aws.amazon.com.
00:05:02.000 | And specifically, we're going to need to create a user for ourselves.
00:05:05.000 | We do this in this section called Identity and Access Management.
00:05:11.000 | And here we are in the IAM, Identity and Access Management section, where we can create a user for ourselves.
00:05:17.000 | So we say create new user. We enter the name of the user we want to create.
00:05:24.000 | Leave this tick to generate an access key for each user, and say create.
00:05:30.000 | And very importantly, now say show user security credentials,
00:05:35.000 | and copy and paste this somewhere safe, because you're going to need them,
00:05:39.000 | and you won't be able to get them again later.
00:05:43.000 | We can now close that, and that's created our new user.
00:05:48.000 | We now need to make this user an administrator.
00:05:51.000 | So I click on the user, and I choose permissions,
00:06:01.000 | and I say administrator access, which is the first one here, and I say attach policy.
00:06:08.000 | Okay, we have now made this user an administrator, and we're ready to go ahead
00:06:13.000 | and use this user to create our GPU server.
00:06:21.000 | In order to do that, we first of all configure AWS by typing AWS configure.
00:06:30.000 | As you can see, it asks us for our access key ID, and here it is.
00:06:34.000 | This is the thing that we just selected. So paste that into access key ID,
00:06:38.000 | and press enter, and then secret access key.
00:06:43.000 | Copy and paste that whole line as well, and press enter.
00:06:48.000 | For default region name, choose us-west-2.
00:06:55.000 | And for default output format, choose text.
00:07:00.000 | AWS is now configured, and we only have one more step left to go.
00:07:07.000 | We need to use a script that we have set up for you that will create
00:07:13.000 | and set up your new deep learning server.
00:07:17.000 | So let's go and get that script.
00:07:20.000 | We can get it from here, www.platform.ai/files,
00:07:26.000 | and it's called setupp2.sh. The P2 instances are the brand new type of AWS
00:07:33.000 | instances which are particularly good for deep learning.
00:07:39.000 | So we can use wget to download that by typing wget and pasting the path.
00:07:48.000 | There it is, and you'll see that we now have that script.
00:07:56.000 | Let's now go ahead and use that script to create our deep learning server.
00:08:00.000 | So we just type bash, setupp2.sh, hit enter,
00:08:06.000 | and then we have to wait a minute or so as it runs through all of the configuration steps,
00:08:12.000 | asking Amazon Web Services to create each piece of this puzzle.
00:08:18.000 | It's going to set up a virtual network for us.
00:08:23.000 | It is going to set up a security key for us.
00:08:27.000 | It is going to set up an IP address for us.
00:08:30.000 | And then it's going to connect all that up to our requested server.
00:08:37.000 | AWS calls it an instance.
00:08:39.000 | So you can see it's setting up the instance now and waiting for it to start.
00:08:43.000 | The permission denied error I'm getting on my AWS key there
00:08:47.000 | is simply because I have run this script before,
00:08:50.000 | so it's going to regenerate an SSH key when I already have one.
00:08:56.000 | So if you get that error message, don't worry,
00:08:58.000 | it just means that you've run this script once before.
00:09:04.000 | So the script has finished running,
00:09:06.000 | and you can see it's created a number of things,
00:09:09.000 | and let us know the details to each of them as well.
00:09:13.000 | I strongly suggest that you copy and paste these somewhere convenient.
00:09:22.000 | It will be useful to use them again later.
00:09:25.000 | But most importantly, let's try now connecting to our new server.
00:09:33.000 | So I'm going to copy the connect line here,
00:09:37.000 | paste it into our console,
00:09:44.000 | type yes to say we are happy with the authenticity of the host,
00:09:49.000 | and we are in.
00:09:52.000 | Now that we've logged in,
00:09:54.000 | we can check that the GPU is working correctly by typing nvidia-smi.
00:10:01.000 | This is a handy utility which just checks exactly what's going on.
00:10:05.000 | It shows us that indeed the nvidia driver is running successfully.
00:10:10.000 | It has found the graphics card,
00:10:13.000 | shows us how much GPU memory we're using,
00:10:16.000 | and whether anything is currently using that GPU.
00:10:20.000 | So that's all looking pretty good.
00:10:23.000 | The only other thing that I suggest you do is I made a slight mistake
00:10:27.000 | when I first set up this AMI,
00:10:31.000 | which is I had a file left over which I should have deleted.
00:10:36.000 | So I suggest you delete it by typing sudo rm.bash_history,
00:10:43.000 | and the reason for that is that otherwise it won't save your history,
00:10:46.000 | which will be pretty inconvenient.
00:10:49.000 | So the last thing I'm going to do is just make sure
00:10:51.000 | that we can actually run a notebook successfully.
00:10:54.000 | So we'll type jupyter notebook.
00:11:03.000 | And now the notebook is up and running.
00:11:05.000 | You can see that it is running at port 8888.
00:11:10.000 | So let's test it.
00:11:12.000 | And we were told the URL of our instance earlier on,
00:11:17.000 | and hopefully you copied it like I suggested.
00:11:20.000 | So now copy that URL, go to your web browser,
00:11:24.000 | and go to that URL, and then colon 8888,
00:11:29.000 | since that's where we've got our notebook running.
00:11:33.000 | As you can see, it's password protected,
00:11:36.000 | and the default password that I've put in is dl_course.
00:11:44.000 | So type dl_course and log in.
00:11:54.000 | Let's go to the notebooks directory, nbs,
00:11:58.000 | and let's create a notebook just to make sure it all works.
00:12:01.000 | Okay, new python condoroute.
00:12:09.000 | Let's try doing a simple calculation.
00:12:12.000 | I believe that is accurate.
00:12:14.000 | Let's try importing theano,
00:12:17.000 | which is the underlying library we're using for accessing the GPU.
00:12:27.000 | Great, that's working well.
00:12:29.000 | And then secondly, let's make sure we can use Keras,
00:12:32.000 | which is a wrapper on top of theano
00:12:34.000 | that we'll be using throughout the course.
00:12:38.000 | That's working well as well.
00:12:40.000 | Okay, well it looks like everything is set up correctly,
00:12:43.000 | and you're ready to start using it.
00:12:45.000 | When you're done, don't forget to shut down your instance
00:12:50.000 | so that you don't get charged for it.
00:12:52.000 | You can do that by going back to your AWS console,
00:12:56.000 | choosing the instance that you want to shut down,
00:13:01.000 | and choose instance state, stop.
00:13:05.000 | Later on, I'll show you how to do these things through the command line
00:13:09.000 | to save Houston time.