Live coding 3
Chapters
0:00 Catch-up questions from last session
6:11 `settings.ini` and fastbook setup (more advanced)
8:19 The `$PATH` environment variable
12:22 Creating and using a conda environment
18:27 Creating a Paperspace notebook
33:12 The Python debugger
43:08 Installing pip packages into your home directory
49:21 Persistent storage, mounted drives, and symlinks
56:27 Paperspace has different Python environments by default
69:34 Creating a Paperspace notebook with everything set up automatically
76:35 Copying SSH keys to Paperspace to communicate with GitHub
00:00:00.000 |
All right. Does anybody have anything they wanted to ask about or talk about, or 00:00:10.600 |
anything else, before -- >> I had a quick question. Yeah, it's quick. When we were talking yesterday 00:00:18.120 |
and I asked you about the environments, you seemed to feel very strongly that you should 00:00:21.600 |
work in the base environment, and I've been rolling it over in my head. And just when 00:00:27.000 |
I think about the mistakes that I've made and how I've screwed up environments and gotten 00:00:29.980 |
conflicts and stuff like that, I was wondering why you feel so strongly about that. 00:00:34.720 |
>> Sure. I mean, we'll talk about it more when we kind of get to environments, but because 00:00:38.080 |
we haven't discussed them yet, but, you know, briefly, you know, environments are basically 00:00:44.440 |
separate folders with separate installations of Python and Python libraries and so forth. 00:00:52.160 |
They're often used for kind of keeping separate projects separated with different sets of 00:01:01.440 |
kind of dependencies or versions of Python or whatever. And they certainly have a role 00:01:07.840 |
to play for advanced users. I almost never use them. I mean, very, very occasionally 00:01:15.560 |
use them. But my feeling is the most important thing is to be able to rapidly iterate and 00:01:23.560 |
experiment. And I kind of want my projects to live together as a whole, as a bunch 00:01:30.840 |
of things which all help each other and come together. So I don't like the idea of like, 00:01:36.080 |
oh, I'm working on this project now. I go over there and everything's kind of new, you 00:01:42.960 |
know. So, instead, I really like to get very fast and very good at just quickly just going 00:01:50.640 |
rm -rf mambaforge and it's gone, and run setup-conda.sh and it's back, and have a single 00:01:57.560 |
script that, if I need one -- I don't even need a script, I'll just go mamba install -c 00:02:02.880 |
fastchan fastbook and that installs everything that I need and I'll go. So I kind of like 00:02:10.400 |
never want to be in a situation where anything on my computer is, you know, like, it's 00:02:16.240 |
working, but I don't know how I got to a point that it's working and I don't want to touch 00:02:19.720 |
anything lest I undo that, you know. So I'm more in the kind of chaos monkey side of like 00:02:27.920 |
explode things from time to time intentionally and be really good at putting them back to 00:02:31.640 |
where they were, I guess. Nowadays, I never have problems, basically, with dependencies 00:02:41.280 |
or weird things going on in Python or whatever because I just can type, you know, like probably 00:02:46.560 |
every few weeks, I'll just throw it away and install it just to try something out for teaching 00:02:52.360 |
or whatever. I always feel fine. This morning, I did use an environment because I specifically 00:03:00.320 |
wanted to test something on a different version of Python and I wanted to check that it would 00:03:05.000 |
install into somebody's fresh new environment and so I used it for that. I think it's useful 00:03:13.840 |
if you are like installing some library where they've done a poor job of keeping their dependencies 00:03:22.120 |
up to date. So you need like Python 3.6 and sentencepiece 1.8 and, I don't know, old versions 00:03:28.360 |
of things in which case you want to be able to go and get all these exact versions of 00:03:34.160 |
things. But my approach, for my projects, is to not pin versions, not pin dependencies. 00:03:44.400 |
I want anybody to be able to install my work into whatever they're doing and work with 00:03:50.520 |
all their other programs that they're running and libraries that they're using without anything 00:03:54.800 |
getting messed up. Unfortunately, not everybody works that way, but that's how I, you know, 00:04:02.680 |
try to make other people's life easier and so therefore any programs you use from me, 00:04:07.520 |
you'll be able to install into your base environment without messing anything up or install into 00:04:11.160 |
any environment without messing things up. >> And when -- sorry, just a quick follow-up. 00:04:17.920 |
If you're installing into like a new computer or whatever, would you use -- would you install 00:04:25.960 |
fastbook or would you install fastai? >> It depends. I would just probably install 00:04:33.480 |
fastbook because fastbook installs fastai, which installs NumPy, pandas, Matplotlib. It 00:04:41.160 |
also installs transformers, datasets, sentencepiece. I think everything except 00:04:49.960 |
sentencepiece. There's no reason it shouldn't install sentencepiece. >> It didn't yesterday. 00:04:58.400 |
>> Yeah. I was just remembering that. >> So, Jeremy? 00:05:03.760 |
>> Yeah. >> So, if you're blowing it away and you're 00:05:07.560 |
basically using a new OS install as like people do with environments, how are you keeping 00:05:15.000 |
track of your things like RSA keys, et cetera? How are you not blowing those away? 00:05:18.800 |
>> Those are not part of a conda environment. So, that's fine. They sit there in my home 00:05:25.080 |
directory. It's just that mambaforge directory or Anaconda directory, depending on what you're 00:05:30.600 |
using. I just delete that. >> Cool. >> Jeremy, you talked about uninstalling 00:05:37.320 |
always in this environment. Yesterday I was trying to -- I messed up one of the dependencies. 00:05:43.840 |
What are the steps for uninstalling usually? >> Go to your home directory and type rm -rf 00:05:49.880 |
mambaforge. >> Only that is required? Okay. I tried that. 00:05:54.920 |
>> And then close your terminal and reopen it. I remember the other day, Archie didn't 00:05:59.040 |
do that step and so it didn't install properly. I'll show you a quick trick. Yeah. Sorry, 00:06:13.720 |
this is a little more advanced than normal, but that's okay. So, this is slightly confusing, 00:06:26.360 |
but the fastbook PyPI and conda installer actually comes from a repo called course20. 00:06:34.960 |
And it's here. It doesn't really contain any code. It contains a few utils, but that's 00:06:45.280 |
actually like search_images_bing -- this has got nothing to do with what we're using 00:06:53.720 |
it for, something that returns an image of a cat. But actually, the key thing is it's 00:07:02.080 |
got a settings.ini file which contains a list of requirements. And so, if I now put 00:07:10.720 |
this up on PyPI and conda, then if I say conda install or pip install fastbook, then 00:07:18.480 |
that's one quick way of just getting all these. Or you could create a tiny little script that 00:07:24.880 |
goes, you know, conda install and then with these things in it. But, yeah, you want some 00:07:31.360 |
way to get yourself into the basic setup instantly. Oh, here we are. This is why it 00:07:39.000 |
didn't give me sentencepiece. Sentencepiece only comes with pip. And that's because when 00:07:42.400 |
I set this up, I didn't have fastchan. And so I didn't have sentencepiece in conda. 00:07:49.080 |
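To make the requirements idea concrete, here is a minimal sketch of reading requirement lists out of a settings.ini-style file with Python's standard configparser. The file contents, the key names `requirements` and `pip_requirements`, and the package lists are illustrative assumptions, not a copy of the actual file shown on screen.

```python
import configparser

# Hypothetical settings.ini contents, for illustration only.
SETTINGS = """
[DEFAULT]
lib_name = fastbook
min_python = 3.7
requirements = fastai pandas matplotlib
pip_requirements = sentencepiece
"""

def read_requirements(text):
    """Return (shared, pip_only) requirement lists parsed from settings.ini text."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    section = cfg["DEFAULT"]
    shared = section.get("requirements", "").split()
    pip_only = section.get("pip_requirements", "").split()
    return shared, pip_only

shared, pip_only = read_requirements(SETTINGS)
print("installed by both pip and conda:", shared)
print("installed by pip only:", pip_only)
```

A package listed only under the pip key would be missing from a conda install, which is the kind of gap described above for sentencepiece.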
>> Jeremy, I think you've got minimum Python there, 3.6, but I think the fastai repo has 3.7 00:07:57.040 |
in it. >> Yeah, yeah, that's right. Which I suspect 00:08:00.160 |
it probably overrides. But, yeah, so here's a good use of the GitHub GUI, right? I want 00:08:07.440 |
to just change it while I'm looking at it. And we're done. Cool. Yeah. Okay. So, that's 00:08:24.160 |
a good question. And, you know, we'll -- at some point, I'm sure we'll need to create 00:08:28.480 |
an environment for something. And we'll talk more about that. I guess, like, maybe something 00:08:33.920 |
else just -- well, since we are, as I said, this is a bit more advanced. And people can 00:08:37.480 |
totally skip this. But just -- I mean, it's probably worth understanding what conda/mamba 00:08:49.280 |
is and how it works, right? So, you know, remember the other day, I typed which python. 00:09:00.720 |
And I saw that I'm getting -- that Python is coming from this directory. So, like, one 00:09:05.120 |
obvious question is, well, how does -- why is it coming from this directory? And the 00:09:11.480 |
reason why it's coming from this directory is -- let me just open up this a bit more 00:09:17.840 |
so I can see more people. There we go. Is that Linux -- I mean, I shouldn't say Linux, 00:09:26.360 |
you know, Bash and pretty much all shells. They use the concept of something called the 00:09:30.600 |
path. And the path is the list of places to look for programs to run. And the path lives 00:09:37.720 |
in something called an environment variable. And an environment variable is just like a 00:09:41.160 |
Python variable, but it's a variable that lives in your shell. And you can -- you can 00:09:46.160 |
print them out. So, instead of print in your shell, you type echo. And then an environment 00:09:51.400 |
variable -- normally if I just say echo something, it just prints it, right? So, if I want to 00:09:58.160 |
echo the contents of an environment variable, I have to put a dollar sign before it. A dollar sign means 00:10:02.280 |
this is a variable that I'm printing. And so, the variable PATH -- there it is, right? 00:10:08.480 |
And so, you can see that this is a string -- a colon-separated string. And in my colon-separated 00:10:14.680 |
string, there's something which is /home/jhoward/mambaforge/bin. And so, that directory, if we take a look 00:10:22.480 |
at it, contains lots of programs. And one of those programs is python. Okay? So, that's 00:10:38.080 |
why it is when I type python that -- oops, I didn't mean to do that. When I type python, 00:10:49.800 |
that's the Python that it runs. So, here's a little trick. I want to type which python. 00:10:56.680 |
And I'm so lazy, I couldn't even bother typing python. So, if you remember, double exclamation 00:11:01.320 |
mark means the previous command. So, that's going to be which python. So, it's worth looking 00:11:07.200 |
and seeing, like, well, what is this mambaforge directory? So, the mambaforge directory, for 00:11:14.560 |
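The lookup described here can be sketched in a few lines of Python. This is a simplified illustration of how a shell walks the path, not the shell's actual implementation, and `find_program` is a hypothetical helper name.

```python
import os

def find_program(name, path=None):
    """Return the first executable called `name` found on `path`, else None."""
    if path is None:
        path = os.environ.get("PATH", "")      # what `echo $PATH` prints
    for directory in path.split(os.pathsep):   # ":" separated on Linux and macOS
        if not directory:
            continue                           # skip empty entries in this sketch
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate                   # first match wins
    return None

# Roughly what `which python` reports; prints None if nothing called python is on the path.
print(find_program("python"))
```

Because the first match wins, the order of directories in the path decides which copy of a program runs.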
those of you that have kind of seen UNIX-type directories before, it contains a bin directory 00:11:23.120 |
and an etc directory and a lib directory. And this basically looks very similar to my 00:11:30.760 |
root directory. And so, basically, you know, a conda or mambaforge or whatever root directory 00:11:40.640 |
is kind of a copy of a Linux, or even actually a Mac, root directory. Contains very similar 00:11:47.400 |
things, etc, usr, and so forth. And what happens is that the thing that it puts 00:11:57.800 |
into our .bashrc, the script that automatically gets run, this thing here. It runs a little 00:12:05.880 |
shell script that sets some environment variables. And one of the environment variables it sets, 00:12:10.720 |
for example, is the PATH environment variable. And it adds this to PATH. And it does something 00:12:16.760 |
similar to kind of make all the libraries work as well. And so, we mentioned how you 00:12:23.880 |
can create a totally separate, you know, environment, a totally separate place you can work that 00:12:31.440 |
has its own copy of Python and libraries and stuff. The way you do that is you go mamba create -n, 00:12:40.140 |
give it a name, and then say, what do you want to have in it? So, let's say, OK, I want 00:12:44.320 |
to have Python in it. I don't normally like to have the latest Python. So, let's get something 00:12:55.680 |
before 3.10. And I also want fastcore in it. So, that's going to create -- so you can 00:13:03.760 |
go mamba create or conda create. I actually already have that because I used it this morning, 00:13:10.160 |
as I mentioned. So, I'll remove that automatically and create a new one. And so, that's going 00:13:17.560 |
to set up a new environment, which we will take a look at. So, currently, what it's doing 00:13:27.080 |
is it's downloading from the internet a list of all of the conda packages that are available 00:13:33.840 |
from a channel called conda-forge, which is the main channel that mambaforge uses. And 00:13:40.480 |
it says, OK, I'm going to install Python and fastcore. To install those things, I'm going 00:13:44.920 |
to need these other things as well. That sounds fine. You'll see it's cached. So, basically, 00:13:52.120 |
one of the nice things about mamba and conda is that it kind of saves the archives that 00:13:58.200 |
you've downloaded. It doesn't have to redownload them. So, as it now says, you can activate 00:14:03.560 |
this environment by typing mamba activate temp or conda activate temp. So, that's changed 00:14:10.320 |
my shell. If I now say which python, it's getting it from a new place. And it's getting 00:14:19.000 |
it from the same place as before, /home/jhoward/mambaforge. But it's now getting it from envs/temp. And that's 00:14:27.880 |
because this mambaforge directory has a directory called envs. And that envs directory is a 00:14:35.080 |
folder that contains each of those environments. And it's really interesting to see what's 00:14:40.040 |
in them. Because, look, it's yet another copy of the kind of things you would see in the 00:14:46.200 |
root of a Linux installation. So, that's why it works, right? It's basically yet another 00:14:53.240 |
copy. So, for example, we'll see that in envs/temp/bin, here's another copy of Python. 00:15:00.360 |
So, if I type python, it's running that Python. And it's got its own 00:15:05.880 |
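As a rough sketch of why activating an environment changes which Python runs: if the environment's bin directory is placed first on the search path, its copy is found first. The directory layout below is built in a temporary folder purely for illustration, the "python" files are empty stand-ins, and the example assumes Linux or macOS.

```python
import os
import shutil
import stat
import tempfile

def make_fake_program(directory, name):
    """Create a tiny executable file so it can be found on a search path."""
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path

with tempfile.TemporaryDirectory() as root:
    base_bin = os.path.join(root, "mambaforge", "bin")
    env_bin = os.path.join(root, "mambaforge", "envs", "temp", "bin")
    os.makedirs(base_bin)
    os.makedirs(env_bin)
    make_fake_program(base_bin, "python")
    make_fake_program(env_bin, "python")

    base_path = base_bin                                # before activating
    activated_path = env_bin + os.pathsep + base_path   # environment directory goes first

    before = shutil.which("python", path=base_path)
    after = shutil.which("python", path=activated_path)
    before_is_base = os.path.dirname(before) == base_bin
    after_is_env = os.path.dirname(after) == env_bin
    print("before activate, found in base bin:", before_is_base)
    print("after activate, found in envs/temp/bin:", after_is_env)
```

The real activate script does more than this (it also sets other environment variables), so treat it only as a picture of the path ordering.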
set of libraries. So, it's using those libraries. So, it's -- yeah, it's really neat. And you can 00:15:14.600 |
install compilers. You can install, you know, any binaries you like. You can install Rust, 00:15:19.960 |
you know, a separate copy of Jupyter, whatever. By the way, something that's quite neat, not as 00:15:26.520 |
important as it used to be, but these are actually using something called hard links to create these. 00:15:32.680 |
So, they're actually not even separate copies. So, it's like not even using disk space. 00:15:37.400 |
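A small sketch of what a hard link is, using only the standard library. It assumes a filesystem that supports hard links (most Linux and macOS filesystems do) and says nothing about how conda itself lays out its files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    original = os.path.join(root, "original.txt")
    linked = os.path.join(root, "linked.txt")

    with open(original, "w") as f:
        f.write("shared contents\n")

    os.link(original, linked)   # hard link: a second name for the same data on disk

    same_inode = os.stat(original).st_ino == os.stat(linked).st_ino
    link_count = os.stat(original).st_nlink

    os.remove(original)         # removes one name; the data stays while another name exists
    with open(linked) as f:
        survived = f.read()

    print("same inode:", same_inode)
    print("link count:", link_count)
    print("contents after deleting the original name:", survived.strip())
```

Both names refer to one stored copy, which is why hard-linked files do not take extra disk space.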
So, yeah, the whole thing is really quite nifty. So, yeah, so, basically, when you go activate, 00:15:43.320 |
it's -- in fact, let's take a look at my path. It changes my path, see? So, now, 00:15:49.960 |
this comes first. >> Maybe we could look at hard links. I find hard links quite useful for myself. 00:15:57.080 |
When I have a lot of data in a folder, and I want to run something on this data from another place, 00:16:03.160 |
I just create a hard link. >> Do you create symlinks or hard links? 00:16:07.480 |
Because normally you'd use symlinks for that. >> Yes, that's the word. That's the wrong 00:16:12.280 |
expression. I create soft links. >> Symlinks, yeah. Yeah, we will get to symlinks. 00:16:18.280 |
Let's wait until we kind of need them, maybe, yeah. 00:16:21.960 |
Okay. So, to go back to activating the base environment, I just type conda activate, 00:16:28.760 |
and now I'm back in my main environment. So, yeah, hopefully that explains a little bit about 00:16:38.360 |
what environments are -- and why you might use them. There's a certain way of developing software, 00:16:53.240 |
which is particularly common in the JavaScript world, where you freeze the exact versions of 00:17:03.480 |
everything at a particular point in time, and so you end up with things like -- well, in the Ruby 00:17:11.320 |
world, you end up with a Gemfile.lock file. In the Python world, you end up with a requirements.txt 00:17:16.920 |
file. In the JavaScript world, you end up with your package.json file. You know, this is something 00:17:24.600 |
that I would strongly recommend trying to avoid as a data scientist when you freeze particular 00:17:31.720 |
version numbers. It makes it almost impossible to mix and match things from different places, 00:17:39.160 |
you know, this library from here and this thing from here, and, you know, you end up going into 00:17:43.240 |
this huge complex ecosystem of Docker containers and, you know, trying to find ways to make that 00:17:53.880 |
all work can get quite overwhelming, and you can actually entirely avoid it by just, you know, 00:18:03.800 |
using a single base environment and keeping your libraries up to date and having good tests and 00:18:10.360 |
knowing when a release has broken something and so forth. You know, it's not always the way, 00:18:15.080 |
but this is my suggestion for, you know, rapid iteration data science is to do things this 00:18:23.160 |
particular way. All right. So then we've got our own computer running, and it's nice to 00:18:37.320 |
be able to use Python on your own computer because, you know, you can whip out a 00:18:42.120 |
laptop anywhere, you don't have to be on the internet, you don't have to start a server 00:18:47.240 |
somewhere, it's nice to be able to quickly play with things. And, you know, I think, like, ideally, 00:18:54.280 |
a large amount of the time you're not using the GPU because a large amount of the time hopefully 00:18:58.760 |
you're, like, exploring results or you're testing out things in really small samples that don't need 00:19:03.720 |
a GPU or, you know, hopefully you can do a lot of stuff on your computer. At some point, you need a 00:19:10.360 |
GPU. And my view is that you should try to use a GPU in a way that feels as much like your computer 00:19:22.440 |
as possible, but doesn't cost you much, if any, money. So, at the moment, my view is by far the 00:19:33.400 |
best option for that is Paperspace. Paperspace is actually a company that has a few different 00:19:39.640 |
products and specifically it's a product called Gradient. Gradient is, in fact, specifically it's 00:19:49.240 |
Gradient Notebooks. So, Gradient Notebooks is basically something where you can get a free GPU 00:19:59.160 |
server, which behaves a lot like what we've just been working with, you know, you'll get a terminal 00:20:14.680 |
Okay. So, Paperspace has this concept of projects. I have no idea what they're useful 00:20:35.240 |
for. I just have one project, so I'll just go ahead and click on it. They're just, they're the things 00:20:40.840 |
that contain your, they call them notebooks, but these are basically servers, right? These are some 00:20:45.240 |
servers. Now, there's a few options for, like, paying them money. And if you can afford it, 00:20:59.640 |
you know, this is such a good deal, the $8 a month. Not only because, as you'll see, you get 00:21:04.920 |
some pretty good GPU options, and you can keep things private, but you also get more persistent 00:21:11.160 |
storage. So, that means you can store things between sessions. Now, the reason this is really 00:21:18.440 |
important is because these aren't actually my servers. Paperspace has not put aside servers 00:21:24.200 |
for me to use. These are kind of small little saved snapshots, basically. And it's going to 00:21:36.840 |
kind of create a new computer each time I fire one of these up. And so, it's really nice that, 00:21:47.000 |
as you go from, you know, instance to instance to be able to access the same files each time. 00:21:53.720 |
So, let's start from scratch, because that's what we're doing. Okay, so, 00:21:59.320 |
it says select a runtime. Basically, what this is going to do is it's just going to pre-populate 00:22:05.480 |
your server with some files. And so, if you choose the fastai one, then you'll have the main stuff, 00:22:13.240 |
you know, basically everything you need for the book pre-installed. So, let's do that. 00:22:18.760 |
And so, as you can see, there's various free options and various paid options. 00:22:29.320 |
So, I'll use there. So, basically, you know, important things to know about is how big is the 00:22:39.000 |
GPU? These are all pretty good. 8 or 16 is great. 16 is obviously better. And then, how fast is it? 00:22:48.760 |
P, that'll probably be a Pascal card. So, that's a couple of generations old. So, 00:22:54.680 |
it's like quite a lot slower than modern cards. RTX is totally up-to-date card. 00:22:59.560 |
But this one's got a lot more GPU. So, I'm just going to pick this one. 00:23:04.680 |
Six hours. So, it's going to, you know, if you're paying for it, make sure you've got auto shutdown 00:23:11.800 |
set to something sane. Otherwise, you'll end up paying for it for a long time. 00:23:17.400 |
You can easily share notebooks with other people by turning public access on, which is by default. 00:23:23.000 |
There's a few advanced options here. I don't think we particularly need to touch them, to be honest. 00:23:30.360 |
One thing I'm just going to note now is that it's going to run a command called run.sh. So, 00:23:37.720 |
just note that down because we're going to check it out later. And you'll also see it's actually 00:23:42.280 |
going to clone a git repo. So, I mean, one thing you could do is if you've got a fork of fastbook, 00:23:50.840 |
then replace fastai with your username and you're going to get your forked version. 00:23:55.640 |
Okay. So, I'll start. So, yeah. So, I don't know. I find this a bit confusing that it says 00:24:03.160 |
notebook. It's not a notebook, right? It's starting a server for us. And that server is going to run 00:24:11.240 |
Jupyter Notebook automatically. So, the thing that appears here is the Paperspace GUI. 00:24:23.560 |
I don't love it, honestly. So, I don't really use it very much. The one thing that you do 00:24:36.760 |
particularly want it for, though, is to be able to stop your server when you're finished. 00:24:39.640 |
Especially if you're paying for it. I mean, you should do it anyway because there's no point 00:24:44.840 |
using their server hours. So, what I'm going to do is I'm going to just copy this URL and create a 00:24:51.000 |
second tab and paste it just so that I've got two versions of that. So, this one here is just going 00:24:56.600 |
to be sitting here and I can go back to it and click stop later. In fact, when I close this tab, 00:25:01.720 |
it will remind me that I have to click stop. So, this is a good way to not accidentally 00:25:09.080 |
forget to stop your server. >> That auto shutdown, does it happen if you're inactive, or would that 00:25:18.360 |
shut down regardless? >> That shuts down regardless. Because they don't really know 00:25:23.800 |
if you're doing things. They don't really have any telemetry or anything. Oh, by the way, 00:25:30.760 |
this five hours seems to be truncated down. So, it's actually 5.9 hours. 00:25:38.680 |
That's just something I noticed. It's a bit of a bug, I guess. 00:25:41.080 |
Yeah, so, in five hours' time, it's going to shut down regardless. 00:25:45.320 |
So, the first thing I do actually is I click this button, which gives us proper JupyterLab. 00:25:54.360 |
And then I don't have to use the slightly crummy GUI anymore. And this is also nice because now 00:26:01.320 |
we're going to be using something that's just like what we have on our computer, which is the goal. 00:26:07.320 |
Okay. So, here's JupyterLab. And you can see that the book is here. 00:26:16.760 |
And yeah, this is basically the Git repo that was automatically filled in for us as being 00:26:32.040 |
cloned into here. Just what I'm going to do is start a copy of an old machine as well. 00:26:46.200 |
Because I want to access some files from there. 00:26:57.800 |
Start machine. Okay. So, I guess to start with, we could go into clean, open up mnist_basics. 00:27:26.680 |
So, let's see how much they've got installed, see if it's all ready to go. Let's try running this cell. 00:27:43.960 |
Well, there we go. It looks like it's got everything. Let's try running this cell. 00:27:49.720 |
Nice. Okay. So, it's basically got fastbook installed and sentencepiece installed. 00:27:56.840 |
So, that's good. >> Sorry, Jeremy. Are we checking JupyterLab or are we checking Paperspace? 00:28:07.640 |
>> So, just to remind you, I click on this button and that gives us JupyterLab in Paperspace. 00:28:16.200 |
>> Thank you. Sorry, I missed that. >> No, no problem. It's easy to miss things. Ask anytime. 00:28:23.880 |
So, one thing that is actually I find kind of confusing about JupyterLab is it has its own set 00:28:32.360 |
of tabs in its own interface and it kind of replicates things that could be in a browser. 00:28:37.240 |
So, in a lot of ways, I kind of prefer the old version of Jupyter, Jupyter classic, which you 00:28:42.440 |
can always switch to. But, you know, you can get used to it. And one thing that helps a lot is if 00:28:50.120 |
you just full screen this, right, and kind of know the keyboard shortcuts. So, Ctrl+Shift plus 00:28:57.960 |
left and right square brackets switch between tabs. And that's the main one to know. And Ctrl+B 00:29:04.440 |
turns on and off the sidebar. So, this way, at least, you can, like, get a nice, 00:29:09.640 |
you know, good use of the screen, particularly when I click terminal. So, if I click terminal 00:29:15.000 |
here, that's not bad, right? I've got plenty of room to see my terminal. So, that's nice. Okay. 00:29:22.040 |
So, I don't -- >> Sorry, Jeremy. Just on the bottom there, 00:29:33.880 |
if you want to get rid of those tabs for any reason, there's that little switch that says 00:29:37.480 |
simple. That will hide those tabs. >> Yeah, that actually gets rid of the tabs as well, 00:29:43.480 |
which I'm actually using the tabs. But what you can do is you can go remove status bar. 00:29:48.280 |
It gives you a bit more room as well. So, yeah, now we're actually doing pretty well. 00:29:53.800 |
And one particularly nice thing in Jupyter, by the way, is it actually has a graphical debugger, 00:29:58.440 |
which, you know, so if we go for i in range(10): print(i), and then we turn on the debugger with 00:30:11.560 |
this little button here. So, we can put a breakpoint here on and off by just clicking. 00:30:27.320 |
And so, now, if I run this cell, you'll see that it's got a breakpoint, which is very nice. And 00:30:40.200 |
we can -- got a lot of things in here, doesn't it? 00:31:11.160 |
What? >> That one. Okay. So, you can see, like, here's i. And so, if I now step through this, 00:31:24.760 |
there's a better way to just show what we want. 00:31:31.720 |
Step. Okay. So, it's kind of like -- yeah, it's -- that's kind of a useful thing to have, I think. 00:31:59.400 |
Yeah, I guess this would probably be easier if this is actually probably a really good place to 00:32:07.480 |
not use import star, because I don't see an obvious way to only add 00:32:13.080 |
variables we want to the debugger. So, let's restart the kernel. 00:32:23.800 |
Okay. And then run this cell. There we go. That's much better. So, now, we can just see 00:32:42.760 |
that variable changing. You might be wondering why it is that I clearly am not very competent 00:32:53.320 |
using the graphical debugger. And that's because I don't use it myself, because I actually really 00:33:00.440 |
like the non-graphical debugger, which I'll quickly show you. The non-graphical debugger 00:33:10.280 |
you can use anywhere. It doesn't have to be Jupyter. It can be in a terminal or whatever. 00:33:15.560 |
But inside Jupyter, if you just put percent debug at the top of your cell, it runs the regular Python 00:33:27.480 |
debugger, which is a -- it's a REPL-style debugger. And you can type H for help to find out what you 00:33:36.840 |
can do. And basically, you can type just the first letter of any of these if they're unique by first 00:33:44.760 |
letter. You can see, actually, the ones which have the first letter. So, C is short for continue, H 00:33:51.560 |
is short for help, N is short for next, P is short for print. So, the single-letter ones are 00:33:55.720 |
short for, like, the ones that you use all the time. And I always use the single letters, because, 00:34:00.920 |
you know, why wouldn't you? So, for example, L -- actually, I'm not really in a file, so that 00:34:11.880 |
won't work. So, let's try, for example, we can do N for next, so that just N goes to the next line. 00:34:18.520 |
So, here we are. So, we've now gone into the, you know, the code that we have in our cell. 00:34:25.400 |
So, we should now be able to -- oh, next again. This is really weird. Why is this not -- 00:34:37.880 |
Must be something to do with -- I wonder if this is some weird JupyterLab thing. 00:34:43.320 |
Yeah. Okay. I think what happened was that, because I had used the graphical debugger, 00:34:55.800 |
it broke the normal debugger. Okay. So, let's start again. So, I hit N for next, 00:35:04.200 |
and that's still not really working. Okay. No worries. Let's switch to regular 00:35:51.320 |
Okay. Percent debug, for i in range(10): print(i). 00:36:19.160 |
What if I put this in a function? Oh, okay. I don't know. I pretty much always debug things 00:36:40.920 |
that are in functions. So, that's what's going on. Okay. So, I created a little function. 00:36:47.480 |
I put percent debug. I called the function. And then the first thing I did is I typed S. 00:36:53.640 |
S steps into the current function. So, this is pointing at the thing it's about to run. 00:36:59.480 |
It's about to run the thing called def f. So, we're now inside the definition of f. 00:37:03.880 |
And now it's going to run something, for i in range(10). So, N is next. So, N just advances one 00:37:10.680 |
instruction. So, now that I've done that, i should exist. So, you can print the contents of something 00:37:16.280 |
by pressing P. And then the thing you want to print. So, i is now zero. And so, then I can go next. 00:37:22.200 |
And in fact, you don't even have to type N. If you just hit enter, it redoes the last thing you did. 00:37:27.880 |
So, that just jumps to the next line. And so, I can P i. Okay. Now it's one. And so, you get the idea. 00:37:36.360 |
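The session just described can be replayed as a script. Interactively you would type these commands at the (Pdb) prompt (in a notebook, for instance via the `%%debug` cell magic); feeding them through `stdin` with a `run_scripted` helper is an assumption made here only so the example runs unattended.

```python
import io
import pdb

def f():
    for i in range(3):
        print(i)

def run_scripted(func, commands):
    """Run `func` under pdb, feeding it `commands` instead of keyboard input."""
    typed = io.StringIO("\n".join(commands) + "\n")
    shown = io.StringIO()
    debugger = pdb.Pdb(stdin=typed, stdout=shown, readrc=False)
    debugger.runcall(func)   # stops at the first line inside func
    return shown.getvalue()

# n = next line, p i = print i, an empty line repeats the last command, c = continue
session = run_scripted(f, ["n", "p i", "n", "", "p i", "c"])
print(session)
```

The captured text contains the (Pdb) prompts together with the two values printed by `p i`, first 0 and then 1, matching the walkthrough above.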
So, basically, and then I can type L to list the file that I'm currently at. 00:37:45.480 |
I can also type W to see like what called this, which it was actually called in this case by 00:37:52.120 |
IPython or by Jupyter Notebook. So, this is how I always debug things. And I'm sure at some point, 00:37:58.920 |
we'll actually need to debug something. I thought I'd just quickly show you that. 00:38:03.480 |
Folks here who have used both the graphical and the regular Python debugger, do you have any 00:38:12.600 |
preferences or anybody here that has just used one or the other and likes it, doesn't like it? 00:38:18.440 |
>> I have only used the text debugger. Yeah. I love it. Yeah. It's wonderful. Especially learning 00:38:31.000 |
about, you know, doing the fast.ai course, you can just put set_trace wherever you'd like. 00:38:38.040 |
And you are immediately transported there. So, for instance, when working on a new architecture, 00:38:45.720 |
we're implementing some architecture of, I don't know, my own idea or trying to re-implement 00:38:51.240 |
something. I create my own class and then I can step through the shapes of the tensors. 00:38:58.360 |
It's just super useful. >> Yeah. So, you mentioned set_trace. 00:39:05.240 |
So, pdb stands for the Python debugger. So, set_trace is really useful. It's how you set a 00:39:15.640 |
breakpoint. It might seem like a weird way to set a breakpoint. But basically, if we run this now, 00:39:23.320 |
we don't even have to say percent debug, it jumps into the debugger immediately after that set_trace 00:39:28.680 |
call. So, you can put that not only in your own Python files but in Python files that you've 00:39:36.280 |
installed from pip or condor or whatever and then step through it in the way we just talked about 00:39:41.000 |
and hit N and start running through and check the values of variables, whatever. 00:39:45.320 |
Oh, I didn't say how to exit. To exit, you press Q for quick. 00:39:51.960 |
If you're learning a new library, this is super useful because you just pull the library from GitHub, 00:39:58.200 |
you do a pip editable install and then you literally can step into the code that you're 00:40:06.040 |
reading about. So, like, this is. And also, basically, pretty much every 00:40:13.960 |
major programming language debugger works the same way. So, you can, yeah, if you're doing C 00:40:21.640 |
code, there's a debugger called GDB that works the same way. If you're doing Perl code, 00:40:26.440 |
the Perl debugger works the same way. They'll have the same keyboard shortcuts, the same way of 00:40:30.600 |
working. So, it's skills you can reuse. And that's another thing, like, in general, I, like, 00:40:37.640 |
really try to avoid, you know, unless they're really, really good. But in general, proprietary 00:40:44.840 |
tools, I generally avoid instead try to use tools that I can use everywhere. Because then I don't 00:40:50.920 |
have to learn as many things. I can learn a small number of things and reuse them all the time. 00:40:55.400 |
And particularly these, like, really old tools, like this, the way the Python debugger works 00:41:01.560 |
goes back a long time even before Python existed. These tools have been developed over many years 00:41:07.000 |
to make them really perfect, you know, really to make them work really well by many people. And so, 00:41:13.560 |
they're very nicely optimized once you learn them. And they do take some time to learn. 00:41:19.560 |
But if you're doing these walkthroughs, then you're the kind of person who's prepared to put 00:41:23.880 |
in that time. There's another thing related to what Jeremy just talked about. And those are key 00:41:31.560 |
bindings in things like tmux or even in Jupyter Notebook that we're looking at right now. So, 00:41:39.400 |
my normal intuition and what I would do a couple of years ago when I jumped into something new, 00:41:47.800 |
oh, I would just come up with my own unique key bindings that, hey, they will make life 00:41:53.560 |
comfortable for Radek. They make it, you know, they're ergonomic and they're easy to remember. 00:42:00.200 |
But then as you switch to a new environment, you sort of have to bring the key bindings with you, 00:42:05.640 |
which is a horrible pain. So, just like Jeremy mentioned that he tries to use software that is 00:42:13.320 |
readily available, a way to shoot yourself in the foot would be to come up with your intricate 00:42:19.400 |
key bindings. It's just sometimes very useful to go with the key bindings that are already there. 00:42:25.720 |
And even more importantly, learning to use the keyboard for everything is a good idea. I tend 00:42:37.560 |
to use the mouse a little bit for teaching because I want people to see what I'm doing. 00:42:41.240 |
But in normal life, I hardly ever touch my mouse because I'm just zipping around. 00:42:53.080 |
- Jeremy, just a question, slightly on a different topic, but on the same thing. 00:42:57.800 |
If the library behind this Notebook has changed or get upgraded, how do we get the latest? 00:43:05.000 |
- That's what I'm going to do right now. Okay. So, let's say I want to 00:43:11.320 |
upgrade something or install something in this environment, on this Paperspace 00:43:18.440 |
server. As we discussed, a Paperspace server is not really a server at all. 00:43:22.040 |
And so, if I pip or conda install something, it's actually not going to be here next time I come 00:43:29.320 |
here. So, that's a bit of a bummer. So, how do we fix that? We're actually going to learn a lot 00:43:38.600 |
in order to fix this. The first thing to know is that Paperspace has this idea of 00:43:43.240 |
persistent storage. And specifically, there's a directory called /storage, which contains your 00:43:51.080 |
persistent storage. And so, as you can see, even though I only just created this server 00:43:55.480 |
just now, there's already things in here. And that's because that's my persistent storage. 00:44:04.120 |
So, this is basically a mounted network drive. You can see all of the drives 00:44:10.280 |
and how much room you've got in each one by using df, which is disk free. And then, if you remember, 00:44:17.080 |
-h is for human-readable. It tells you sizes in like gigabytes and megabytes and stuff. 00:44:22.600 |
And so, here's a list of all the disks that Paperspace has provided for me. And so, 00:44:30.520 |
there's one called /, which has got 168 gigabytes available. And here's my storage, 00:44:36.360 |
which has got 496 gigabytes available. So, by default, for free, you get 5 gig. 00:44:42.760 |
And it's still pretty good, right? But for 8 bucks, you get 15 gig, which is a hell of a lot better. 00:44:48.200 |
Not all of these are writable. So, for example, they have actually a /datasets 00:44:52.680 |
thing mounted there for you, which is kind of cool because you can actually start using 00:44:58.840 |
datasets that are used by fast.ai straight away, which is pretty nice. 00:45:09.640 |
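As a rough Python counterpart to `df -h`, here is a small sketch that reports the size and free space of one mounted drive. The helper `human_readable` is made up for illustration, and the example checks `/` rather than `/storage` because `/storage` only exists on Paperspace.

```python
import shutil


def human_readable(num_bytes):
    """Format a byte count with binary units, roughly in the spirit of `df -h`."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


usage = shutil.disk_usage("/")  # on Paperspace you could pass "/storage" instead
print("total:", human_readable(usage.total))
print("used: ", human_readable(usage.used))
print("free: ", human_readable(usage.free))
```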
So, what are we going to do about this, you know, /storage? This is really where we want to install 00:45:17.880 |
pip libraries or conda libraries, too. So, let's -- I'm just trying to think. Anybody think of a pip 00:45:28.520 |
library they want to install? Any favorite ones? >> Use something like autopep8 or 00:45:40.200 |
Jedi or something like that. It doesn't really do much. >> I'm sorry. Maybe we'll just grab 00:45:49.160 |
the latest version of fastcore. So, normally, to install the latest version of something, 00:45:59.480 |
so you can use pip or conda. For this, actually, for installing stuff kind of, like, locally, 00:46:07.560 |
the way we're describing it, it's going to be easier to use pip than conda. So, we use pip. 00:46:12.760 |
In a past lesson, I said, like, avoid pip. I think we're at a point where we can talk about 00:46:22.360 |
where it's okay to use pip. So, pip is a perfectly good way to install things which 00:46:31.960 |
just contain Python code or which are kind of pretty self-contained. You wouldn't normally want 00:46:39.000 |
to pip install PyTorch because it requires, like, CUDA and stuff. And, yeah, pip doesn't really 00:46:46.280 |
have a way of installing those kind of libraries. That's why if you use pip to install PyTorch, 00:46:52.520 |
you have to, like, separately install the software development kit. With conda, you don't have to. 00:46:58.680 |
But for a library like fastcore, and, in fact, honestly, most libraries, you know, 00:47:04.600 |
like, non-GPU kind of libraries, pip's actually fine. And so, normally, to upgrade software with pip, 00:47:12.840 |
you would type pip minus U, and then you type the thing that you want to upgrade. Or if you just 00:47:18.600 |
want to install it, you do it without the minus U. There's an extra flag you can use which is minus 00:47:23.880 |
minus user. And that's going to install it into your home directory. And so, there's lots of 00:47:32.120 |
reasons you would want to do that. For example, if you don't have root access or, like, in our case, 00:47:37.880 |
we don't have the ability to, like, save the stuff in the root directory. So, if I run that -- oh, 00:47:46.280 |
and, of course, I have to say install. Okay. So, it's upgraded it from 1.4.2 to 1.4.3. So, 00:47:59.160 |
let's see if that actually works. >> So, Jeremy, why are you using -- like, is mamba not an option 00:48:06.280 |
for this? >> Yeah. So, this is -- it's not a great option for installing stuff into a user directory. 00:48:12.920 |
At least I'm less familiar with that. This is a way that I know is going to work fine 00:48:18.280 |
for this special situation where we want to put stuff into our home directory. 00:48:25.080 |
So, yeah, mamba and conda are kind of synonyms. Mamba is a faster way to do it. Whereas pip 00:48:34.280 |
is a different thing altogether. And it has this special thing I'm showing you right now, 00:48:39.160 |
which is --user. And if conda or mamba has such a thing, I don't know about it and haven't 00:48:46.040 |
learned how to use it yet. I'm not saying it doesn't exist. But at least for pip, this works 00:48:50.760 |
fine. So, if we now look at fastcore's version, there we go. So, it has, in fact, installed 1.4.3. 00:49:01.640 |
Now, where did it put that? So, here in our home directory, you can see it's actually created 00:49:11.640 |
something called .local. And .local is where pip puts stuff that you install with --user. 00:49:24.680 |
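To see where a `--user` install lands for the interpreter you are running, Python's standard `site` module can report it. This is just a quick check, not something shown in the session; on Linux the user base is normally `~/.local`, and the site-packages path under it includes the Python version number.

```python
import site
import sys

user_base = site.getuserbase()          # typically ~/.local on Linux
user_site = site.getusersitepackages()  # e.g. ~/.local/lib/pythonX.Y/site-packages on Linux

print("interpreter:       ", sys.executable)
print("user base:         ", user_base)
print("user site-packages:", user_site)
print("on sys.path now?   ", user_site in sys.path)
```

Because the version number is part of that path, a different Python version will look in a different user directory.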
And as you can see, it's got various subdirectories. And here is fastcore. So, if we want to be able 00:49:34.920 |
to continue to use the latest version of fastcore next time we start this notebook server, 00:49:43.560 |
we want this .local directory to still be there. Right? So, how do we do that? 00:49:55.560 |
Well, what we can do is we can actually put that into our storage. So, we could move that 00:50:12.600 |
into our storage. Now, okay, that's all very well. But it will now be in storage next time we come 00:50:24.120 |
back. But Python needs it to be here in our home directory. So, what do we do? Well, what we have 00:50:31.960 |
to do is we have to make it so that .local in our home directory and .local in our persistent 00:50:38.920 |
storage are the same thing. And the way we do that is with something Radek was mentioning 00:50:43.800 |
before, which is using a symlink or a symbolic link. If I say ln for link and minus s for symbolic 00:50:50.840 |
and I say what's the thing that you want to symbolically link and I say it's /storage/.local, 00:50:59.240 |
that's the thing I just moved. Then you'll find that in this directory, there is now a .local, 00:51:16.040 |
but it looks special. It looks different. And it's like saying, oh, it's not a folder at all. 00:51:20.280 |
It's actually just pointing at this other place. But it's like it really exists. I can ls it, 00:51:27.560 |
for example. I can cd into it. And remember, the way to say the last token from the previous line, 00:51:34.680 |
if I said this before, is exclamation mark dollar. So, that'll be .local. You can see it does cd .local. 00:51:40.440 |
So, yeah, it's basically like a it's not a copy of it. It's like a shortcut into it. In fact, 00:51:48.920 |
I think on Mac they're called aliases and on Windows they're called shortcuts. It's the same 00:51:54.120 |
thing. And on Unix type things, it's called a symlink or a symbolic link. So, now, if I run 00:52:03.720 |
my Python again and check the version, yep, it's still 1.4.3. So, it's still finding it. 00:52:14.120 |
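The move-then-symlink step can be tried safely in a scratch directory. The sketch below stands in for `mv ~/.local /storage/` followed by `ln -s /storage/.local`, using temporary folders named `home` and `storage` instead of the real ones; the file `marker.txt` is a made-up placeholder for an installed package.

```python
import os
import shutil
import tempfile

# Scratch stand-ins for the home directory and for /storage.
root = tempfile.mkdtemp()
home = os.path.join(root, "home")
storage = os.path.join(root, "storage")
os.makedirs(os.path.join(home, ".local"))
os.makedirs(storage)
with open(os.path.join(home, ".local", "marker.txt"), "w") as f:
    f.write("installed package")

link = os.path.join(home, ".local")
target = os.path.join(storage, ".local")

shutil.move(link, target)  # like: mv ~/.local /storage/
os.symlink(target, link)   # like: ln -s /storage/.local ~/.local

print(os.path.islink(link))   # the home entry is now a link, not a real folder
print(os.readlink(link))      # and it points into the storage directory
with open(os.path.join(link, "marker.txt")) as f:
    print(f.read())           # files are still reachable through the link
```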
So, this way we can actually make sure we've got, you know, that we can install and upgrade packages 00:52:27.320 |
and still see them every time we launch, even if it's a new notebook server or relaunch an 00:52:31.880 |
existing one or whatever. We just have to make sure that every time we start a new Paperspace 00:52:38.840 |
instance that it creates any symlinks we want. And so, Paperspace has this really nifty thing, 00:52:47.400 |
which is you can create a file called .bashrc.local in storage and it will run that file 00:52:57.320 |
every time you start a notebook. And so, you'll see I've got a file there 00:53:05.720 |
that goes through and creates a symlink to .ssh and to .local and to .git-credentials 00:53:13.560 |
and a bunch of stuff that we haven't talked about all of them yet and .kaggle and symlinks them all 00:53:18.440 |
to /storage. And so, this way, every time I start a new computer, I'm going to have 00:53:28.440 |
all that stuff set up automatically, which is, yeah, I think is pretty great. So, that's how you can 00:53:39.880 |
customize your paper space instance. So, Jeremy, just to recap there to make sure I've got that 00:53:49.880 |
clear in my head and for everyone else too. So, essentially, what you've done is that you've got 00:53:53.880 |
this bash script that you keep inside your persistent storage, which contains all your 00:53:57.160 |
config and bits and pieces that you want. And then, every time you fire up a new instance, 00:54:03.320 |
you're just symlinking all that stuff that you've got in storage to the machine you've just created. 00:54:07.560 |
Yeah. And in particular, after I type pip install minus minus user something, 00:54:20.040 |
it's created this .local directory and that's something that I want to be persistent. So, 00:54:24.920 |
I move that into storage and then symlink it back to where it's meant to be. 00:54:31.320 |
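The startup script itself is not shown in the session, so the following is only a sketch of the idea: loop over a list of names and link each one from storage back into the home directory, in a way that is safe to run on every start. The function name `link_into_home` and the choice to skip anything that already exists as a real file are assumptions for illustration.

```python
import os
import tempfile


def link_into_home(storage_dir, home_dir, names):
    """Symlink home_dir/name -> storage_dir/name for each name present in storage."""
    linked = []
    for name in names:
        target = os.path.join(storage_dir, name)
        link = os.path.join(home_dir, name)
        if not os.path.exists(target):
            continue  # nothing saved under this name yet
        if os.path.islink(link):
            os.remove(link)  # replace an old link so reruns are harmless
        elif os.path.exists(link):
            continue  # a real file or folder is in the way; leave it alone
        os.symlink(target, link)
        linked.append(name)
    return linked


# Try it in scratch directories rather than the real home and /storage.
root = tempfile.mkdtemp()
storage_dir = os.path.join(root, "storage")
home_dir = os.path.join(root, "home")
os.makedirs(os.path.join(storage_dir, ".ssh"))
os.makedirs(os.path.join(storage_dir, ".local"))
os.makedirs(home_dir)

first_run = link_into_home(storage_dir, home_dir, [".ssh", ".local", ".kaggle"])
second_run = link_into_home(storage_dir, home_dir, [".ssh", ".local", ".kaggle"])
print(first_run, second_run)
```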
And the reason that you're doing this is because you can't get access to the 00:54:38.440 |
root directory on their server. Like, would you need to do this on your own computer as well? 00:54:44.360 |
No, this is just for Paperspace. It's not that I can't access it, Mark. I can. I can install it. 00:54:49.480 |
But the problem is, these are not real servers. That's not persisted. So, if I went in five 00:54:57.080 |
hours' time when this shuts itself down and then I start up the server again, it's not there. 00:55:01.800 |
It's a mock server. It looks like it's your own server, but it doesn't actually 00:55:09.000 |
keep your changes unless they're in persistent storage. This is necessary only on virtual machines, 00:55:14.920 |
but on your own computer, you wouldn't need to do that. This is like just this one. This is just 00:55:20.440 |
Paperspace. This is just for Paperspace. And we're spending time talking about Paperspace 00:55:25.880 |
because it's so much better than any other option out there for GPU servers. They're the only ones 00:55:31.960 |
that have these nifty tricks. Yeah, on your own computer, you don't have to worry about any of 00:55:38.520 |
this stuff. And if you've got your own GPU, you certainly don't have to worry about it. 00:55:43.000 |
But, you know, there are other notebook servers like Google Colab or whatever, 00:55:47.800 |
but they don't have anything like this. So, on Google Colab, you're going to have to like 00:55:52.200 |
reinstall everything you need every time you start up a new notebook and, you know, 00:55:57.720 |
you won't have any of this proper environment. So, yeah, as you might have seen, even my SSH keys 00:56:05.480 |
are symlinked here. So, I'm always going to have my SSH keys any time I create a new Paperspace 00:56:13.240 |
instance. So, yeah, this is like a super convenient way to have a free GPU server whenever you want it, 00:56:24.440 |
which I think is pretty amazing. Jeremy, a question. I followed what you did in terms 00:56:31.320 |
of pip installing fastcore. But then when I use Python and try to import fastcore, 00:56:39.080 |
it throws an error. But when I do IPython and import fastcore, it can find it. Does it? 00:56:45.000 |
That's interesting. Do you want to share your screen and we could try to debug it there? 00:56:58.200 |
I might have to stop sharing first. Let's see. Okay. I'll stop sharing. 00:57:28.120 |
Okay. So, let's have a look. So, this is on Paperspace and you went pip install. 00:57:43.080 |
Good. And you went Python. Interesting. Okay. So, great. So, press control D to exit from IPython. 00:57:57.080 |
And you can press it again or hit enter. You didn't actually have to press Y. See how it's 00:58:03.800 |
in square brackets. That means it's the default. So, you can just do that. Okay. So, let's find out 00:58:08.040 |
what's going on. So, type which python. Okay. So, okay. And then type which ipython. 00:58:31.240 |
I've got a strong suspicion. Try typing python3 instead of python. Just type python3, all one word. 00:58:38.200 |
Not which python. Okay. Now, try importing fastcore. 00:58:46.440 |
Interesting. Let's see if I have the same problem on mine. So, Python import fastcore. 00:58:59.320 |
Oh, I'm getting the same error on mine. Very interesting. 00:59:02.360 |
Okay. I'm going to share my screen again. Very well spotted. 00:59:16.200 |
So, this is exactly the kind of bug that I want us to have so we can learn how to hopefully fix it. 00:59:30.520 |
I wonder if -- because I hardly ever just run Python. And I've only recently started using 00:59:41.400 |
pip install --user because it's -- because of this Paperspace thing. 00:59:50.360 |
So, I wonder if it's something specific to pip install --user. So, let's see if we can debug this. 01:00:03.320 |
Actually, what's interesting is no module named fastcore is actually very interesting because 01:00:20.920 |
that means it also doesn't have fastai. Which -- yeah. Okay. So, the way 01:00:32.600 |
Python finds modules is a very similar idea of how bash finds executables. There's a path, 01:00:42.600 |
basically. And so, in Python, there's a module called sys which is where all kinds of things 01:00:52.440 |
are stored. And so, if we go sys. -- there's a sys.path. Now, this is not the -- this is not the 01:01:14.280 |
bash path environment variable. This is a totally separate thing with a similar name, 01:01:20.520 |
which is an all lower case path, sys.path in Python. This is a list of places that Python 01:01:25.480 |
will search for Python libraries. Now, if I import fastcore, 01:01:37.320 |
and you can see it's getting it from /opt/conda/lib/python3.7/site-packages. And you can see that is 01:01:48.520 |
in my sys.path. So, that's how it's finding fastcore. So, why isn't Python finding it? 01:01:56.680 |
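When an import works in one interpreter and fails in another, it helps to print which interpreter is actually running, what is on its `sys.path`, and where a given module would be loaded from. The helper `where_is` below is a made-up name for illustration; running the same snippet under both `python` and `ipython` makes the difference visible.

```python
import importlib.util
import shutil
import sys

print("running:", sys.executable, "version", sys.version_info[:2])
print("`python` found on the shell PATH:", shutil.which("python"))

print("sys.path entries:")
for entry in sys.path:
    print("  ", entry)


def where_is(module_name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


print("json     ->", where_is("json"))
print("fastcore ->", where_is("fastcore"))  # None if it is not installed for this interpreter
```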
Sys.path. So, that's interesting. So, Python here is not including 01:02:16.040 |
site-packages. Whereas, IPython is. So, there's something, I guess, about how 01:02:24.360 |
Paperspace have installed things. Because I'm pretty sure that's not what happens here. Let's 01:02:34.600 |
try it. Python. Import sys.path. Yeah. So, here's site-packages. 01:02:45.400 |
So, let's see what happens if we -- site-packages. So, this is like the normal place that pip and 01:02:58.040 |
conda install things is into the site-packages directory. And yeah, I've never really looked 01:03:06.040 |
into it because I've never had problems accessing it before. Oh, something to do with Debian puts 01:03:21.960 |
things in dist-packages. That's interesting. Site-packages, not in path. 01:04:19.720 |
>> Jeremy, why is this talking to me? Hang on, Jake. 01:04:31.640 |
>> Just when you were looking at those two paths, one was 3.7 and one was 3.9. 01:04:38.360 |
>> Oh, I didn't even notice that. Is that true? You mean here? 3.9? 01:04:50.000 |
>> Oh, yeah. And 3.7. There you go. You're quite right. Thank you. Okay. So, that'll be the reason. 01:04:56.520 |
Which Python? Which IPython? Yes. Okay. Yeah. All right. So, it wasn't just a case of typing 01:05:09.160 |
Python 3. It was a case of typing Python 3.9. There we go. Oh, still not there. 01:05:18.280 |
Oh, it's 3.7 that IPython is using. Python 3.7. 01:05:30.040 |
I don't know why they've got so many Pythons installed. It seems a bit like overkill. 01:05:37.160 |
>> So, the Python 3.9 here was the system Python, right? And the Python 3.9 was the point? 01:05:46.120 |
>> I mean, because we're on paper space, I think they were all... 01:05:52.840 |
Which Python? Which Python 3? They're actually all the ones in Conda. So, it's... 01:06:05.880 |
So, Paperspace has installed Conda as root. And so, none of these are the system Python, 01:06:11.800 |
actually. Yeah, Paperspace is a bit unusual in that they have us run as root. 01:06:19.800 |
So, things are a little bit confusing, actually. 01:06:27.880 |
Yeah, now, as to why IPython is running 3.7, I'm actually not sure. 01:06:45.160 |
So, something else that I do is I create a Git directory. And then I Git clone things into it 01:07:03.560 |
using my SSH keys. And then what I do is I move the Git directory into /storage 01:07:13.240 |
and then symlink it back. And actually, where I symlink it to, I don't actually symlink it to my 01:07:19.800 |
home folder. I actually symlink it inside /notebooks. And the reason for that is that 01:07:31.800 |
that's what Paperspace uses as the root of its JupyterLab. 01:07:42.120 |
So, actually, you can see here I've done it before because it's in /storage, right? So, you can see 01:07:48.200 |
here's my Git stuff. And so, I actually think, you know, I don't really want any of this 01:07:59.640 |
stuff that they've put in here for me. So, actually, maybe I should try 01:08:04.360 |
deleting. In fact, let's try that. What happens if we... 01:08:08.840 |
...create a server and we make that Git repo thing empty because that's really what I want. 01:08:23.480 |
So, you've uploaded your SSH keys into Paperspace, right? 01:08:32.440 |
Yeah, I've uploaded them. And I've put them in /storage. 01:08:35.640 |
And in my /storage/.bashrc.local, I symlink them into my home directory. Correct. 01:08:47.080 |
Yeah. I mean, if you were paranoid about such things, then create a separate SSH key pair 01:08:55.160 |
just for this and put that in your GitHub. So then, people... If somebody steals your SSH private key, 01:09:03.960 |
the worst thing they could do is to get into your GitHub. 01:09:06.120 |
That's so cool. I didn't think about that. Wonderful. We'll do that. Thank you. 01:09:20.440 |
That's a bit overkill for notebooks at the moment. Let's delete some of these. 01:09:34.440 |
So, yeah, for me on Paperspace, you know, everything's kind of going into that /storage. 01:09:43.000 |
So, I don't really care about deleting things. 01:09:51.320 |
Will it let me delete this? Because that's really what I want to do. 01:10:03.880 |
So, I press delete. It's still showing me this. I don't know if that's a default or if it's just an example. 01:10:20.680 |
Well, I'm here. So, I just want to mention maybe... Sorry. Maybe I'm the only one. 01:10:27.080 |
I understand in principle what you're talking about with the SSH keys and 01:10:30.760 |
importing them and everything, but the details of the execution, if I'm the only one that's fine, 01:10:35.480 |
I'll struggle with it. Let's do that. I don't know that I could actually do it. 01:10:38.840 |
Yeah, let's do it. That's excellent. Thank you. One thing I just want to do for my own interest 01:10:45.960 |
is I'm just going to jump onto YouTube and see if anybody actually watches these live streams, 01:10:52.600 |
because if they don't, I won't waste my time running them for people watching them. 01:11:07.080 |
Yeah, not sure it's worth it. Might just use Zoom in the future. 01:11:10.920 |
Did you know your hand's up, Radek, by the way? Yeah. 01:11:16.680 |
You don't have to put your hand up. You can just talk. 01:11:22.120 |
Okay. Okay. You know, some libraries, the more exotic ones, like I'm not sure maybe graphs this, 01:11:31.560 |
or they require you to install something via updates to some library. 01:11:39.160 |
Oh, yeah. Let's talk about that as well. Great. Okay. So, this thing has 01:11:45.080 |
successfully started the machine. Let's see if there's anything in it. 01:11:50.680 |
So, I was just starting the machine when I deleted the Git repo thing. Yeah, okay, great. So, 01:12:00.360 |
this is actually just empty. This is actually probably what I would be more inclined to do, 01:12:05.880 |
although I expected to see my /git there. Oh, wait. 01:12:20.040 |
Okay. All right. Here's an interesting problem. That .bashrc.local file, it runs when you run a 01:12:30.760 |
terminal. So, my git folder didn't appear until I actually opened a terminal. As soon as I did that, 01:12:38.680 |
it appears. And I probably hadn't noticed that before because I always run a terminal 01:12:46.440 |
as soon as I start pretty much. There is a way actually that what they actually run when you 01:12:56.040 |
start a notebook, when you start a server, is it actually runs this file, run.sh, which we can't 01:13:02.360 |
change. But it does actually have a prerun.sh file, which is if you put stuff in /storage/prerun.sh, 01:13:13.720 |
it will run before Jupyter starts, which maybe is actually a better place for all the stuff I'm 01:13:23.480 |
doing. Maybe that's what we should use instead of .bashrc.local, because this only runs when you run 01:13:29.640 |
a terminal. Yes. Interesting. Let's try that. Actually, I'd forgotten. It looks like I have got 01:13:47.080 |
local member stuff working as well. Maybe we can try that next time. So, by the way, to look at the 01:13:57.000 |
end of a file, you can just type tail. So, if I go tail /run.sh, there it is. So, if I move 01:14:07.960 |
.bashrc.local to pre-run.sh. All right. Let's try that. 01:14:33.640 |
So, if we now create a new notebook, if you're wondering why it is, by the way, that Paperspace 01:14:41.080 |
is so perfectly set up for everything to work really well, it's because I've basically been 01:14:45.400 |
nagging the folks at Paperspace for the last four years about all these things. And actually, 01:14:51.160 |
it's just really in the last three months that they actually really started listening. And I 01:14:56.040 |
told them, "Put this here. Put this here. Then it's going to be great." So, yeah, they've been 01:15:01.160 |
really great, particularly recently, at setting everything up exactly the way we need it. 01:15:07.400 |
Okay. So, delete that. And so, I think, yeah, see, here's that command, /run.sh. 01:15:19.640 |
So, I guess what you could do, by the way, is you could, like, put some different, 01:15:24.040 |
like, your own URL here. And it's going to, like, automatically put that in /notebooks. 01:15:29.960 |
And maybe you could even put a shell script then that comes from GitHub. I haven't really 01:15:34.600 |
thought about that. Anyway. Okay. So, I think it was Mark that was asking, "How would I actually 01:15:42.120 |
get my .ssh keys onto this machine?" I think the easy way to do it would be to use the upload 01:15:58.600 |
file feature in JupyterLab. This is a really handy feature to know about. So, you see this 01:16:03.800 |
little button here, upload files. So, you could click that, and then you could go into your .ssh 01:16:14.680 |
folder and find the files you want to upload and upload them. So, for example, I do config. 01:16:28.760 |
And you can see here it appears. And so, then if I open my terminal, 01:16:33.320 |
there it is there, right? And then you could just move that to where you need it. 01:16:43.320 |
One tip with SSH keys, actually. In fact, let's do it from scratch. 01:16:51.320 |
Because that's what I'm meant to be doing. Let's do it from scratch to make sure everything works. 01:16:58.280 |
So, I'm going to rm .ssh. Okay. So, let's do it from scratch. 01:17:05.720 |
SSH keys actually have to have very exact permissions on them. If it's possible for 01:17:13.080 |
anybody else to read or write your SSH keys, ssh will refuse to use them. 01:17:17.400 |
And so, one way to actually see the correct permissions is to create some SSH keys. 01:17:26.040 |
So, I could go ssh-keygen. Enter, enter, enter. And then I can go ls -la .ssh. 01:17:37.160 |
And so, to remind you, we just briefly saw this the other day, the permissions. 01:17:42.840 |
These three here tell you this user, which is root, can they read, write, and execute the file. 01:17:52.040 |
So, this user, so the root can read, write, and execute the, this is the private key file. 01:17:56.920 |
And it can also read, write, and execute the public, sorry, read and write the public key file. 01:18:02.760 |
These three here is, what about everybody else? And this says everybody can read the public 01:18:11.160 |
key file, but they can't do anything to the private key file. And then . refers to the 01:18:17.320 |
current directory. So, the directory itself, only the root user can read, write, and execute 01:18:24.280 |
the directory. The idea of executing a directory might sound weird. It actually refers to being 01:18:30.120 |
able to enter a directory and access what is in it. They call that executing a directory. So, let's upload my keys. 01:18:42.840 |
Okay. So, there they are. Now, they're going to be put into /notebooks/git. 01:18:47.960 |
So, if I go cd .ssh, and then I'll mv /notebooks/git/id_rsa. Now, if I hit tab again, 01:19:00.440 |
it'll show me that there's multiple things starting with those letters. If I say star, 01:19:06.200 |
that refers to everything starting with those letters. So, I'm going to move all of those things 01:19:10.600 |
into the current directory. So, the current directory, remember, is a dot. So, ./. And so, 01:19:17.720 |
there they now are. And now, they don't have the right permissions anymore. 01:19:24.360 |
My private key is readable by everybody, which is no good. 01:19:29.400 |
So, to change permissions, we say chmod change. I don't know why it's called mod, 01:19:36.440 |
rather than chperm or something. And we can say that the group and the user should not have read 01:19:47.000 |
permissions. So, the user and the group subtract read permissions on the private key. And then, 01:19:56.440 |
check again. Oh, I shouldn't have said user and group. What I meant to say, it just removed 01:20:09.240 |
permissions for myself to read it. I should have said group and everybody, which I think is all. 01:20:25.400 |
So, Jeremy, the first three dashes are for user. The next three dashes are for group. 01:20:31.880 |
The first dash is for directory or not directory. But the next three dashes, yeah, go on. The next 01:20:37.000 |
three dashes are for user. The next three dashes are for group. And the last three dashes are for 01:20:44.680 |
everyone. That's correct. Okay. That's what we want. So, now, the user can read and write the private 01:20:53.400 |
key. And everybody, the user can read and write the public key. And everybody can read the public key. 01:21:00.280 |
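The permission change can be reproduced on a throwaway file. This sketch creates a fake key file in a temporary directory (it is not a real key), gives it the too-open permissions seen after the upload, then removes all group and other access, which is the effect of `chmod go-rwx` on that file.

```python
import os
import stat
import tempfile

key_dir = tempfile.mkdtemp()
private_key = os.path.join(key_dir, "id_rsa")
with open(private_key, "w") as f:
    f.write("not a real key\n")

# Start from the too-open state: readable by group and everybody else.
os.chmod(private_key, 0o644)
print(stat.filemode(os.stat(private_key).st_mode))  # -rw-r--r-- on Linux

# Remove read/write/execute for group and others, keep the user's bits.
mode = stat.S_IMODE(os.stat(private_key).st_mode)
os.chmod(private_key, mode & ~(stat.S_IRWXG | stat.S_IRWXO))
print(stat.filemode(os.stat(private_key).st_mode))  # -rw------- on Linux
```

The ten characters printed match the `ls -la` columns discussed above: one for the file type, then three each for user, group and everyone else.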
So, we can test this by SSHing to GitHub.com. And GitHub.com expects you to log in with a username 01:21:10.280 |
git. So, when you SSH before the @ sign, you say the username to log in as. And by default, it 01:21:16.920 |
uses your current username, just root. I definitely can't log in to GitHub.com as root. GitHub.com. 01:21:25.640 |
Yes. Great. Hello, jph. So, it knows who I am, right? Because it knows who has my public key 01:21:37.160 |
in that account. You've successfully authenticated. And then it closes it. Because you can't actually 01:21:42.680 |
use a terminal on GitHub.com. It's only used for Git. But you can see my key is working. 01:21:47.240 |
Wouldn't it be simpler or am I missing something to generate a new key 01:21:53.960 |
in Paperspace rather than import it and then just give GitHub that new key? 01:21:57.800 |
Maybe. I don't know. I'm just thinking with all these changing of permissions and stuff. 01:22:08.680 |
I'm going to say, like, okay. So, obviously, I don't think so because I don't do it that way. 01:22:20.360 |
But if I think about why I don't do it that way, like, some people do it your way. 01:22:24.600 |
Your way is in many ways more correct in that you would have different public keys on GitHub.com 01:22:36.600 |
for every machine you're using. And if somebody, like, stole a machine, you could delete 01:22:44.280 |
just that public key. And that person now couldn't log in. But you could still log in. 01:22:49.960 |
And maybe that's more convenient or something. It's a perfectly fine way to do it, Mark, honestly. 01:22:55.960 |
I don't like the mental overhead of having to think about having multiple keys and which is 01:23:03.000 |
which. I've had a GitHub account for quite a long time and probably used, I don't know, 01:23:08.600 |
maybe 100 different machines to access it. And I don't like the idea of having 100 public keys 01:23:12.840 |
and thinking where are they and should they still be there. But, yeah, I think it's fine. 01:23:17.160 |
All right. So, that was actually pretty intense today. 01:23:25.960 |
So, for folks who, you know, are just getting started, there was nothing we used today I don't 01:23:32.600 |
think that we haven't learned how to use before. But it's tough using things that you've only just 01:23:41.000 |
learned about. And so, therefore, you know, it does need a lot of practice. So, if you're kind 01:23:47.080 |
of new to this, then, yeah, then, like, you probably want to rewatch the video and, like, 01:23:53.000 |
also pepper me with questions next time. If you try things and it doesn't work. 01:23:59.720 |
Or you're not sure why we do it or whatever. All right. Anything else before we? Yeah. 01:24:06.680 |
Yeah. What about these things you have to start? Oh, yeah. Okay. Let's do that next time. 01:24:17.960 |
Yeah. Let's do that next time. I will put it on the forum. Thanks, so nice to see you all.