
Live coding 3


Chapters

0:00 Catch-up questions from last session
6:11 `settings.ini` and fastbook setup (more advanced)
8:19 The `$PATH` environment variable
12:22 Creating and using a conda environment
18:27 Creating a Paperspace notebook
33:12 The python debugger
43:08 Installing pip packages into your home directory
49:21 Persistent storage, mounted drives, and symlinks
56:27 Paperspace has different Python environments by default
69:34 Creating a Paperspace notebook with everything set up automatically
76:35 Copying SSH keys to Paperspace to communicate with github

Transcript

All right. Does anybody have anything they wanted to ask about or talk about before we start? >> I had a quick question. Yeah, it's quick. When we were talking yesterday and I asked you about the environments, you seemed to feel very strongly that you should work in the base environment, and I've been rolling it over in my head.

And when I think about the mistakes that I've made and how I've screwed up environments and gotten conflicts and stuff like that, I was wondering why you feel so strongly about that. >> Sure. I mean, we'll talk about it more when we get to environments, because we haven't discussed them yet, but, you know, briefly, environments are basically separate folders with separate installations of Python and Python libraries and so forth.

They're often used for kind of keeping separate projects separated with different sets of kind of dependencies or versions of Python or whatever. And they certainly have a role to play for advanced users. I almost never use them. I mean, very, very occasionally use them. But my feeling is the most important thing is to be able to rapidly iterate and experiment.

And I kind of want my projects to live together as a whole, as a bunch of things which all help each other and come together. So I don't like the idea of like, oh, I'm working on this project now. I go over there and everything's kind of new, you know.

So, instead, I really like to get very fast and very good at just quickly going `rm -rf mambaforge` and it's gone, then running `setup-conda.sh` and it's back. If I need one, I'll have a single script, but I don't even need a script: I'll just go `mamba install -c fastchan fastbook`, and that installs everything that I need, and off I go.

So I kind of never want to be in a situation where something on my computer is working, but I don't know how I got to the point that it's working, and I don't want to touch anything lest I undo that, you know. So I'm more on the kind of chaos monkey side: explode things from time to time intentionally and be really good at putting them back to where they were, I guess.

Nowadays, I never have problems, basically, with dependencies or weird things going on in Python or whatever because I just can type, you know, like probably every few weeks, I'll just throw it away and install it just to try something out for teaching or whatever. I always feel fine. This morning, I did use an environment because I specifically wanted to test something on a different version of Python and I wanted to check that it would install into somebody's fresh new environment and so I used it for that.

I think it's useful if you are like installing some library where they've done a poor job of keeping their dependencies up to date. So you need like Python 3.6 and sentence p1.8 and I don't know, old versions of things in which case you want to be able to go and get all these exact versions of things.

But my approach, for my projects, is to not pin versions, not pin dependencies. I want anybody to be able to install my work into whatever they're doing and have it work with all the other programs that they're running and libraries that they're using without anything getting messed up. Unfortunately, not everybody works that way, but that's how I, you know, try to make other people's lives easier. So any programs you use from me, you'll be able to install into your base environment without messing anything up, or install into any environment without messing things up.

>> And when -- sorry, just a quick follow-up. If you're installing onto, like, a new computer or whatever, would you install fastbook or would you install fastai? >> It depends. I would just probably install fastbook because fastbook installs fastai, which installs NumPy, pandas, Matplotlib. It also installs transformers, datasets, sentencepiece.

I think everything except sentencepiece. There's no reason it shouldn't install sentencepiece. >> It didn't yesterday. >> Yeah. I was just remembering that. >> So, Jeremy? >> Yeah. >> So, if you're blowing it away and you're basically treating it like a new OS install, as people do with environments, how are you keeping track of things like your RSA keys, et cetera?

How are you not blowing those away? >> Those are not part of a conda environment. So, that's fine. They sit there in my home directory. It's just that mambaforge directory or anaconda directory, depending on what you're using. I just delete that. >> Cool. >> Jeremy, you talked about always being able to uninstall this environment.

Yesterday I was trying to -- I messed up one of the dependencies. What are the steps for uninstalling usually? >> Go to your home directory and type `rm -rf mambaforge`. >> Only that is required. Okay. I tried that. >> And then close your terminal and reopen it. I remember the other day, Archie didn't do that step and so it didn't install properly.

I'll show you a quick trick. Yeah. Sorry, this is a little more advanced than normal, but that's okay. So, this is slightly confusing, but the fastbook PyPI and conda installer actually comes from a repo called course20. And it's here. It doesn't really contain any code. It contains a few utils, like `search_images_bing`, but that's got nothing to do with what we're using it for, or something that returns an image of a cat.

But actually, the key thing is it's got a `settings.ini` file which contains a list of requirements. And so, if I now put this up on PyPI and conda, then if I say `conda install` or `pip install fastbook`, that's one quick way of just getting all these.

Or you could create a tiny little script that goes, you know, `conda install` and then with these things in it. But, yeah, you want some way to get yourself into the basic setup instantly. Oh, here we are. This is why it didn't give me sentencepiece. Sentencepiece only comes with pip.

And that's because when I set this up, I didn't have fastchan. And so I didn't have sentencepiece in conda. >> Jeremy, I think you've got minimum Python there, 3.6, but I think the fastai repo has 3.7 in it. >> Yeah, yeah, that's right. Which I suspect it probably overrides.
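As a rough illustration of how a requirements list in a `settings.ini` file can drive an install, here is a minimal Python sketch. The key names (`requirements`, `pip_requirements`) and the package lists are assumptions made for this example, not the actual contents of the repo shown in the session.

```python
import configparser

# Hypothetical settings.ini contents; the keys and package lists are
# illustrative assumptions, not copied from the real repo.
SETTINGS = """
[DEFAULT]
lib_name = fastbook
min_python = 3.7
requirements = fastai pandas matplotlib
pip_requirements = sentencepiece
"""

config = configparser.ConfigParser()
config.read_string(SETTINGS)
defaults = config["DEFAULT"]

# Packages both installers can provide, and those only pip can provide.
shared = defaults["requirements"].split()
pip_only = defaults["pip_requirements"].split()

# Build (but do not run) the commands each installer would need.
conda_command = ["conda", "install", "-y", *shared]
pip_command = ["pip", "install", *shared, *pip_only]

print(" ".join(conda_command))
print(" ".join(pip_command))
```

The point of the split is the one made above: a package listed only for pip never reaches the conda install, which is how one installer can end up missing a dependency the other provides.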

But, yeah, so here's a good use of the GitHub GUI, right? I want to just change it while I'm looking at it. And we're done. Cool. Yeah. Okay. So, that's a good question. And, you know, we'll -- at some point, I'm sure we'll need to create an environment for something.

And we'll talk more about that. I guess, like, maybe something else just -- well, as I said, this is a bit more advanced. And people can totally skip this. But just -- I mean, it's probably worth understanding what conda/mamba is and how it works, right? So, you know, remember the other day, I typed `which python`.

And I saw that I'm getting -- that Python is coming from this directory. So, like, one obvious question is, well, how does -- why is it coming from this directory? And the reason why it's coming from this directory is -- let me just open up this a bit more so I can see more people.

There we go. Is that Linux -- I mean, I shouldn't say Linux, you know, Bash and pretty much all shells. They use the concept of something called the path. And the path is the list of places to look for programs to run. And the path lives in something called an environment variable.

And an environment variable is just like a Python variable, but it's a variable that lives in your shell. And you can -- you can print them out. So, instead of print in your shell, you type echo. And then an environment variable -- normally if I just say echo something, it just prints it, right?

So, if I want to echo the contents of an environment variable, I have to put a dollar sign before it. A dollar sign means this is a variable that I'm printing. And so, the variable `PATH` -- there it is, right? And so, you can see that this is a string -- a colon-separated string.

And in my colon-separated string, there's something which is `/home/jhoward/mambaforge/bin`. And so, that directory, if we take a look at it, contains lots of programs. And one of those programs is `python`. Okay? So, that's why it is when I type Python that -- oops, I didn't mean to do that.

When I type `python`, that's the Python that it runs. So, here's a little trick. I want to type `which python`. And I'm so lazy, I can't even be bothered typing `python`. So, if you remember, double exclamation mark means the previous command. So, that's going to be `which python`. So, it's worth looking and seeing, like, well, what is this mambaforge directory?
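The same lookup can be reproduced from Python. This is a small sketch using only the standard library: it splits `PATH` into its directories and asks `shutil.which` to do what the shell's `which` does. The directories printed will of course differ from machine to machine.

```python
import os
import shutil

# PATH is one string; os.pathsep is ":" on Linux and macOS.
path_entries = os.environ.get("PATH", "").split(os.pathsep)

# Earlier entries are searched first.
for position, directory in enumerate(path_entries):
    print(position, directory)

# shutil.which walks those directories in order, like the shell's `which`.
python_location = shutil.which("python") or shutil.which("python3")
print("python resolves to:", python_location)
```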

So, the mambaforge directory, for those of you that have kind of seen UNIX-type directories before, contains a `bin` directory and an `etc` directory and a `lib` directory. And this basically looks very similar to my root directory. And so, basically, you know, a conda or mambaforge or whatever root directory is kind of a copy of a Linux or even actually a Mac root directory.

It contains very similar things: `etc`, `usr`, and so forth. And what happens is that the thing that it puts into our `.bashrc`, the script that automatically gets run, this thing here, runs a little shell script that sets some environment variables. And one of the environment variables it sets, for example, is the `PATH` environment variable.

And it adds this to path. And it does something similar to kind of make all the libraries work as well. And so, we mentioned how you can create a totally separate, you know, environment, a totally separate place you can work that has its own copy of Python and libraries and stuff.
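To see why putting a directory at the front of `PATH` changes which program runs, here is a self-contained sketch. It builds two throwaway directories that each contain a fake program with the same name (`demo-tool`, a made-up name), then looks the name up with and without the second directory prepended. It assumes a POSIX system, where any file with the execute bit set counts as a program.

```python
import os
import shutil
import stat
import tempfile


def make_fake_program(directory, name):
    """Create a tiny executable file called `name` inside `directory`."""
    program = os.path.join(directory, name)
    with open(program, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(program, os.stat(program).st_mode | stat.S_IXUSR)
    return program


with tempfile.TemporaryDirectory() as system_bin, \
        tempfile.TemporaryDirectory() as env_bin:
    system_tool = make_fake_program(system_bin, "demo-tool")
    env_tool = make_fake_program(env_bin, "demo-tool")

    # Before: only the "system" directory is on the search path.
    original_path = system_bin
    # After: the "environment" directory is prepended, so it is searched first.
    activated_path = env_bin + os.pathsep + original_path

    found_before = shutil.which("demo-tool", path=original_path)
    found_after = shutil.which("demo-tool", path=activated_path)

    before_is_system = found_before == system_tool
    after_is_env = found_after == env_tool

print("before prepending, system copy wins:", before_is_system)
print("after prepending, environment copy wins:", after_is_env)
```

Nothing about the original program changes; it is simply shadowed by the copy found earlier in the search order.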

The way you do that is you go `mamba create -n`, give it a name, and then say what you want to have in it. So, let's say, OK, I want to have Python in it. I don't normally like to have the latest Python. So, let's get something before 3.10. And I also want fastcore in it.

So, that's going to create it. You can go `mamba create` or `conda create`. I actually already have that because I used it this morning, as I mentioned. So, it'll remove that automatically and create a new one. And so, that's going to set up a new environment, which we will take a look at.

So, currently, what it's doing is downloading from the internet a list of all of the conda packages that are available from a channel called conda-forge, which is the main channel that mambaforge uses. And it says, OK, I'm going to install Python and fastcore. To install those things, I'm going to need these other things as well.

That sounds fine. You'll see it's cached. So, basically, one of the nice things about mamba and conda is that they kind of save the archives that you've downloaded, so they don't have to redownload them. So, as it now says, you can activate this environment by typing `mamba activate temp` or `conda activate temp`. So, that's changed my shell.

If I now say `which python`, it's getting it from a new place. It's still under the same place as before, `/home/jhoward/mambaforge`. But it's now getting it from `envs/temp`. And that's because this mambaforge directory has a directory called `envs`. And that `envs` directory is a folder that contains each of those environments.

And it's really interesting to see what's in them. Because, look, it's yet another copy of the kind of things you would see in the root of a Linux installation. So, that's why it works, right? It's basically yet another copy. So, for example, we'll see that in `envs/temp/bin`, here's another copy of Python.

So, if I type Python, it's running that Python. And it's got its own set of libraries. So, it's using those libraries. So, it's -- yeah, it's really neat. And you can install compilers. You can install, you know, any binaries you like. You can install Rust, you know, a separate copy of Jupyter, whatever.
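A running interpreter can report which of these directory trees it belongs to. The sketch below prints the interpreter path and its installation root, then looks for an `envs` folder underneath that root. The `envs` location is an assumption based on the conda-style layout described above; if you run this from inside a named environment or a non-conda Python, the list will simply be empty.

```python
import os
import sys

# The binary that is actually running, and the root of its installation.
print("interpreter:", sys.executable)
print("environment root:", sys.prefix)

# Assumption: a conda-style layout, where named environments live in an
# `envs` folder inside the base installation.
envs_dir = os.path.join(sys.prefix, "envs")
env_names = sorted(os.listdir(envs_dir)) if os.path.isdir(envs_dir) else []
print("named environments found:", env_names)
```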

By the way, something that's quite neat, not as important as it used to be, but these are actually using something called hard links to create these. So, they're actually not even separate copies. So, it's like not even using disk space. So, yeah, the whole thing is really quite nifty.
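For anyone who has not met hard links before, here is a minimal sketch of the idea using the standard library. It creates one file, hard-links a second name to it, and checks that both names refer to the same underlying data. The file names are arbitrary, and it assumes a filesystem that supports hard links.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "original.txt")
    linked = os.path.join(folder, "linked.txt")

    with open(original, "w") as handle:
        handle.write("shared contents")

    # A hard link is a second directory entry for the same data on disk.
    os.link(original, linked)

    same_file = os.path.samefile(original, linked)
    link_count = os.stat(original).st_nlink  # names pointing at this data

    with open(linked) as handle:
        linked_text = handle.read()

print("same underlying file:", same_file)
print("number of names for it:", link_count)
print("read through the link:", linked_text)
```

Because both names share one copy of the data, the second name costs essentially no extra disk space, which is the property being described above.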

So, yeah, so, basically, when you go activate, it's -- in fact, let's take a look at my path. It changes my path, see? So, now, this comes first. >> Maybe we could look at hard links. I find hard links quite useful for myself. When I have a lot of data in a folder, and I want to run something on this data from another place, I just create a hard link.

>> Do you create symlinks or hard links? Because normally you'd use symlinks for that. >> Yes, that's the word. That's the wrong expression. I create soft links. >> Symlinks, yeah. Yeah, we will get to symlinks. Let's wait until we kind of need them, maybe, yeah.

Okay. So, to go back to the base environment, I just type `conda activate`, and now I'm back in my main environment. So, yeah, hopefully that explains a little bit about what environments are and why you might use them. There's a certain way of developing software, which is particularly common in the JavaScript world, where you freeze the exact versions of everything at a particular point in time, and so you end up with things like -- well, in the Ruby world, you end up with a `Gemfile.lock` file.

In the Python world, you end up with a `requirements.txt` file. In the JavaScript world, you end up with your `package.json` file. You know, freezing particular version numbers is something that I would strongly recommend trying to avoid as a data scientist. It makes it almost impossible to mix and match things from different places, you know, this library from here and this thing from there. You end up going into this huge complex ecosystem of Docker containers, and trying to find ways to make that all work can get quite overwhelming. And you can actually entirely avoid it by just, you know, using a single base environment and keeping your libraries up to date and having good tests and knowing when a release has broken something and so forth.

You know, it's not always the way, but this is my suggestion for, you know, rapid iteration data science: do things this particular way. All right. So then we've got our own computer running, and it's nice to be able to use Python on your own computer because, you know, you can whip out a laptop anywhere, you don't have to be on the internet, you don't have to start a server somewhere. It's nice to be able to quickly play with things.

And, you know, I think, like, ideally, a large amount of the time you're not using the GPU because a large amount of the time hopefully you're, like, exploring results or you're testing out things in really small samples that don't need a GPU or, you know, hopefully you can do a lot of stuff on your computer.

At some point, you need a GPU. And my view is that you should try to use a GPU in a way that feels as much like your computer as possible, but doesn't cost you much, if any, money. So, at the moment, my view is by far the best option for that is Paperspace.

Paperspace is actually a company that has a few different products, and specifically it's a product called Gradient. In fact, specifically it's Gradient Notebooks. So, Gradient Notebooks is basically something where you can get a free GPU server, which behaves a lot like what we've just been working with, you know, you'll get a terminal and all that stuff.

So, let me sign in. Okay. So, Paperspace has this concept of projects. I have no idea what they're useful for. I just have one project, so I'll just go ahead and click on it. They're the things that contain your, they call them notebooks, but these are basically servers, right?

These are some servers. Now, there's a few options for, like, paying them money. And if you can afford it, you know, this is such a good deal, the $8 a month. Not only because, as you'll see, you get some pretty good GPU options, and you can keep things private, but you also get more persistent storage.

So, that means you can store things between sessions. Now, the reason this is really important is because these aren't actually my servers. Paperspace has not put aside servers for me to use. These are kind of small little saved snapshots, basically. And it's going to kind of create a new computer each time I fire one of these up.

And so, it's really nice, as you go from, you know, instance to instance, to be able to access the same files each time. So, let's start from scratch, because that's what we're doing. Okay, so, it says select a runtime. Basically, what this is going to do is it's just going to pre-populate your server with some files.

And so, if you choose the fast.ai one, then you'll have the main stuff, you know, basically everything you need for the book pre-installed. So, let's do that. And so, as you can see, there's various free options and various paid options. So, I'll use one of these. So, basically, you know, the important things to know about are: how big is the GPU?

These are all pretty good. 8 or 16 is great. 16 is obviously better. And then, how fast is it? P, that'll probably be a Pascal card. So, that's a couple of generations old. So, it's like quite a lot slower than modern cards. RTX is a totally up-to-date card. But this one's got a lot more GPU memory.

So, I'm just going to pick this one. Six hours. So, it's going to, you know, if you're paying for it, make sure you've got auto shutdown set to something sane. Otherwise, you'll end up paying for it for a long time. You can easily share notebooks with other people by turning public access on, which is by default.

There's a few advanced options here. I don't think we particularly need to touch them, to be honest. One thing I'm just going to note now is that it's going to run a command called run.sh. So, just note that down because we're going to check it out later. And you'll also see it's actually going to clone a git repo.

So, I mean, one thing you could do is if you've got a fork of fastbook, then replace `fastai` with your username and you're going to get your forked version. Okay. So, I'll start. So, yeah. So, I don't know. I find this a bit confusing that it says notebook. It's not a notebook, right?

It's starting a server for us. And that server is going to run Jupyter Notebook automatically. So, the thing that appears here is the Paperspace GUI. I don't love it, honestly. So, I don't really use it very much. The one thing that you do particularly want it for, though, is to be able to stop your server when you're finished.

Especially if you're paying for it. I mean, you should do it anyway because there's no point using up their server hours. So, what I'm going to do is I'm going to just copy this URL and create a second tab and paste it just so that I've got two versions of that.

So, this one here is just going to be sitting here and I can go back to it and click stop later. In fact, when I close this tab, it will remind me that I have to click stop. So, this is a good way to not accidentally forget to stop your server.

>> That auto shutdown, does it happen only if you're inactive, or would it shut down regardless? >> It shuts down regardless. Because they don't really know if you're doing things. They don't really have any telemetry or anything. Oh, by the way, this five hours seems to be truncated down. So, it's actually 5.9 hours.

That's just something I noticed. It's a bit of a bug, I guess. Yeah, so, in five hours' time, it's going to shut down regardless. So, the first thing I do actually is I click this button, which gives us proper JupyterLab. And then I don't have to use the slightly crummy GUI anymore.

And this is also nice because now we're going to be using something that's just like what we have on our computer, which is the goal. Okay. So, here's JupyterLab. And you can see that the book is here. And yeah, this is basically the Git repo that was automatically filled in for us as being cloned into here.

What I'm also going to do is start a copy of an old machine as well. Not Gradio. What am I doing? Gradient. Because I want to access some files from there. Start machine. Okay. So, I guess to start with, we could go into clean, open up mnist_basics. So, let's see how much they've got installed, see if it's all ready to go.

Let's try running this cell. Well, there we go. It looks like it's got everything. Let's try running this cell. Nice. Okay. So, it's basically got fastbook installed and sentencepiece installed. So, that's good. >> Sorry, Jeremy. Are we checking JupyterLab or are we checking Paperspace? >> We are in Paperspace right now, see.

So, just to remind you, I click on this button and that gives us JupyterLab in Paperspace. >> Thank you. Sorry, I missed that. >> No, no problem. It's easy to miss things. Ask anytime. So, one thing that I actually find kind of confusing about JupyterLab is it has its own set of tabs in its own interface, and it kind of replicates things that could be in a browser.

So, in a lot of ways, I kind of prefer the old version of Jupyter, Jupyter classic, which you can always switch to. But, you know, you can get used to it. And one thing that helps a lot is if you just full screen this, right, and kind of know the keyboard shortcuts.

So, Control+Shift+left and right square brackets switch between tabs. And that's the main one to know. And Control+B turns the sidebar on and off. So, this way, at least, you can, like, get a nice, you know, good amount of screen, particularly when I click terminal. So, if I click terminal here, that's not bad, right?

I've got plenty of room to see my terminal. So, that's nice. Okay. So, I don't -- >> Sorry, Jeremy. Just on the bottom there, if you want to get rid of those tabs for any reason, there's that little switch that says simple. That will hide those tabs. >> Yeah, that actually gets rid of the tabs as well, which I'm actually using the tabs.

But what you can do is you can go remove status bar. It gives you a bit more room as well. So, yeah, now we're actually doing pretty well. And one particularly nice thing in Jupyter, by the way, is it actually has a graphical debugger. So if we go `for i in range(10): print(i)`, and then we turn on the debugger with this little button here.

So, we can put a breakpoint here on and off by just clicking. And so, now, if I run this cell, you'll see that it's got a breakpoint, which is very nice. And we can -- got a lot of things in here, doesn't it? Why is -- there we go.

>> Music. What? >> That one. Okay. So, you can see, like, here's I. And so, if I now step through this, there's a better way to just show what we want. Step. Okay. So, it's kind of like -- yeah, it's -- that's kind of a useful thing to have, I think.

Yeah, I guess this is actually probably a really good place to not use `import *`, because I don't see an obvious way to only add the variables we want to the debugger. So, let's restart the kernel. Okay. And then run this cell. There we go.

That's much better. So, now, we can just see that variable changing. You might be wondering why it is that I clearly am not very competent using the graphical debugger. And that's because I don't use it myself, because I actually really like the non-graphical debugger, which I'll quickly show you.

The non-graphical debugger you can use anywhere. It doesn't have to be Jupyter. It can be in a terminal or whatever. But inside Jupyter, if you just put `%%debug` at the top of your cell, it runs the regular Python debugger, which is a REPL-style debugger.

And you can type H for help to find out what you can do. And basically, you can type just the first letter of any of these if they're unique by first letter. You can see, actually, the ones which have the first letter. So, C is short for continue, H is short for help, N is short for next, P is short for print.

So, the single-letter ones are short for, like, the ones that you use all the time. And I always use the single letters, because, you know, why wouldn't you? So, for example, L -- actually, I'm not really in a file, so that won't work. So, let's try, for example, N for next, so N just goes to the next line.

So, here we are. So, we've now gone into the, you know, the code that we have in our cell. So, we should now be able to -- oh, next again. This is really weird. Why is this not -- Must be something to do with -- I wonder if this is some weird JupyterLab thing.

Yeah. Okay. I think what happened was that, because I had used the graphical debugger, it broke the normal debugger. Okay. So, let's start again. So, I hit N for next, and that's still not really working. Okay. No worries. Let's switch to regular Jupyter, because I know it'll work there.

Okay. And here we are. Okay. `%%debug`, then `for i in range(10): print(i)`. Now, curious. What if I put this in a function? Oh, okay. I don't know. I pretty much always debug things that are in functions. So, that's what's going on. Okay. So, I created a little function.

I put `%%debug`. I called the function. And then the first thing I did is I typed S. S steps into the current function. So, this is pointing at the thing it's about to run. It's about to run the thing called `def f`. So, we're now inside the definition of `f`.

And now it's going to run `for i in range(10)`. So, N is next. So, N just advances one line. So, now that I've done that, `i` should exist. So, you can print the contents of something by typing P and then the thing you want to print. So, `i` is now zero.

And so, then I can go next. And in fact, you don't even have to type N. If you just hit enter, it redoes the last thing you did. So, that just jumps to the next line. And so, I can type `p i`. Okay. Now it's one. And so, you get the idea.

So, basically, I can then type L to list the file that I'm currently in. I can also use W to see, like, what called this, which in this case was actually IPython or Jupyter Notebook. So, this is how I always debug things. And I'm sure at some point, we'll actually need to debug something.
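The N and P commands described above can be tried without typing anything, by feeding the debugger its commands from a string. This is only a sketch so the example can run unattended: in normal use you would type the same commands at the `(Pdb)` prompt. The function `f` here is a small stand-in, not the exact one from the session.

```python
import io
import pdb


def f():
    total = 0
    for i in range(3):
        total += i
    return total


# The commands we would otherwise type: next, next, print i, continue.
commands = io.StringIO("n\nn\np i\nc\n")
output = io.StringIO()

# Passing stdin/stdout makes pdb read commands from our string instead of
# the keyboard; readrc=False ignores any personal .pdbrc file.
debugger = pdb.Pdb(stdin=commands, stdout=output, nosigint=True, readrc=False)

# runcall pauses on the first line of f, then obeys the scripted commands.
result = debugger.runcall(f)
transcript = output.getvalue()

print(transcript)
print("f returned:", result)
```

After the two `n` commands the debugger is paused inside the loop body, so `p i` shows the loop variable on its first pass, and `c` lets the function run to completion.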

I thought I'd just quickly show you that. Folks here who have used both the graphical and the regular Python debugger, do you have any preferences? Or anybody here that has just used one or the other and likes it or doesn't like it? >> I have only used the text debugger. >> Yeah.

>> I love it. Yeah. It's wonderful. Especially when doing the fast.ai course, you can just put `set_trace` wherever you'd like, and you are immediately transported there. So, for instance, when working on a new architecture, implementing some architecture of, I don't know, my own idea, or trying to re-implement something.

I create my own class and then I can step through the shapes of the tensors. It's just super useful. >> Yeah. So, you mentioned `set_trace`. So, pdb stands for the Python debugger. So, `set_trace` is really useful. It's how you set a breakpoint. It might seem like a weird way to set a breakpoint.

But basically, if we run this now, we don't even have to say `%%debug`: it jumps into the debugger immediately after that `set_trace` call. So, you can put that not only in your own Python files but in Python files that you've installed from pip or conda or whatever, and then step through it in the way we just talked about, and hit N and start running through and check the values of variables, whatever.
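Here is a small sketch of a `set_trace` breakpoint placed inside a function. As a convenience for running it unattended, the debugger's commands come from a string rather than the keyboard; in everyday use you would just write `import pdb; pdb.set_trace()` (or `breakpoint()`) on that line and type at the prompt. The function name and the values are made up for illustration.

```python
import io
import pdb

# Scripted input: print the variable `shape`, then continue running.
commands = io.StringIO("p shape\nc\n")
output = io.StringIO()
debugger = pdb.Pdb(stdin=commands, stdout=output, nosigint=True, readrc=False)


def describe_batch(batch_size, features):
    shape = (batch_size, features)
    debugger.set_trace()  # breakpoint: the debugger takes over from here
    return shape


result = describe_batch(4, 8)
transcript = output.getvalue()

print(transcript)
print("function returned:", result)
```

This is the pattern mentioned above for checking shapes: drop the call wherever you want to look around, inspect values with `p`, and carry on with `c`.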

Oh, I didn't say how to exit. To exit, you press Q for quit. If you're learning a new library, this is super useful because you just pull the library from GitHub, you do a pip editable install (`pip install -e .`), and then you literally can step into the code that you're reading about.

So, like, this is. And also, basically, pretty much every major programming language debugger works the same way. So, you can, yeah, if you're doing C code, there's a debugger called GDB that works the same way. If you're doing Perl code, the Perl debugger works the same way. They'll have the same keyboard shortcuts, the same way of working.

So, it's skills you can reuse. And that's another thing, like, in general, I, like, really try to avoid, you know, unless they're really, really good. But in general, proprietary tools, I generally avoid instead try to use tools that I can use everywhere. Because then I don't have to learn as many things.

I can learn a small number of things and reuse them all the time. And particularly these, like, really old tools, like this, the way the Python debugger works goes back a long time even before Python existed. These tools have been developed over many years to make them really perfect, you know, really to make them work really well by many people.

And so, they're very nicely optimized once you learn them. And they do take some time to learn. But if you're doing these walkthroughs, then you're the kind of person who's prepared to put in that time. >> There's another thing related to what Jeremy just talked about. And those are key bindings in things like tmux or even in Jupyter Notebook that we're looking at right now.

So, my normal intuition, and what I would do a couple of years ago when I jumped into something new, was, oh, I would just come up with my own unique key bindings that, hey, will make life comfortable for Radek. You know, they're ergonomic and they're easy to remember.

But then as you switch to a new environment, you sort of have to bring the key bindings with you, which is a horrible pain. So, just like Jeremy mentioned that he tries to use software that is readily available, a way to shoot yourself in the foot would be to come up with your intricate key bindings.

It's just sometimes very useful to go with the key bindings that are already there. And even more importantly, learning to use the keyboard for everything is a good idea. I tend to use the mouse a little bit for teaching because I want people to see what I'm doing. But in normal life, I hardly ever touch my mouse because I'm just zipping around.

So, yeah, there's a few tips. Okay. >> Jeremy, just a question, slightly on a different topic, but on the same thing. If the library behind this notebook has changed or gets upgraded, how do we get the latest? >> That's what I'm going to do right now. Okay. So, let's say I want to upgrade something or install something in this environment, on this Paperspace server.

As we discussed, a Paperspace server is not really a server at all. And so, if I pip or conda install something, it's actually not going to be here next time I come here. So, that's a bit of a bummer. So, how do we fix that? We're actually going to learn a lot in order to fix this.

The first thing to know is that Paperspace has this idea of persistent storage. And specifically, there's a directory called `/storage`, which contains your persistent storage. And so, as you can see, even though I only just created this server just now, there's already things in here. And that's because that's my persistent storage.

So, this is basically a mounted network drive. You can see all of the drives and how much room you've got in each one by using `df`, which is disk free. And then, if you remember, `-h` is the human-readable flag. It tells you sizes in, like, gigabytes and megabytes and stuff.

And so, here's a list of all the disks that Paperspace has provided for me. And so, there's one called `/`, which has got 168 gigabytes available. And here's my storage, which has got 496 gigabytes available. So, by default, for free, you get 5 gig. And it's still pretty good, right?
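The same kind of free-space check can be done from inside a notebook. This sketch uses `shutil.disk_usage` from the standard library and a small helper, `human_size` (a made-up name), that mimics the rounded sizes `df -h` prints. The `/storage` path only exists on a Paperspace machine, so the code skips any path that is not present.

```python
import os
import shutil


def human_size(num_bytes):
    """Format a byte count with a unit suffix, roughly like `df -h`."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


# "/" exists on any Linux or macOS machine; "/storage" is Paperspace-specific.
for mount_point in ("/", "/storage"):
    if not os.path.exists(mount_point):
        print(mount_point, "is not present on this machine")
        continue
    usage = shutil.disk_usage(mount_point)
    print(mount_point,
          "total", human_size(usage.total),
          "used", human_size(usage.used),
          "free", human_size(usage.free))
```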

But for 8 bucks, you get 15 gig, which is a hell of a lot better. Not all of these are writable. So, for example, they have actually a /datasets thing mounted there for you, which is kind of cool because you can actually start using datasets that are used by fast.ai straight away, which is pretty nice.

Yeah, they're the main ones, basically. So, what are we going to do about this, you know, /storage? This is really where we want to install pip libraries or conda libraries, too. So, let's -- I'm just trying to think. Anybody think of a pip library they want to install? Any favorite ones?

>> Use something like autopep8 or Jedi or something like that. It doesn't really do much. >> I'm sorry. Maybe we'll just grab the latest version of fastcore. So, normally, to install the latest version of something, you can use pip or conda. For this, actually, for installing stuff kind of, like, locally, the way we're describing it, it's going to be easier to use pip than conda.

So, we use pip. In a past lesson, I said, like, avoid pip. I think we're at a point where we can talk about where it's okay to use pip. So, pip is a perfectly good way to install things which just contain Python code or which are kind of pretty self-contained.

You wouldn't normally want to pip install PyTorch because it requires, like, CUDA and stuff. And, yeah, pip doesn't really have a way of installing those kind of libraries. That's why if you use pip to install PyTorch, you have to, like, separately install the software development kit. With conda, you don't have to.

But for a library like fastcore, and, in fact, honestly, most libraries, you know, like, non-GPU kind of libraries, pip's actually fine. And so, normally, to upgrade software with pip, you would type pip -U, and then you type the thing that you want to upgrade. Or if you just want to install it, you do it without the -U.

There's an extra flag you can use, which is --user. And that's going to install it into your home directory. And so, there's lots of reasons you would want to do that. For example, if you don't have root access or, like, in our case, we don't have the ability to, like, save the stuff in the root directory.

So, if I run that -- oh, and, of course, I have to say install. Okay. So, it's upgraded it from 1.4.2 to 1.4.3. So, let's see if that actually works. >> So, Jeremy, why are you using -- like, is mamba not an option for this? >> Yeah. So, this is -- it's not a great option for installing stuff into a user directory.

At least I'm less familiar with that. This is a way that I know is going to work fine for this special situation where we want to put stuff into our home directory. So, yeah, mamba and conda are kind of synonyms. Mamba is a faster way to do it. Whereas pip is a different thing altogether.

And it has this special thing I'm showing you right now, which is --user. And if conda or mamba has such a thing, I don't know about it and haven't learned how to use it yet. I'm not saying it doesn't exist. But at least for pip, this works fine. So, if we now look at fastcore's version, there we go.

So, it has, in fact, installed 1.4.3. Now, where did it put that? So, here in our home directory, you can see it's actually created something called .local. And .local is where pip puts stuff that you install with --user. And as you can see, it's got various subdirectories. And here is fastcore.

So, if we want to be able to continue to use the latest version of fastcore next time we start this notebook server, we want this .local directory to still be there. Right? So, how do we do that? Well, what we can do is we can actually put that into our storage.
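One way to do the "look at the version" step without importing the library is the standard library's `importlib.metadata`. This is a sketch rather than what was typed in the session, it assumes Python 3.8 or newer, and `installed_version` is a name invented here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string for a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if fastcore is installed for THIS interpreter,
# otherwise None. The answer can differ between interpreters on one machine.
print("fastcore:", installed_version("fastcore"))
```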
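To see where --user installs land for a given interpreter, Python's own `site` module can report it. A small sketch; the exact paths depend on the platform and on which Python is asked.

```python
import site
import sys

user_base = site.getuserbase()          # typically ~/.local on Linux
user_site = site.getusersitepackages()  # typically ~/.local/lib/pythonX.Y/site-packages

print("user base:", user_base)
print("user site:", user_site)
# The user site directory is only added to sys.path if it exists and
# user site-packages are enabled for this interpreter.
print("on sys.path:", user_site in sys.path)
```

Note that the user site path normally includes the Python version, so packages installed with --user by one Python version are not visible to another.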

So, we could move that into our storage. Now, okay, that's all very well. But it will now be in storage next time we come back. But Python needs it to be here in our home directory. So, what do we do? Well, what we have to do is we have to make it so that .local in our home directory and .local in our persistent storage are the same thing.

And the way we do that is with something Radek was mentioning before, which is using a symlink or a symbolic link. If I say ln for link and -s for symbolic, and then I say what's the thing that I want to symbolically link, and I say it's /storage/.local, that's the thing I just moved.

Then you'll find that in this directory, there is now a .local, but it looks special. It looks different. And it's like saying, oh, it's not a folder at all. It's actually just pointing at this other place. But it's like it really exists. I can ls it, for example. I can cd into it.

And remember, the last token from the previous line, as I said before, is exclamation mark dollar (!$). So, that'll be .local. You can see it does cd .local. So, yeah, it's not a copy of it. It's like a shortcut into it. In fact, I think on Mac they're called aliases and on Windows they're called shortcuts.

It's the same thing. And on Unix type things, it's called a symlink or a symbolic link. So, now, if I run my Python again and check the version, yep, it's still 1.4.3. So, it's still finding it. So, this way we can actually make sure we've got, you know, that we can install and upgrade packages and still see them every time we launch, even if it's a new notebook server or relaunch an existing one or whatever.
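The move-then-symlink trick can be tried safely in a scratch directory. The sketch below is Python rather than the shell commands used in the session, and the `home` and `storage` folders are temporary stand-ins for the real home directory and /storage.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as root:
    home = os.path.join(root, "home")        # stand-in for the home directory
    storage = os.path.join(root, "storage")  # stand-in for /storage
    os.makedirs(os.path.join(home, ".local", "lib"))
    os.makedirs(storage)

    # Equivalent of: mv ~/.local /storage/
    shutil.move(os.path.join(home, ".local"), os.path.join(storage, ".local"))
    # Equivalent of: ln -s /storage/.local ~/.local
    os.symlink(os.path.join(storage, ".local"), os.path.join(home, ".local"))

    link = os.path.join(home, ".local")
    is_link = os.path.islink(link)   # the entry in home is a link, not a folder
    target = os.readlink(link)       # where the link points
    contents = os.listdir(link)      # listing "through" the link still works

print(is_link, contents)
```

The data lives only in the storage folder; the entry in the home folder is just a pointer to it.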

We just have to make sure that every time we start a new Paperspace instance that it creates any symlinks we want. And so, Paperspace has this really nifty thing, which is you can create a file called .bashrc.local in /storage and it will run that file every time you start a notebook.

And so, you'll see I've got a file there that goes through and creates a symlink to .ssh and to .local and to .git-credentials and a bunch of stuff that we haven't talked about yet, and .kaggle, and symlinks them all to /storage. And so, this way, every time I start a new computer, I'm going to have all that stuff set up automatically, which is, yeah, I think is pretty great.
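The actual startup file is a bash script that is not reproduced in this transcript. As a rough Python rendering of the same idea, the sketch below links each persisted name back into the home directory; `link_from_storage` and the `NAMES` list are illustrative, using only the names mentioned above.

```python
import os
import tempfile

# Names mentioned in the session; the real script may link more than these.
NAMES = [".ssh", ".local", ".git-credentials", ".kaggle"]


def link_from_storage(names, home, storage):
    """Symlink home/name -> storage/name for every name that exists in storage."""
    linked = []
    for name in names:
        target = os.path.join(storage, name)
        link = os.path.join(home, name)
        if not os.path.exists(target):
            continue  # nothing has been persisted under this name yet
        if os.path.lexists(link):
            continue  # never clobber a real file or an existing link
        os.symlink(target, link)
        linked.append(name)
    return linked


# Demo in a scratch directory instead of the real home and /storage.
with tempfile.TemporaryDirectory() as root:
    home = os.path.join(root, "home")
    storage = os.path.join(root, "storage")
    os.makedirs(home)
    os.makedirs(os.path.join(storage, ".ssh"))
    os.makedirs(os.path.join(storage, ".kaggle"))
    first_run = link_from_storage(NAMES, home, storage)
    second_run = link_from_storage(NAMES, home, storage)  # safe to repeat

print(first_run, second_run)
```

Skipping names that already exist makes the function safe to run on every startup.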

So, that's how you can customize your Paperspace instance. So, Jeremy, just to recap there to make sure I've got that clear in my head and for everyone else too. So, essentially, what you've done is that you've got this bash script that you keep inside your persistent storage, which contains all your config and bits and pieces that you want.

And then, every time you fire up a new instance, you're just symlinking all that stuff that you've got in storage to the machine you've just created. Yeah. And in particular, after I type pip install --user something, it's created this .local directory and that's something that I want to be persistent.

So, I move that into storage and then symlink it back to where it's meant to be. Understood. Thanks. And the reason that you're doing this is because you can't get access to the root directory on their server. Like, would you need to do this on your own computer as well?

No, this is just for Paperspace. It's not that I can't access it, Mark. I can. I can install it. But the problem is, these are not real servers. That's not persisted. So, if I went in five hours' time when this shuts itself down and then I start up the server again, it's not there.

It's a mock server. It looks like it's your own server, but it doesn't actually keep your changes unless they're in /storage. This is necessary only on virtual machines, but on your own computer, you wouldn't need to do that. This is like just this one. This is just Paperspace.

This is just for Paperspace. And we're spending time talking about Paperspace because it's so much better than any other option out there for GPU servers. They're the only ones that have these nifty tricks. Yeah, on your own computer, you don't have to worry about any of this stuff.

And if you've got your own GPU, you certainly don't have to worry about it. But, you know, there are other notebook servers like Google Colab or whatever, but they don't have anything like this. So, on Google Colab, you're going to have to like reinstall everything you need every time you start up a new notebook and, you know, you won't have any of this proper environment.

So, yeah, as you might have seen, even my SSH keys are symlinked here. So, I'm always going to have my SSH keys any time I create a new Paperspace instance. So, yeah, this is like a super convenient way to have a free GPU server whenever you want it, which I think is pretty amazing.

Jeremy, a question. I followed what you did in terms of pip installing fastcore. But then when I use python and try to import fastcore, it throws an error. But when I do ipython and import fastcore, it can find it. Does it? That's interesting.

Do you want to share your screen and we could try to debug it there? I might have to stop sharing first. Let's see. Okay. I'll stop sharing. Should share now. Let me know, please. We're not seeing it yet. Here it comes. Okay. So, let's have a look. So, this is on Paperspace and you went pip install.

Good. And you went python. Interesting. Okay. So, great. So, press control D to exit from IPython. And you can press it again or hit enter. You didn't actually have to press Y. See how it's in square brackets. That means it's the default. So, you can just do that.

Okay. So, let's find out what's going on. So, type which python. Okay. So, okay. And then type which ipython. I've got a strong suspicion. Try typing python3 instead of python. Just type python3, all one word. Not which python. Okay. Now, try importing fastcore. Interesting. Let's see if I have the same problem on mine.

So, python, import fastcore. Oh, I'm getting the same error on mine. Very interesting. Okay. I'm going to share my screen again. Very well spotted. So, this is exactly the kind of bug that I want us to have so we can learn how to hopefully fix it. I wonder if -- because I hardly ever just run python.

And I've only recently started using pip install --user because of this Paperspace thing. So, I wonder if it's something specific to pip install --user. So, let's see if we can debug this. Actually, what's interesting is no module named fastcore is actually very interesting because that means it also doesn't have fastai.

Which -- yeah. Okay. So, the way Python finds modules is a very similar idea of how bash finds executables. There's a path, basically. And so, in Python, there's a module called sys which is where all kinds of things are stored. And so, if we go sys. -- there's a sys.path.

Now, this is not the -- this is not the bash PATH environment variable. This is a totally separate thing with a similar name, which is an all lower case path, sys.path in Python. This is a list of places that Python will search for Python libraries. Now, if I import fastcore, you can see it's getting it from /opt/conda/lib/python3.7/site-packages.

And you can see that is in my sys.path. So, that's how it's finding fastcore. So, why isn't python finding it? Well, we could do the same thing. sys.path. So, that's interesting. So, python here is not including site-packages. Whereas IPython is. So, there's something, I guess, about how Paperspace have installed things.

Because I'm pretty sure that's not what happens here. Let's try it. Python. Import sys, sys.path. Yeah. So, here's site-packages. So, let's see what happens if we -- site-packages. So, this is like the normal place that pip and conda install things, into the site-packages directory. And yeah, I've never really looked into it because I've never had problems accessing it before.

Oh, something to do with Debian puts things in dist-packages. That's interesting. Site-packages, not in path. So, let's try it. So, let's try it. >> Jeremy, why is this talking to me? Hang on, Jake. >> Just when you were looking at those two paths, one was 3.7 and one was 3.9.
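The same check can be scripted. This sketch prints the search path for whichever interpreter runs it and asks the import system where a module would come from, without importing it. `where_is` is a made-up helper name, and `json` is used only because it ships with every Python.

```python
import importlib.util
import sys


def where_is(module_name: str):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


print("interpreter:", sys.executable)
for entry in sys.path:
    print("  search:", entry)

print("json     ->", where_is("json"))
print("fastcore ->", where_is("fastcore"))  # None if this interpreter cannot see it
```

Running it under both python and ipython and comparing the two outputs is one way to spot the kind of mismatch being debugged here.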

>> Oh, I didn't even notice that. Is that true? You mean here? 3.9? >> Yeah. >> Oh, yeah. And 3.7. There you go. You're quite right. Thank you. Okay. So, that'll be the reason. Which python? Which ipython? Yes. Okay. Yeah. All right. So, it wasn't just a case of typing python3.

It was a case of typing python3.9. There we go. Oh, still not there. Oh, it's 3.7 that IPython is using. python3.7. Okay. Thanks. That's exactly what it was. I don't know why they've got so many Pythons installed. It seems a bit like overkill. >> So, the Python 3.9 here was the system Python, right?

And the Python 3.9 was the point? >> I mean, because we're on Paperspace, I think they were all... Which python? Which python3? They're actually all the ones in conda. So, it's... So, Paperspace has installed conda as root. And so, none of these are the system Python, actually.
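A small sketch of the which python / which ipython comparison done from Python itself. `resolve` is an invented helper; it follows symlinks so that two command names pointing at the same binary show up as the same path.

```python
import os
import shutil
import sys


def resolve(command: str):
    """Mimic `which`: first match on PATH with symlinks resolved, or None."""
    found = shutil.which(command)
    return None if found is None else os.path.realpath(found)


for cmd in ("python", "python3", "ipython"):
    print(f"{cmd:8} -> {resolve(cmd)}")

# The interpreter running this script, and its major.minor version.
running_version = f"{sys.version_info.major}.{sys.version_info.minor}"
print("running:", sys.executable, "version", running_version)
```

If the commands resolve to different Python versions, a package installed for one of them will not necessarily be importable from the other.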

Yeah, Paperspace is a bit unusual in that they have us run as root. So, things are a little bit confusing, actually. Yeah, now, as to why IPython is running 3.7, I'm actually not sure. So, something else that I do is I create a git directory. And then I git clone things into it using my SSH keys.

And then what I do is I move the git directory into /storage and then symlink it back. And actually, where I symlink it to, I don't actually symlink it to my home folder. I actually symlink it inside /notebooks. And the reason for that is that that's where... That's what Paperspace uses as the root of its JupyterLab.

So, actually, you can see here I've done it before because it's in /storage, right? So, you can see here's my Git stuff. And so, I actually think, you know, I don't really want any of this stuff that they've put in here for me. So, actually, maybe I should try deleting.

In fact, let's try that. What happens if we... ...create a server and we make that Git repo thing empty because that's really what I want. So, you've uploaded your SSH keys into Paperspace, right? Yeah, I've uploaded them. And I've put them in /storage. And in my /storage/.bashrc.local, I symlink them into my home directory.

Correct. I'm not entirely paranoid about such things. Yeah. I mean, if you were paranoid about such things, then create a separate SSH key pair just for this and put that in your GitHub. So then, people... If somebody steals your SSH private key, the worst thing they could do is to get into your GitHub.

That's so cool. I didn't think about that. Wonderful. We'll do that. Thank you. All right. So, what would happen... That's a bit overkill for notebooks at the moment. Let's delete some of these. So, yeah, for me on Paperspace, you know, everything's kind of going into that /storage. So, I don't really care about deleting things.

All right. So, if I... Will it let me delete this? Because that's really what I want to do. So, I press delete. It's still showing me this. I don't know if that's a default or if it's just an example. Well, I'm here. So, I just want to mention maybe...

Sorry. Maybe I'm the only one. I understand in principle what you're talking about with the SSH keys and importing them and everything, but the details of the execution, if I'm the only one that's fine, I'll struggle with it. Let's do that. I don't know that I could actually do it.

Yeah, let's do it. That's excellent. Thank you. One thing I just want to do for my own interest is I'm just going to jump onto YouTube and see if anybody actually watches these live streams, because if they don't, I won't waste my time running them for people watching them.

Yeah, not sure it's worth it. Might just use Zoom in the future. Did you know your hands up, Radek, by the way? Yeah. You don't have to put your hand up. You can just talk. Okay. Okay. You know, some libraries, the more exotic ones, like I'm not sure maybe graphs this, or they require you to install something via updates to some library.

Oh, yeah. Let's talk about that as well. Great. Okay. So, this thing has successfully started the machine. Let's see if there's anything in it. So, I was just starting the machine with the Git repo thing deleted. Yeah, okay, great. So, this is actually just empty. This is actually probably what I would be more inclined to do, although I expected to see my /git there.

Oh, wait. Okay. All right. Here's an interesting problem. That .bashrc.local file, it runs when you run a terminal. So, my git folder didn't appear until I actually opened a terminal. As soon as I did that, it appears. And I probably hadn't noticed that before because I always run a terminal as soon as I start pretty much.

There is a way, actually. What they actually run when you start a notebook, when you start a server, is this file, run.sh, which we can't change. But it does actually have a pre-run.sh hook: if you put stuff in /storage/pre-run.sh, it will run before Jupyter starts, which maybe is actually a better place for all the stuff I'm doing.

Maybe that's what we should use instead of .bashrc.local, because this only runs when you run a terminal. Yes. Interesting. Let's try that. Actually, I'd forgotten. It looks like I have got local mamba stuff working as well. Maybe we can try that next time. So, by the way, to look at the end of a file, you can just type tail.

So, if I go tail /run.sh, there it is. So, if I move .bashrc.local to pre-run.sh. All right. Let's try that. So, if we now create a new notebook, if you're wondering why it is, by the way, that Paperspace is so perfectly set up for everything to work really well, it's because I've basically been nagging the folks at Paperspace for the last four years about all these things.

And actually, it's just really in the last three months that they actually really started listening. And I told them, "Put this here. Put this here. Then it's going to be great." So, yeah, they've been really great, particularly recently, at setting everything up exactly the way we need it. Okay.

So, delete that. And so, I think, yeah, see, here's that command, /run.sh. So, I guess what you could do, by the way, is you could, like, put some different, like, your own URL here. And it's going to, like, automatically put that in /notebooks. And maybe you could even put a shell script then that comes from GitHub.

I haven't really thought about that. Anyway. Okay. So, I think it was Mark that was asking, "How would I actually get my SSH keys onto this machine?" I think the easy way to do it would be to use the upload file feature in JupyterLab. This is a really handy feature to know about.

So, you see this little button here, upload files. So, you could click that, and then you could go into your .ssh folder and find the files you want to upload and upload them. So, for example, I do config. And you can see here it appears. And so, then if I open my terminal, there it is there, right?

And then you could just move that to where you need it. One tip with SSH keys, actually. In fact, let's do it from scratch. Because that's what I'm meant to be doing. Let's do it from scratch to make sure everything works. So, I'm going to rm .ssh. Okay. So, let's do it from scratch.

SSH keys actually have to have very exact permissions on them. If it's possible for anybody else to read or write your SSH keys, SSH will refuse to use them. And so, one way to actually see the correct permissions is to create some SSH keys. So, I could go ssh-keygen.

Enter, enter, enter. And then I can go ls -la .ssh. And so, to remind you, we just briefly saw this the other day, the permissions. These three here tell you this user, which is root, can they read, write, and execute the file. So, this user, so the root can read, write, and execute the, this is the private key file.

And it can also read, write, and execute the public, sorry, read and write the public key file. These three here is, what about everybody else? And this says everybody can read the public key file, but they can't do anything to the private key file. And then . refers to the current directory.

So, the directory itself, only the root user can read, write, and execute the directory. The idea of executing a directory might sound weird. It actually refers to seeing what is in a directory. They call executing a directory. So, let's upload my keys. Okay. So, there they are. Now, they're going to be put into /notebooks/git.
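To practice reading those permission strings, Python's `stat.filemode` turns a numeric mode into the same ten-character form that ls -l prints. The three modes below are assumptions chosen as typical values for a private key, a public key, and the .ssh directory; they are not copied from the screen in the session.

```python
import stat

# First character: file type ("-" regular file, "d" directory).
# Then three triples of r/w/x: user, group, everybody else.
private_key = stat.filemode(stat.S_IFREG | 0o600)  # user read/write only
public_key = stat.filemode(stat.S_IFREG | 0o644)   # user read/write, others read
ssh_dir = stat.filemode(stat.S_IFDIR | 0o700)      # only the user can enter or list

print(private_key)  # -rw-------
print(public_key)   # -rw-r--r--
print(ssh_dir)      # drwx------
```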

So, if I go cd .ssh, and then I'll move /notebooks/git/id_rsa. Now, if I hit tab again, it'll show me that there's multiple things starting with those letters. If I say star, that refers to everything starting with those letters. So, I'm going to move all of those things into the current directory.

So, the current directory, remember, is dot. So, ./. And so, there they now are. And now, they don't have the right permissions anymore. My private key is readable by everybody, which is no good. So, to change permissions, we say chmod, for change. I don't know why it's called mod, rather than chperm or something.

And we can say that the group and the user should not have read permissions. So, the user and the group subtract read permissions on the private key. And then, check again. Oh, I shouldn't have said user and group. What I meant to say, it just removed permissions for myself to read it.

I should have said group and everybody, which I think is all. So, Jeremy, the first three dashes are for user. The next three dashes are for group. The first dash is for directory or not directory. But the next three dashes, yeah, go on. The next three dashes are for user.

The next three dashes are for group. And the last three dashes are for everyone. That's correct. Okay. That's what we want. So, now, the user can read and write the private key. And the user can read and write the public key. And everybody can read the public key.
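Here is a hedged Python sketch of the same permission fix, run against a throwaway file rather than a real key. `too_open` and `lock_down` are illustrative names. Note that `lock_down` sets the mode to exactly user read/write, which is a little stricter than only subtracting read permission from group and others as in the session.

```python
import os
import stat
import tempfile


def too_open(path: str) -> bool:
    """True if group or others have any permission bits on the file."""
    return bool(stat.S_IMODE(os.stat(path).st_mode) & 0o077)


def lock_down(path: str) -> None:
    """Leave only user read/write on the file."""
    os.chmod(path, 0o600)


with tempfile.TemporaryDirectory() as scratch:
    key = os.path.join(scratch, "id_rsa")  # stand-in file, not a real key
    with open(key, "w") as f:
        f.write("not a real key\n")
    os.chmod(key, 0o644)       # simulate the readable-by-everybody state
    before = too_open(key)
    lock_down(key)
    after = too_open(key)

print(before, after)
```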

So, we can test this by SSHing to GitHub.com. And GitHub.com expects you to log in with a username git. So, when you SSH before the @ sign, you say the username to log in as. And by default, it uses your current username, just root. I definitely can't log in to GitHub.com as root.

GitHub.com. Yes. Great. Hello, jph. So, it knows who I am, right? Because it knows who has my public key in that account. You've successfully authenticated. And then it closes it. Because you can't actually use a terminal on GitHub.com. It's only used for Git. But you can see my key is working.

Wouldn't it be simpler, or am I missing something, to generate a new key in Paperspace rather than import it and then just give GitHub that new key? Maybe. I don't know. I'm just thinking with all these changing of permissions and stuff. I'm going to say, like, okay. So, obviously, I don't think so because I don't do it that way.

But if I think about why I don't do it that way, like, some people do it your way. Your way is in many ways more correct in that you would have different public keys on GitHub.com for every machine you're using. And if somebody, like, stole a machine, you could delete just that public key.

And that person now couldn't log in. But you could still log in. And maybe that's more convenient or something. It's a perfectly fine way to do it, Mark, honestly. I don't like the mental overhead of having to think about having multiple keys and which is which. I've had a GitHub account for quite a long time and probably used, I don't know, maybe 100 different machines to access it.

And I don't like the idea of having 100 public keys and thinking where are they and should they still be there. But, yeah, I think it's fine. All right. So, that was actually pretty intense today. So, for folks who, you know, are just getting started, there was nothing we used today I don't think that we haven't learned how to use before.

But it's tough using things that you've only just learned about. And so, therefore, you know, it does need a lot of practice. So, if you're kind of new to this, then, yeah, then, like, you probably want to rewatch the video and, like, also pepper me with questions next time.

If you try things and it doesn't work. Or you're not sure why we do it or whatever. All right. Anything else before we? Yeah. Yeah. What about these things you have to start? Oh, yeah. Okay. Let's do that next time. Yeah. Let's do that next time. I will put it on the forum.

Thanks, so nice to see you all. Thank you so much. Bye. Thank you. Bye. Thank you. Bye.