Python Environment Setup for Machine Learning

Hello and welcome to this video on how to set up a Python environment specifically for machine learning. So this is often an overlooked part of machine learning and there's not that many tutorials out there on how to do this properly. So I thought it'd be a good idea to just go through this and explain exactly how I set up my environment.

So you can see here we have Jupyter and first thing you might notice is that I have these different environments. You have the default Python 3 base environment and then I also have this GCP which I use for the cloud and then I also have this one which is my machine learning environment.

Now the difference between each of these is the machine learning environment specifically has packages in Python for machine learning like TensorFlow, PyTorch, Transformers, Pandas, NumPy. It has all of those packages but nothing else so there's no excess baggage if you like. So I wanted to just go through and explain how to actually set this up.

So we're going to close this Jupyter notebook and open a new Anaconda prompt. I'm assuming that you have already installed Python and that you are using the Anaconda distribution. If you are not, you can download it from here: head over to Anaconda.com, click on Products, and then Individual Edition.

And just download. Okay so the installation for Anaconda is pretty simple if you're on Windows. It's a little different if you're on Linux and I don't know how it is on Mac but generally it's pretty straightforward and if you do need any help with it you can find out quite quickly.

So once we have that installed we want to go over to our Anaconda prompt, and to make sure that we have installed it correctly we just want to type python --version (or python -V, with a capital V; a lowercase -v starts Python in verbose mode instead). This will show us the version of Python that we have. So I'm at the moment using Python 3.8.3.
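The same check can be made from inside Python itself. This is only a minimal sketch; the helper names and the 3.8 minimum are assumptions chosen to match the version used in this video, not part of the original walkthrough.

```python
import sys


def python_version_string() -> str:
    """Return the running interpreter's version as 'major.minor.micro'."""
    info = sys.version_info
    return f"{info.major}.{info.minor}.{info.micro}"


def meets_minimum(minimum=(3, 8)) -> bool:
    """Return True if the interpreter is at least the given (major, minor)."""
    return sys.version_info[:2] >= tuple(minimum)


# Hypothetical usage: confirm the interpreter the prompt is actually using.
print("Python version:", python_version_string())
print("At least 3.8:", meets_minimum())
```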

Just make that a little bigger. And okay if that works that's good. So at the moment we're using the core base environment that you can see here and that is just the default environment that gets installed whenever you install the Anaconda distribution. But what we want to do is actually create a new environment which is our machine learning environment.

So to do that we use syntax like this. We type conda create, where conda is just the Anaconda package manager, and then --name, which you can also shorten to -n. Then you enter your environment name, and after that you write python= followed by your Python version.

And at the end of that you would also type anaconda. So for us I'm going to use an environment name of mln, like that, and I also want to install Python 3.8. And that should be everything. So we'll just press enter, and now conda will work through and actually install that.
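Put together, the command described above should read roughly as conda create --name mln python=3.8 anaconda. As a sketch, the small helper below (a hypothetical function, not part of conda) assembles that command as an argument list and prints it so it can be inspected before being typed into the Anaconda prompt. It does not run anything.

```python
import shlex


def conda_create_command(env_name, python_version, extra_packages=()):
    """Assemble (but do not run) a `conda create` command as an argument list."""
    command = ["conda", "create", "--name", env_name, f"python={python_version}"]
    command.extend(extra_packages)
    return command


# Values taken from the walkthrough: environment "mln", Python 3.8,
# plus the "anaconda" metapackage typed at the end of the command.
command = conda_create_command("mln", "3.8", ["anaconda"])
print(shlex.join(command))
# prints: conda create --name mln python=3.8 anaconda
```

Keeping the command as a list rather than one string makes it safe to hand to subprocess.run later without worrying about quoting, if you ever want to script the setup.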

So I've already installed an mln environment before but I uninstalled it, so it's coming up with this warning saying it already exists. But I'm going to continue creating the environment because I want to reinstall it. So I put yes. You shouldn't see that on yours. And then this will take a little bit of time just to get everything together.

Okay so now we are just shown a list of all the packages that will be installed. So we just want to accept that. So press yes and enter. And that will go ahead and install all of those. Okay so everything is set up. Now we can switch over to our new environment.

So at the moment we're in base. We can switch over to our new environment with conda activate and the environment name which in our case is mln. So let's go ahead and do that. And now you can see that the name here is switched to mln which is our new environment.
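Besides looking at the name shown in the prompt, you can confirm which environment is active from Python. This sketch assumes conda's activation scripts have set the CONDA_DEFAULT_ENV environment variable, which is what conda activate normally does; the function name is made up for illustration.

```python
import os
import sys


def active_conda_env():
    """Return the active conda environment's name, or None if none is active."""
    return os.environ.get("CONDA_DEFAULT_ENV")


# After `conda activate mln` this would be expected to report "mln";
# sys.prefix shows the folder the running interpreter was installed into.
print("Active conda environment:", active_conda_env())
print("Interpreter prefix:", sys.prefix)
```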

Now we just need to install our machine learning packages. So we're going to go ahead and install the basics. So we have pandas and matplotlib. We're going to install both of those with a conda install. So we have two options here. We have conda or pip to install our packages.

Generally conda will most likely integrate with your environment better. So it's usually a good idea to try that first. If that doesn't work then try pip install. So we'll go ahead and conda install. We're going to do pandas and matplotlib. Then add anything else here that you feel that you might also need.

But this is all we're going to go with. It's worth noting that we also need NumPy, but NumPy is included as a dependency of pandas so we don't need to explicitly mention it here. And that will go ahead and ask us for permission to install the packages that it finds.
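Once the install finishes, a quick way to confirm that pandas, matplotlib and the NumPy dependency actually landed in the environment is to ask Python for their installed versions. A minimal sketch using the standard library's importlib.metadata (available from Python 3.8); the helper name is an assumption for illustration.

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Run this with the mln environment active.
for name, version in installed_versions(["pandas", "matplotlib", "numpy"]).items():
    print(f"{name}: {version or 'not installed'}")
```

The same check works for packages installed later with either conda or pip, since both record the metadata that importlib.metadata reads.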

We click yes and then we go ahead with the installation again. Now we can go ahead and install TensorFlow, Transformers and PyTorch, which are all machine learning frameworks. So TensorFlow we can install quite easily. All we need to do is conda install tensorflow. Okay so now we have the yes or no prompt from TensorFlow.

Okay so TensorFlow is now installed, so we can now go ahead and install Transformers. For Transformers we are going to use pip, because conda doesn't recognize the most recent versions of the Transformers library, at the time of recording at least. So we have to use pip to get the most recent versions.

So we pip install transformers. There we go. And finally we have PyTorch, which is slightly more complex, but we make it quite easy by just taking a look at the PyTorch Start Locally guide which you can find here. So that's pytorch.org/get-started/locally, and all we do is come down to the Start Locally section and select our PyTorch build.

So this is the stable release, and this is like a beta release which gets released more often but is more likely to have bugs and errors in it. So I think most people will probably want to avoid this. You can choose your OS, so for me it's Windows. Then the package manager, so that is conda. You can also use pip, but I would recommend conda because it will install the dependencies we need as well.

We're using Python, and then this bottom one here refers to CUDA. CUDA is the GPU acceleration library, so essentially if you have an NVIDIA GPU, CUDA lets you use it to speed up any machine learning tasks that you have in either PyTorch or TensorFlow. You can read TensorFlow's GPU setup guide if you do have a GPU, which is quite useful. You just head on down to the bottom here, or if you're on Linux this guide is always quite useful.

And then we have the Windows setup here, so all you need to do is install all of these, which is reasonably straightforward, but there are a lot of good guides out there if you do need help with it. And then you just head on down and set your paths so that TensorFlow or PyTorch can actually see CUDA.

Another useful guide is this NVIDIA CUDA installation guide. Now I would recommend using CUDA 10.2 at the time of recording, unless you're using the latest RTX 30 series, so that is the NVIDIA GeForce RTX 3090, 3080 and I think it's 3070.

So the support for those is a little bit sketchy at the moment, and you will actually need CUDA 11 alongside the nightly builds of PyTorch and TensorFlow, which is the build option I mentioned over here. That's a little bit more difficult and I'm not going to be covering that here, but again there are a lot of good guides out there if you do need help with it.

So if you don't have a GPU, or you just don't care about GPU acceleration, you just select None and it will change the command down here which we'll be using for our installation. So I'll be using this command here. So we're doing a conda install, and then we have a few packages, not just PyTorch, here.

So pytorch, torchvision, torchaudio and cudatoolkit, where we are using version 10.2, and then we set our channel to pytorch as well. Now we can go ahead and install that. So just select yes again, and now that is our environment completely set up. So all we need to do now is actually add this environment to Jupyter.
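After the PyTorch install it is worth checking whether PyTorch can actually see the GPU. The sketch below uses torch.cuda.is_available(), and it guards the import so it also runs in an environment where PyTorch is missing. The function name and the three-way return value are assumptions made for this example.

```python
import importlib.util


def cuda_status():
    """Return True/False for CUDA availability, or None if PyTorch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the check works without PyTorch

    return torch.cuda.is_available()


status = cuda_status()
if status is None:
    print("PyTorch is not installed in this environment.")
elif status:
    print("PyTorch can see a CUDA-capable GPU.")
else:
    print("PyTorch is installed but is running on CPU only.")
```

A result of False with an NVIDIA GPU present usually points back to the CUDA install or path setup steps covered earlier.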

So remember at the start we had that little box with Python 3, GCP and ML, so we're going to add a new one called ML environment. To do that we need to install ipykernel, and with ipykernel we are going to register our new environment with Jupyter. We do that by specifying the name of it here, mln, and then we also want to set the display name.

So this is the name that we will see in that box when we go into JupyterLab. This can be anything you want, so for me I'm just going to put ML environment. And we just run that. Okay so that is ready, and now we can just go ahead, switch back to our base environment, which is our default environment, and open up JupyterLab.
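The exact command typed on screen is not spelled out in the transcript. A common form, shown here as an assumption, is python -m ipykernel install with --name and --display-name, run inside the new environment after installing ipykernel (for example with conda install ipykernel). The --user flag, which registers the kernel for the current user only, is also an assumption. The helper below is hypothetical and only assembles and prints the command.

```python
import shlex


def kernel_install_command(env_name, display_name):
    """Assemble (but do not run) an ipykernel registration command."""
    return [
        "python", "-m", "ipykernel", "install",
        "--user",                        # assumption: register for the current user
        "--name", env_name,              # internal kernel name
        "--display-name", display_name,  # label shown in the Jupyter launcher
    ]


command = kernel_install_command("mln", "ML environment")
print(shlex.join(command))
# prints: python -m ipykernel install --user --name mln --display-name 'ML environment'
```

Running jupyter kernelspec list afterwards should show the new kernel alongside the default one.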

And we can see here we now have this other ml environment and this is the one that we just created. So that is it for this short video. I hope it's been useful and I will see you again in the next one. Thanks for watching. Bye.
