
Python Environment Setup for Machine Learning


Chapters

0:00 Intro
1:58 Environment Setup
4:10 Installing Packages
6:59 Installing PyTorch

Whisper Transcript

00:00:00.000 | Hello and welcome to this video on how to set up a Python environment specifically for machine
00:00:06.240 | learning. So this is often an overlooked part of machine learning and there's not that many
00:00:11.280 | tutorials out there on how to do this properly. So I thought it'd be a good idea to just go through
00:00:16.480 | this and explain exactly how I set up my environment. So you can see here we have Jupyter
00:00:22.960 | and the first thing you might notice is that I have these different environments. You have the default
00:00:29.360 | Python 3 base environment and then I also have this GCP which I use for the cloud and then I
00:00:35.360 | also have this one which is my machine learning environment. Now the difference between each of
00:00:42.080 | these is the machine learning environment specifically has packages in Python for machine
00:00:47.840 | learning like TensorFlow, PyTorch, Transformers, Pandas, NumPy. It has all of those packages but
00:00:55.040 | nothing else so there's no excess baggage if you like. So I wanted to just go through and explain
00:01:02.240 | how to actually set this up. So we're going to close this Jupyter notebook.
00:01:07.600 | I'm going to open this new Anaconda prompt here. So I'm assuming that you have already installed
00:01:16.000 | Python and that you are using the Anaconda distribution. So if you are not using this
00:01:23.840 | you can download it from here. You can head over to anaconda.com
00:01:30.640 | and you just click on Products and Individual Edition over here.
00:01:36.480 | And just download. Okay so the installation for Anaconda is pretty simple if you're on Windows.
00:01:45.680 | It's a little different if you're on Linux and I don't know how it is on Mac but generally it's
00:01:52.000 | pretty straightforward and if you do need any help with it you can find out quite quickly.
00:01:56.480 | So once we have that installed we want to go over to our Anaconda prompt and to make sure that we
00:02:06.720 | have installed it correctly we just want to type python --version and this will show us the version of
00:02:13.200 | Python that we have. So I'm at the moment using Python 3.8.3. Just make that a little bigger.
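
The same check can be made from inside Python. This is a minimal sketch using only the standard library; the helper name `python_version_string` is made up for illustration and is not something shown in the video.

```python
import sys


def python_version_string() -> str:
    """Return the running interpreter's version as 'major.minor.micro'."""
    info = sys.version_info
    return f"{info.major}.{info.minor}.{info.micro}"


# Prints the version of the interpreter running this script,
# which should agree with what `python --version` reports at the prompt.
print(python_version_string())
```

If the printed version differs from what the prompt reports, the script is probably being run by a different interpreter than the one on the prompt's PATH.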
00:02:22.400 | And okay if that works that's good. So at the moment we're using the core base environment that
00:02:30.880 | you can see here and that is just the default environment that gets installed whenever you
00:02:37.760 | install the Anaconda distribution. But what we want to do is actually create a new environment
00:02:45.120 | which is our machine learning environment. So to do that we use the syntax like this. So we
00:02:53.760 | type conda create, where conda just refers to Anaconda, then --name. You can write this as --name or
00:03:01.200 | in short form as -n. And then you want to enter your environment name here and then you would also
00:03:09.760 | write python= and your Python version. And at the end of that you would also type anaconda. So
00:03:19.680 | for us I'm going to use an environment name of mln like that and I also want to be installing Python
00:03:30.720 | 3.8. And that should be everything. So we'll just press enter and now conda will work through and
00:03:41.360 | actually install that. So I've already installed mln before but I uninstalled it so it's coming
00:03:50.480 | up with this warning saying it already exists. But I'm going to continue creating the environment
00:03:55.040 | because I want to reinstall it. So I put yes. You shouldn't see that on yours.
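
The create command described above can be assembled as follows. This is a sketch only: `conda_create_command` is a hypothetical helper, and the environment name `mln` and Python version `3.8` are the values used in the video.

```python
def conda_create_command(env_name, python_version, include_anaconda=True):
    """Build the argument list for creating a new conda environment."""
    command = ["conda", "create", "--name", env_name, f"python={python_version}"]
    if include_anaconda:
        # The trailing `anaconda` metapackage pulls in the full Anaconda package set.
        command.append("anaconda")
    return command


# Show the command as it would be typed at the Anaconda prompt.
print(" ".join(conda_create_command("mln", "3.8")))
```

Leaving off the trailing `anaconda` (here `include_anaconda=False`) should give a much smaller environment containing little more than Python itself.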
00:04:01.680 | And then this will take a little bit of time just to get everything together.
00:04:08.080 | Okay so now we are just shown a list of all the packages that will be installed.
00:04:15.120 | So we just want to accept that. So press yes and enter.
00:04:22.480 | And that will go ahead and install all of those. Okay so everything is set up. Now we can
00:04:28.880 | switch over to our new environment. So at the moment we're in base. We can switch over to our
00:04:34.800 | new environment with conda activate and the environment name which in our case is mln.
00:04:40.240 | So let's go ahead and do that.
00:04:49.840 | And now you can see that the name here is switched to mln which is our new environment.
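
To confirm which environment is active without relying on the prompt prefix, a short check like this can help. It assumes that conda's activation step exports the `CONDA_DEFAULT_ENV` variable; the function name `active_conda_env` is illustrative.

```python
import os
import sys


def active_conda_env():
    """Return the active conda environment name, or None if the variable is unset."""
    return os.environ.get("CONDA_DEFAULT_ENV")


print("Active conda environment:", active_conda_env())
# sys.prefix is the root directory of the interpreter running this script,
# so it should point inside the activated environment's folder.
print("Interpreter prefix:", sys.prefix)
```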
00:04:55.200 | Now we just need to install our machine learning packages. So we're going to go ahead and install
00:05:03.040 | the basics. So we have pandas and matplotlib. We're going to install both of those with a
00:05:08.720 | conda install. So we have two options here. We have conda or pip to install our packages.
00:05:14.640 | Generally conda will most likely integrate with your environment better. So it's usually a good
00:05:20.400 | idea to try that first. If that doesn't work then try pip install. So we'll go ahead and
00:05:27.200 | conda install. We're going to do pandas and matplotlib.
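
The "try conda first, then pip" rule can be sketched as a small fallback loop. Everything here is illustrative: `install_attempts`, `install_with_fallback`, and `fake_run` are made-up names, and the demonstration uses a stand-in runner so that nothing is actually installed.

```python
import subprocess


def install_attempts(packages):
    """Return the install commands to try, in order: conda first, then pip."""
    names = list(packages)
    return [
        ["conda", "install", *names],
        ["pip", "install", *names],
    ]


def install_with_fallback(packages, run=subprocess.call):
    """Run each command in order and return the first one that exits with status 0."""
    for command in install_attempts(packages):
        try:
            status = run(command)
        except FileNotFoundError:
            continue  # the tool itself is not on PATH, so move on to the next one
        if status == 0:
            return command
    return None


def fake_run(command):
    # Stand-in runner for the demo: pretend conda fails and pip succeeds.
    return 1 if command[0] == "conda" else 0


print(install_with_fallback(["pandas", "matplotlib"], run=fake_run))
# → ['pip', 'install', 'pandas', 'matplotlib']
```

Passing no `run` argument would call the real tools through `subprocess.call`, which still shows the usual yes/no confirmation from conda.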
00:05:30.800 | Then add anything else here that you feel that you might also need. But this is all we're going
00:05:41.920 | to go with. So it's worth noting that we also need numpy but numpy is included as a
00:05:48.800 | dependency of pandas so we don't need to explicitly mention numpy here. And that will go ahead and
00:05:56.240 | ask us for permission to install the packages that it finds. We click yes and then we go ahead
00:06:03.200 | with the installation again. Now we can go ahead and install TensorFlow, Transformers,
00:06:11.200 | and PyTorch, which are all machine learning frameworks. So TensorFlow we can install
00:06:18.960 | quite easily. All we need to do is conda install tensorflow. Okay so now we have the yes or no
00:06:27.360 | prompt for TensorFlow. Okay so TensorFlow is now installed so we can now go ahead and install Transformers.
00:06:37.360 | So transformers we are going to use pip because conda doesn't recognize the most
00:06:43.040 | recent versions of the transformers library at the time of recording at least.
00:06:48.000 | So we have to use pip to get the most recent versions. So we pip install transformers.
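
After mixing conda and pip installs, it can be useful to see which versions actually landed in the environment. This is a sketch using the standard library's `importlib.metadata` (available from Python 3.8); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Packages installed so far in the walkthrough (numpy arrives as a pandas dependency).
for name in ["pandas", "matplotlib", "numpy", "tensorflow", "transformers"]:
    print(f"{name}: {installed_version(name)}")
```

A value of `None` means that package is not visible to the interpreter running the script, which usually points to the wrong environment being active.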
00:06:56.960 | There we go. And finally we have PyTorch which is slightly more complex but we make it quite easy
00:07:06.880 | by just taking a look at the PyTorch Start Locally guide which you can find here.
00:07:13.280 | So pytorch.org/get-started/locally and all we do is we come down to the Start Locally bit
00:07:23.040 | and we select our PyTorch build. So this is the stable release and this is like a beta release
00:07:30.800 | which gets released more often but it's more likely to have bugs and errors in. So I think most people
00:07:36.560 | will probably want to avoid this. You can choose your OS so for me it's Windows. Package manager
00:07:44.560 | so that is conda, you can also use pip but I would recommend conda because it will
00:07:49.440 | install the dependencies we need as well. We're using Python and then this bottom one here
00:07:57.920 | refers to CUDA. So we use CUDA as the GPU acceleration library so essentially with this
00:08:08.400 | if you have an NVIDIA GPU, CUDA lets you use it to speed up any machine learning tasks that you have
00:08:15.280 | in either PyTorch or TensorFlow. So you can read TensorFlow's GPU setup guide if you do have a GPU
00:08:24.720 | this is quite useful so you just head on down to the bottom here or if you're on Linux this guide
00:08:32.800 | is always quite useful. And then we have the Windows setup here so all you need to do is
00:08:40.000 | install all of these which is reasonably straightforward but there are a lot of good guides out
00:08:48.480 | there if you do need help with it. And then you just head on down and set your paths so that
00:08:55.840 | TensorFlow or PyTorch can actually see CUDA. Another useful guide as well is this NVIDIA CUDA
00:09:03.440 | installation guide which can be quite useful as well. Now I would recommend using CUDA 10.2
00:09:10.400 | at the time of recording, unless you're using the latest RTX 30 series so that is the NVIDIA
00:09:20.080 | GeForce RTX 3090, 3080 and I think it's 3070. So the support for those is a little bit sketchy at
00:09:29.200 | the moment and you will actually need CUDA 11 alongside the nightly builds of PyTorch and
00:09:36.720 | TensorFlow so this is what I mentioned over here. That's a little bit more difficult and I'm not
00:09:43.680 | going to be covering that here but again there are a lot of good guides out there if you do
00:09:48.560 | need help with it. So if you don't have a GPU or you just don't care about GPU acceleration you
00:09:55.520 | just click none and it will change the command down here which we'll be using for our installation.
00:10:02.560 | So I'll be using this command here. So we're doing a conda install
00:10:08.400 | and then we have a few packages not just pytorch here. So pytorch, torchvision,
00:10:21.360 | torchaudio, cudatoolkit, for which we are using 10.2, and then we
00:10:31.920 | set our channel to pytorch as well. Now we can go ahead and install that.
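
The command read out above can be assembled like this. It is a sketch, not the selector's authoritative output: `pytorch_install_command` is a hypothetical helper, and the `cpuonly` package used for the "none" option is an assumption based on what the selector produced around the time of recording. The current command should always be taken from the Start Locally page.

```python
def pytorch_install_command(cuda_version="10.2"):
    """Build the conda install command for PyTorch; pass None for a CPU-only install."""
    command = ["conda", "install", "pytorch", "torchvision", "torchaudio"]
    if cuda_version is None:
        # Assumption: CPU-only builds are selected with the `cpuonly` package.
        command.append("cpuonly")
    else:
        command.append(f"cudatoolkit={cuda_version}")
    # -c sets the channel, here the pytorch channel mentioned in the video.
    command += ["-c", "pytorch"]
    return command


print(" ".join(pytorch_install_command("10.2")))
print(" ".join(pytorch_install_command(None)))
```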
00:10:38.800 | So just select yes again and now that is our environment completely set up.
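
Once the install finishes, a quick check can show whether PyTorch actually sees the GPU. This is a minimal sketch; `cuda_status` is an illustrative name, and the import is guarded so the script still runs in an environment without PyTorch.

```python
import importlib.util


def cuda_status():
    """Report whether PyTorch is installed and whether it can use CUDA."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch  # imported lazily so the check above can run without PyTorch

    return "cuda available" if torch.cuda.is_available() else "cpu only"


print(cuda_status())
```

A result of "cpu only" on a machine with an NVIDIA GPU usually suggests a CPU-only build was installed or that the CUDA setup described earlier is incomplete.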
00:10:47.120 | So all we need to do now is actually add this environment to Jupyter. So remember at the start
00:10:52.240 | we had that little box and we had Python 3, GCP and ml so we're gonna add a new one called ml
00:10:59.040 | environment. So to do that we need to install ipykernel and with that ipykernel we are going
00:11:19.280 | to install our new environment. So we do that by specifying the name of it here mln
00:11:29.200 | and then we also want to set the display name. So this is the name that we will see when we
00:11:34.320 | enter into JupyterLab and we can have that box. So this can be anything you want.
00:11:41.840 | So for me I'm just going to put ml environment.
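
The registration step described above is typically run as `python -m ipykernel install` from inside the new environment, after installing `ipykernel` there (for example with `conda install ipykernel`). The sketch below assembles that command. The helper name `ipykernel_install_command` is made up, and including `--user` is an assumption, since the transcript does not say which extra flags were typed.

```python
def ipykernel_install_command(env_name, display_name, user=True):
    """Build the command that registers an environment as a Jupyter kernel."""
    command = ["python", "-m", "ipykernel", "install"]
    if user:
        # Assumption: register the kernel for the current user only.
        command.append("--user")
    command += ["--name", env_name, "--display-name", display_name]
    return command


# Typed at the prompt, the display name needs quotes because it contains a space:
#   python -m ipykernel install --user --name mln --display-name "ml environment"
print(ipykernel_install_command("mln", "ml environment"))
```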
00:11:53.120 | And we just run that again. Okay so that is ready and now we can just
00:11:59.360 | go ahead switch back to our base environment which is our default environment
00:12:05.120 | and now just open up JupyterLab.
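
If the new kernel does not show up in JupyterLab, the registered kernels can be listed from Python. This sketch assumes the `jupyter_client` package is available (it normally ships alongside Jupyter); `registered_kernels` is an illustrative name.

```python
def registered_kernels():
    """Return the sorted names of registered Jupyter kernels, or None without jupyter_client."""
    try:
        from jupyter_client.kernelspec import KernelSpecManager
    except ImportError:
        return None
    # find_kernel_specs() maps each kernel name to its resource directory.
    return sorted(KernelSpecManager().find_kernel_specs())


print(registered_kernels())
```

The names listed are the values given to `--name` at registration, not the display names shown in the JupyterLab launcher.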
00:12:16.080 | And we can see here we now have this other ml environment and this is the one that we just
00:12:23.120 | created. So that is it for this short video. I hope it's been useful and I will see you again
00:12:30.400 | in the next one. Thanks for watching. Bye.