How to Build Python Packages for Pip


Chapters

0:00
1:33 Directory Structure
5:17 Classifiers
8:07 License
14:22 Open Binary

Whisper Transcript | Transcript Only Page

00:00:00.000 | Okay, so we're going to work through actually building a Python package that we can install
00:00:06.400 | through pip. So when we're installing a package through pip, we're actually installing a package
00:00:12.960 | from this website here, the Python package index. So if we just search for something we all know,
00:00:19.440 | pandas, we'll see here that we actually have all these different pandas projects. Now the reason
00:00:25.680 | there's so many of these is because loads of people upload projects that say pandas in the
00:00:31.360 | name as you can see. The one that we use and we know is this pandas 1.2.3 and we head on here
00:00:38.800 | and we can actually see all of the information about that package. Now what we want to do is
00:00:45.360 | actually build our own package and upload it to this so that we can pip install it and build it
00:00:50.480 | in a way that other people can use it without any issues. So to start I am going to use this package
00:00:58.560 | that I put together quickly for this and what it does is it simply generates an ASCII visual based
00:01:08.080 | on a few images that I've included in here. So these synthwave visuals and all this package does
00:01:18.000 | is you initialize it and you press generate and it will generate ASCII versions of these pictures at
00:01:26.320 | random. So it's pretty useless but it's a good example. Now the first thing we want to look at
00:01:32.800 | here is the directory structure. So we have this aesthetic ascii directory which is like our
00:01:39.600 | project directory, we can name this as we want and then inside here we actually have the package
00:01:45.360 | directory. So this is where we put all of our code and in here we have this __init__.py file.
00:01:52.400 | Now this is very important: we need to have these files in order for our directories to be recognized
00:01:58.880 | as python modules or python packages. And then inside here we also have this synthesize.py file
00:02:06.560 | and this is where all the code is going for me, and this creates a module within the package
00:02:11.760 | that we can import. Now you notice here my __init__.py file is empty and this is something you'll see
00:02:16.960 | quite a lot. It's just there to tell our package builder which directories are parts of our python
00:02:23.600 | package or python subpackages. And then of course we have the readme file here. So we can summarize
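As a rough sketch of the layout being described, the script below builds a throwaway copy of the project in a temporary directory and imports from it. The directory names `aesthetic-ascii` and `aesthetic_ascii` and the body of `synthesize.py` are assumptions for illustration; the transcript does not show the exact spelling or the real code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the project directory shown in the video.
project = Path(tempfile.mkdtemp()) / "aesthetic-ascii"
package = project / "aesthetic_ascii"  # assumed package directory name
package.mkdir(parents=True)

(project / "README.md").write_text("# aesthetic-ascii\n")
(package / "__init__.py").write_text("")  # empty, as in the video
(package / "synthesize.py").write_text(   # placeholder module body
    "def generate():\n    return 'ascii art goes here'\n"
)

# Print the resulting tree, relative to the temporary directory.
for path in sorted(project.rglob("*")):
    print(path.relative_to(project.parent))

# With the project directory on sys.path, the package can be imported.
sys.path.insert(0, str(project))
importlib.invalidate_caches()
from aesthetic_ascii import synthesize

print(synthesize.generate())
```

The `from aesthetic_ascii import synthesize` line has the same shape as the import the video goes on to describe: the package directory name comes after `from` and the module name after `import`.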
00:02:31.120 | this like so: on our top level we have the project directory, and then we have the readme file
00:02:38.480 | and then we have our code directory. Inside there we need to have the __init__.py file and then in this
00:02:44.240 | example we have two modules. We have module a and module b. Now when it comes to actually importing
00:02:51.520 | one of these modules and using it in our code what we would do is we'd write something like this. So
00:02:56.960 | we'd have from and in this case it's called code directory. Obviously we wouldn't call our package
00:03:07.040 | code directory unless you really want to and then we would import a for the a module or b
00:03:17.520 | for the b module. So in our case this is going to be replaced with aesthetic ascii and this is
00:03:26.560 | going to be replaced with synthesize and we'll come to actually testing that out later. But for now
00:03:32.560 | let's just work through the other files that we need in our directory. I don't want to focus
00:03:37.520 | on the code so much here. So we have our readme file and we have our code directory. The other
00:03:43.440 | files that we need are all configuration setup files and the most important of those for setting
00:03:48.880 | up our package configuration is the setup.cfg file. So that looks like this, and if we go ahead
00:04:01.200 | and actually open this file, we'll need to create it here. So setup.cfg, and there are
00:04:10.800 | a few parameters in here that we'll need to add. Now for this project we will be using these. So
00:04:16.800 | we have metadata which is just the description or the data explaining what our package is actually
00:04:25.200 | doing. So this sort of thing will be displayed on the python package index and people will be able
00:04:31.920 | to see the name of it, the version, the author. They'll be able to get in touch with you through
00:04:36.160 | your email and they'll also find a link to your website. So for me obviously I'm just using github
00:04:43.120 | here and all of those things are pretty self-explanatory but there are a few that I think
00:04:48.800 | are a bit more abstract. So first we have the long description and you can write something here
00:04:55.200 | but it's a long description so I find it's better to include a file and you include your readme
00:05:01.520 | file. Okay and then this will display on the front page of your python package index site.
00:05:07.600 | And then we also need to explain that our long description is a Markdown text file so that
00:05:15.200 | it is processed correctly. Next we have our classifiers and these are kind of like types.
00:05:20.240 | So we're specifying the programming language that we're using: it's python, it's python 3.
00:05:25.520 | We are outlining the license that we're using and for that we have this OSI Approved and we're using
00:05:32.080 | an MIT license which we'll cover pretty soon. And I want to specify the operating system and for us
00:05:38.080 | it's independent: you don't have to have a specific operating system to run this. So that's just the
00:05:44.000 | metadata behind our package and then down here we have some of the more important things for
00:05:50.800 | actually building a package. So here we're specifying the packages that will be included
00:05:56.000 | as dependencies and we use this find here and what this will do is go through our code here
00:06:03.520 | and find any package dependencies that we have. And this is provided by setuptools which is what
00:06:11.680 | we're using and I'll explain that in a moment. We also want to specify that we're using python
00:06:16.880 | and we require 3.7 or greater, and the reason that we need 3.7 or later is because
00:06:25.440 | we're using a package that came with python 3.7 in order to include all of our resources over here.
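A setup.cfg along the lines described might look like the text held in `SETUP_CFG` below. The name, version, author, email, description and URL values are placeholders, not the author's actual metadata. The sketch parses the text with the standard library `configparser` only to show that the sections and keys are well formed. One point worth noting: in setuptools, `packages = find:` discovers the project's own packages (directories containing `__init__.py`); third-party dependencies are declared separately under `install_requires`.

```python
import configparser

# Illustrative values only; the metadata fields are placeholders.
SETUP_CFG = """\
[metadata]
name = aesthetic-ascii
version = 0.0.1
author = Your Name
author_email = you@example.com
description = Generate ASCII versions of synthwave images
long_description = file: README.md
long_description_content_type = text/markdown
url = https://github.com/your-user/aesthetic-ascii
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License
    Operating System :: OS Independent

[options]
packages = find:
python_requires = >=3.7
include_package_data = True
"""

parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG)

# Show the build-related options discussed in the video.
for key in ("packages", "python_requires", "include_package_data"):
    print(key, "=", parser["options"][key])

# Classifiers are stored as one multi-line value, one classifier per line.
classifiers = parser["metadata"]["classifiers"].split("\n")
print([c for c in classifiers if c])
```

In a real project this text would be saved as `setup.cfg` next to the README rather than held in a string.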
00:06:32.480 | So that's our dependency on that version. And then here we're including package data which again
00:06:40.960 | we are including because we have all these extra files that are not code files that we need to have
00:06:45.680 | included within our package and that's why we've added that in there. So that's our setup.cfg
00:06:52.480 | file and alongside that we also need a pyproject.toml file. And I mentioned before in the
00:07:07.440 | setup.cfg that we're using setuptools, and in order to tell our package builder, which we'll use
00:07:15.120 | later, to use setuptools in the building process we need to specify this pyproject.toml file.
00:07:21.840 | So we'll create that file here pyproject.toml and this is the code that we'll need for it.
00:07:32.320 | So we're saying here for the build system it will require setuptools, which is the library
00:07:38.320 | we use to actually set up and process the setup.cfg file, and then wheel. And then
00:07:44.400 | here we're just specifying the build process that we'll be using from setuptools which is build_meta.
00:07:50.160 | Okay and that's all we need and that is just read by our builder and it tells it how to build
00:07:56.240 | our package which essentially tells it to use setuptools and process this and follow the
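The pyproject.toml being described is short. The sketch below holds a plausible version of it in a string, writes it into a temporary directory, and, where the standard library `tomllib` is available (Python 3.11 and later), parses it to confirm the build backend. The exact contents shown in the video are not visible in the transcript, so treat the text as an assumption based on the spoken description.

```python
import tempfile
from pathlib import Path

try:
    import tomllib  # standard library from Python 3.11 onward
except ModuleNotFoundError:
    tomllib = None

# Assumed contents, reconstructed from the spoken description.
PYPROJECT_TOML = """\
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"
"""

# Write the file into a throwaway project directory.
project_dir = Path(tempfile.mkdtemp())
pyproject_path = project_dir / "pyproject.toml"
pyproject_path.write_text(PYPROJECT_TOML)
print(pyproject_path.read_text())

# Optionally parse it to confirm which backend the builder will use.
if tomllib is not None:
    parsed = tomllib.loads(PYPROJECT_TOML)
    print(parsed["build-system"]["build-backend"])
```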
00:08:02.720 | instructions that this file is giving you. And next up is our license. Now the license is quite
00:08:12.800 | important and it's very important that we actually include a license file because if we don't include
00:08:20.000 | a license file we can get into a lot of issues in terms of people not knowing if they're actually
00:08:25.520 | allowed to use our code or not. And especially if you have people contributing to your project
00:08:32.240 | if they contribute and you still don't have a license file then it's quite odd because legally
00:08:38.800 | you're in a sort of gray zone as to whether you can actually use the package that you built. So
00:08:43.840 | a license is very important to include but it's very very easy to actually set that up. So all
00:08:51.440 | we need to actually do is we go on to this website called choosealicense.com and we just select what
00:08:58.640 | we want for our license. So for me I want it to be simple and permissive so anyone can use it
00:09:03.920 | and it gives you a license here to just copy and paste across. So I'm using the MIT license,
00:09:11.840 | go back into the project and we just create a new file and it's called LICENSE, like that.
00:09:19.840 | And we just copy that across, change this so it's 2021 James Brooks and that's the license
00:09:28.960 | sorted so it's pretty easy. Now this is all great and we have our code but there is one modification
00:09:36.000 | to our code we need to make in order for this to work. So in our code at the moment we rely on
00:09:44.240 | reading these images which are in the same folder at the moment. Now of course we could have put
00:09:50.000 | these in a different folder, we probably would normally, but when we are building a package
00:09:55.120 | we need any resources that we want to include in that package to be contained within a package
00:10:01.600 | directory which includes an __init__.py file. So that's why they're all included in here. Now we have a
00:10:08.560 | few images and we also have this font file which is a TrueType font, and what we do down here
00:10:16.640 | usually is that we use the PIL library and we just open the image like this. So we're selecting
00:10:23.120 | one at random, so it's one of these at random and we're just opening that and it's pretty
00:10:28.640 | straightforward. Now when we build a package this won't work anymore because this is a relative path
00:10:37.120 | so you will almost certainly get issues. So what we instead need to do is include
00:10:44.080 | these files within our package. So we've already done part of that, we've included them in the directory with this __init__.py
00:10:51.440 | file, but there is one more step: we need to include another file here which is called MANIFEST.in
00:11:01.760 | and this MANIFEST.in file needs to include any resources that are not code files that we would
00:11:09.520 | like to include within our package. So to do that we need to write include and then here we give the
00:11:17.920 | name of our python package or subpackage, so it's aesthetic ascii, and this is referring to this
00:11:26.320 | directory name here, and inside there we want to include all, so it's a wildcard for all files that
00:11:33.040 | end with png which are our image files. Now you will notice that we also have some jpeg files as well
00:11:40.640 | so we'll just copy that and we'll add jpeg as well. Now that's our image files but we also have our
00:11:49.040 | TrueType font file so we add that in as well and that's our manifest file. Now back in the config
00:11:58.240 | we know that we've included the manifest file through this include package data at the
00:12:04.720 | bottom so if this was false this would not be read in and our extra resources would not be included.
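The manifest described here is three `include` lines. The sketch below holds them in a string and uses `fnmatch` to show which files such patterns would pick up. The package directory name `aesthetic_ascii`, the `.jpg` extension (the video only says "jpeg"), and all of the file names are assumptions. `fnmatch` only approximates how the manifest patterns are matched, but it behaves the same way for flat file names like these.

```python
from fnmatch import fnmatch

# Assumed manifest contents, reconstructed from the spoken description.
MANIFEST_IN = """\
include aesthetic_ascii/*.png
include aesthetic_ascii/*.jpg
include aesthetic_ascii/*.ttf
"""

# Pull the glob pattern out of each "include <pattern>" line.
patterns = [line.split(maxsplit=1)[1] for line in MANIFEST_IN.splitlines()]

# Hypothetical files inside the package directory.
files = [
    "aesthetic_ascii/sunset.png",
    "aesthetic_ascii/city.jpg",
    "aesthetic_ascii/font.ttf",
    "aesthetic_ascii/synthesize.py",
]

# Keep the files that match at least one manifest pattern.
matched = [f for f in files if any(fnmatch(f, p) for p in patterns)]
for name in matched:
    print(name)
```

The `.py` file is not listed by the manifest because code files are already picked up as part of the package itself.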
00:12:10.560 | With it being true it knows to expect a manifest file and then it reads this and includes these
00:12:15.840 | within our package. So that's great and then in our code we need to use a different approach to
00:12:22.720 | actually opening these images or other files and to actually do that we need two different
00:12:29.760 | libraries. In this case we need importlib and from importlib we need to import resources
00:12:40.560 | and alongside importlib.resources we also need to import the io library.
00:12:46.080 | So this importlib.resources is specifically for reading in resources or files that we've
00:12:54.240 | included within a package. The io library is for converting the bytes that we will read when we
00:13:00.560 | import the resources into readable file-like objects because the code that we use down here
00:13:08.000 | reads our images and font from file so this allows us to pretend that the bytes that we've read into
00:13:14.560 | our python code is actually a file rather than a set of bytes. So with that this code here this line
00:13:22.880 | instead becomes this which is a little more complicated but this is just what we have to use.
00:13:29.440 | So here we're using the resources that we imported and we're using open_binary. It's
00:13:35.360 | in the aesthetic ascii package so we need to include that, that is the directory name here,
00:13:40.640 | and then this time we're using the same thing again we're just using the image and we're
00:13:45.360 | selecting one at random and then we're just importing or reading that like with any other
00:13:50.320 | file and then when we use this we would expect a file path to be passed to this function so we
00:13:56.240 | imitate a file path by using this io.BytesIO function and passing the image binary to that.
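The pattern just described can be tried without the real package. This sketch creates a throwaway package called `demo_assets` (a hypothetical name) containing a placeholder binary file, reads it with `resources.open_binary`, and wraps the bytes in `io.BytesIO`. In the video the package is the aesthetic ascii one and the wrapped bytes are handed to PIL, which accepts file-like objects; PIL is left out here so the example runs on the standard library alone.

```python
import importlib
import io
import sys
import tempfile
from importlib import resources
from pathlib import Path

# Build a throwaway package so the example runs anywhere.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_assets"  # hypothetical package name
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "image.bin").write_bytes(b"\x89PNG fake image bytes")  # placeholder data

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Read the resource as raw bytes from inside the package.
with resources.open_binary("demo_assets", "image.bin") as fp:
    data = fp.read()

# Wrap the bytes so code that expects a file can read from them.
file_like = io.BytesIO(data)
print(file_like.read(4))
```

On Python 3.9 and later, `resources.files("demo_assets").joinpath("image.bin").read_bytes()` is an alternative way to get the same bytes.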
00:14:03.520 | Now that's our images, that's sorted, and then we also have our font down here as well. Okay, so at
00:14:10.560 | the moment what we have is our font path and this is just reading from file, so we've got this
00:14:16.880 | ImageFont.truetype and then we're reading it in there. Now we're going to be using the same
00:14:21.520 | thing again, we're going to be using the open_binary function. So we do with resources.open_binary,
00:14:30.800 | again we include our aesthetic ascii package name and we also include the font path
00:14:38.080 | which we've just defined there, and we again read that in just like we did before. So we'll call it
00:14:44.880 | font, we do fp.read, and then we need to initialize that font. And what we're doing here was reading
00:14:51.840 | from path before, so again we need to use the io library to imitate that, so we do io.BytesIO
00:15:01.280 | and font, and with that all of that will now work. So that's everything set up, and what we do once
00:15:10.320 | we have it all set up is we first test that it works locally. So the first thing we need to do is
00:15:17.200 | navigate to that directory, so for me it's in documents, projects, test, aesthetic ascii,
00:15:28.880 | and once you're in there we can just do pip install . and this will do a local pip install
00:15:36.080 | from the current directory. Okay, so you can see that that worked, that's great. And then here you
00:15:42.480 | probably want to just test your new imported package, so for example for me I would go from
00:15:51.840 | aesthetic ascii import synthesize and then I would go through the code for actually testing.
00:16:01.920 | I'm not going to do it here, but as long as that all works well then we should be good to go. So
00:16:09.760 | at this point we do python -m build, and this package, this module, will actually build our
00:16:18.400 | package, and what you will find in the directory here is that we'll get a new directory called
00:16:23.920 | dist, which means distribution, and this will contain two files. So we can have a look: at the
00:16:31.200 | moment we have the tar file and now we have the wheel file as well. Now these are two files that
00:16:38.080 | are going to be uploaded to the python package index and they are our package. So after we've done
00:16:44.960 | that we need to upload our code. The first thing we do before we upload it straight to the python
00:16:50.960 | package index is we upload it to the test version of the python package index, just to make sure
00:16:56.400 | that everything is working, and we find that over at test.pypi.org. And of course you will need to
00:17:05.360 | register here, so you go through register, enter everything you need. You will need your username
00:17:10.240 | and password when you're running through the upload process in your prompt window,
00:17:15.200 | so make sure you have that set up. And we'll also need to pip install twine, which is a package that
00:17:25.280 | will allow us to upload everything to the python package index. Now I already have it installed
00:17:30.320 | so I won't go ahead and do that. Instead what I'll do is python -m twine upload --repository
00:17:39.040 | testpypi, and here we need to pass everything within our distribution folder,
00:17:49.920 | and we run that and everything will be uploaded to our python package index. Once we've done that
00:17:58.320 | we want to actually install our package, but from the test python package index. So to do that,
00:18:07.520 | first, because we already installed our package, we actually want to uninstall it,
00:18:12.480 | and just uninstall that, and then we can install it again but through the test package index. So to do
00:18:26.400 | that we just pip install and then we set the install index to the test python package index,
00:18:36.400 | so we just do test.pypi.org, so that's simple, and then we specify our actual package name
00:18:46.320 | which is aesthetic ascii.
00:18:49.200 | And we should find that that will actually run and install, and then at this point this is where
00:18:58.400 | we want to test again. Now if you are using resources and everything with your package,
00:19:03.840 | that is something to definitely check, that everything is being read correctly and it's
00:19:07.520 | just functioning. Now once you're happy that everything is working correctly, this is where
00:19:13.920 | we finally upload it to the actual python package index. Now this is the same process again,
00:19:20.400 | so we do python -m twine, we are using upload, and we also type --repository.
00:19:30.640 | Now this time instead of writing testpypi we write pypi, and again we use our distribution.
00:19:38.960 | We execute that, it'll ask you to log in, and then we're done and we'll be able to see
00:19:44.000 | our python package over here. So we'll be able to search for it, or you can just come over here and,
00:19:51.760 | if you've logged in, click on your projects and you'll find here we have our actual package.
00:19:58.480 | Now I can view that and it will come up with the readme file. On the side here we have all of our
00:20:04.640 | metadata, so we have the MIT license, the author, and that includes an email link there as well,
00:20:10.960 | which python version it requires, the language and so on. We can also click on the home page
00:20:17.520 | which will take us through to the github repository. So that's all set up as we would
00:20:23.760 | want it to, which is pretty cool, and most importantly we can actually just pip install
00:20:30.080 | it and start using it. So first I'll just uninstall,
00:20:34.240 | and now all we need to do is pip install our package and it's that easy. So that's everything
00:20:50.480 | that you need to know for actually building your own python package. It's a really cool
00:20:54.880 | thing to be able to do and I would definitely recommend learning how to do this and building
00:21:00.720 | your own packages. So thank you very much for watching and I will see you again in the next one.