Distribution-agnostic Python project packaging

Hi all! I've heard some people say it's rude to post on a mailing list without introducing yourself, so here goes: my name is James Pic and I've been developing and deploying a wide variety of Python projects for the last 8 years. I love to learn and share, and to write documentation, amongst other things such as selling liquor.

The way I've been deploying Python projects so far is probably similar to what a lot of people do, and it almost always includes building runtime dependencies on the production server. Nobody is going to congratulate me for that, for sure, but I think a lot of us have been doing so. Now, I'm fully aware of distribution-specific packaging solutions like dh-virtualenv shared by Spotify, but here's my mental problem: I love to learn and to hack. I'm always trying new distributions, and I rarely run the one that's in production in my company. When I'm deploying personal projects, I like fun distributions like Arch or Alpine Linux, or interesting PaaS solutions such as Cloud Foundry, OpenShift, Rancher and many others.

And I'm always facing the same problem: I have to either build runtime dependencies on the server, or package my thing in the platform-specific way. I feel like I've spent a really huge amount of time doing this kind of thing. But the Java people, they have jars, and they have smooth deployments no matter where they deploy. So that's the idea I'm trying to share: I'd like to be able to build a file with my dependencies and my project in it. I'm not sure packaging only Python bytecode would work here, because of C modules.
Also, I'm always developing against a different Python version, because I'm using different distributions, because it's part of my passions in life. As ridiculous as that may sound to most people, I'm expecting at least some understanding from this list :)

So I wonder: do you think the best solution for me would be to build an ELF binary with my Python and dependencies, that I could just run on any distribution given it's on the right architecture? Note that I like to use ARM too, so I know I'd need to be able to cross-compile as well.

Thanks a lot for reading, and if you can take some time to share your thoughts, or even better, point me in a direction: if this idea is the right solution and I'm going to be the only one interested, I don't care if it takes me years to achieve. Thanks a heap!

Best regards

PS: I'm currently at the OpenStack summit in Barcelona if anybody there would like to talk about it in person, in which case I'll buy you the drinks ;)

On Thu, Oct 27, 2016 at 11:50 PM, James Pic <jamespic@gmail.com> wrote:
In theory, you could do that. You'd have to include *all* of Python, and all of everything else you might depend on, because you can't be sure what is and isn't available, so you might as well ship your app as a VM image or something, with an entire operating system.

In practice, you're probably going to need to deal with some sort of package manager, and that's where the difficulties start. You can probably cover most of the Debian-based Linuxes by targeting either Debian Stable or Ubuntu LTS and creating a .deb file that specifies what versions of various libraries you need. There's probably a way to aim an RPM build that will work on RHEL, Fedora, SUSE, etc, but I'm not familiar with that family tree and where their library versions tend to sit.

The trouble is that as soon as you land on an OS version that's too far distant from the one you built on, stuff will break. Between the bleeding-edge rolling distros and the super-stable ones could be over a decade of development (RHEL 4, released in 2005, is still supported). What you can probably do is ignore the absolute bleeding edge (anyone who's working on Debian Unstable or Arch or Crunchbang is probably aware of the issues and can solve them), and then decide how far back you support by looking at what you depend on, probably losing the very oldest of distributions. It should be possible to hit 95% of Linuxes out there by providing one .deb and one .rpm (per architecture, if you support multiple), but don't quote me on that figure.

Unfortunately, the problem you're facing is virtually unsolvable, simply because the freedom of open systems means there is a LOT of variation out there. But most people on the outskirts are accustomed to doing their own dependency management (like when I used to work primarily on OS/2 - nobody supports it much, so you support it yourself).

With all sincerity I say to you, good luck. Try not to lose the enthusiasm that I'm hearing from you at the moment!

ChrisA

On Thu, Oct 27, 2016 at 8:50 AM, James Pic <jamespic@gmail.com> wrote:
Are you sure this is really what you need to do? With dependency handling, you can define the dependencies of your project and they will automatically get installed from PyPI when the user tries to install the package (if they aren't already installed). manylinux wheels [1] allow you to distribute your own code in a manner that is compatible with most Linux distributions, and many C-based projects now offer such wheels. Assuming your dependencies have version-agnostic wheels (either manylinux or pure Python), what would be the advantage to you of putting everything together in a single file?

That being said, I suppose it would be possible to create your own manylinux wheels that include all the necessary dependencies, but that would make building more difficult and opens up the possibility that the installed modules will conflict with users' existing installed packages. Another possibility would be to use docker to create a container [2] that includes everything you need to run the code in an isolated environment that won't conflict.

[1] https://github.com/pypa/manylinux
[2] https://www.digitalocean.com/community/tutorials/docker-explained-how-to-con...
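As a concrete sketch of the dependency-handling approach described above, a minimal setup.py might declare its requirements so that pip resolves them from PyPI at install time (the project name, version, and pins here are purely illustrative assumptions, not from the thread):

```python
# Hypothetical minimal setup.py sketch: pip reads install_requires and
# fetches anything missing from PyPI when the package is installed.
from setuptools import setup, find_packages

setup(
    name="myproject",          # illustrative project name
    version="0.1.0",
    packages=find_packages(),
    install_requires=[
        "requests>=2.0",       # pulled from PyPI at install time if absent
    ],
)
```

This is packaging configuration rather than a runnable script; it takes effect when invoked through pip or `python setup.py` with a build command.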

OT, but.... Assuming your dependencies have version agnostic wheels (either manylinux
which is why conda exists -- conda can package up many things besides Python packages, and you WILL need things besides Python packages. Build conda packages for everything you need, then create an environment.yaml file, and you can create a consistent environment very easily across systems (even across OSs). There is even a project (I forget what it's called -- "collections" maybe?) that bundles up a bunch of conda packages for you. And with conda-forge, there are an awful lot of packages already good to go. If you need even more control over your environment, then Docker is the way to go -- you can even use conda inside docker... -CHB
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception
Chris.Barker@noaa.gov
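A minimal sketch of the kind of environment.yaml file described above (the environment name, channel, and package list are illustrative assumptions); conda can recreate the environment on another machine with `conda env create -f environment.yaml`:

```yaml
# hypothetical environment spec; names and versions are illustrative
name: myproject
channels:
  - conda-forge
dependencies:
  - python=3.5
  - numpy
  - requests
  - libxml2        # a non-Python dependency, which conda can also manage
```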

On Thu, Oct 27, 2016 at 02:50:52PM +0200, James Pic wrote:
Your question is off-topic for this list. This list is for proposing new features for the Python language, but you don't seem to be proposing anything new. To ask for advice on using Python (including things like packaging dependencies), you probably should ask on the Python-List mailing list, also available on usenet as comp.lang.python. There may be some other dedicated mailing lists that specialise in packaging questions; check the mailing list server here: https://mail.python.org/mailman/listinfo

I can't really help you with your question, except to point you in the direction of a little-known feature of Python: zip file application support:

https://www.python.org/dev/peps/pep-0441/
https://docs.python.org/3/library/zipapp.html

-- Steve
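As a quick sketch of the zipapp feature Steve points to, the standard library can bundle a package directory into a single runnable .pyz archive (the tiny one-file app below is made up for illustration):

```python
# Build a minimal .pyz with the stdlib zipapp module, then run it.
import pathlib
import subprocess
import sys
import tempfile
import zipapp

with tempfile.TemporaryDirectory() as tmp:
    # zipapp expects a directory whose __main__.py is the entry point
    src = pathlib.Path(tmp, "app")
    src.mkdir()
    (src / "__main__.py").write_text("print('hello from a .pyz')\n")

    pyz = pathlib.Path(tmp, "app.pyz")
    zipapp.create_archive(src, pyz)

    # the archive runs under any compatible interpreter
    out = subprocess.run([sys.executable, str(pyz)],
                         capture_output=True, text=True)
    print(out.stdout.strip())  # → hello from a .pyz
```

Note this only solves distribution for pure-Python code; as discussed elsewhere in the thread, C extension modules need more machinery.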

On Oct 27, 2016, at 02:50 PM, James Pic wrote:
You might want to look at the Snap ecosystem. It's fairly new, but it is cross-distro and cross-arch, and in many ways a very cool way to build self-contained applications where you control all the dependencies. You don't have to worry so much about each distribution's peculiarities, and Python gets first-class support[*]. There are lots of technical and philosophical aspects to Snaps that are off-topic for this mailing list, so I'll just point you to where you can explore it on your own: http://snapcraft.io/

Disclosure: I work for Canonical in my day job, which invented the technology, but it is in very large part an open source community project.

Cheers,
-Barry

[*] In fact, the nice convenience front-end to building snaps is a Python 3 application.

On 27 October 2016 at 22:50, James Pic <jamespic@gmail.com> wrote:
You're right that this is a common problem, but it also isn't a language level problem - it's a software publication and distribution one, and for the Python community, the folks most actively involved in driving and/or popularising improvements in that space are those running packaging.python.org. While there's a fair bit of overlap between the two lists, the main home for those discussions is over on distutils-sig: https://mail.python.org/mailman/listinfo/distutils-sig (so called due to the standard library module that provided Python's original project-agnostic interface for building extension modules)
If you're not using C extensions (the closest Python equivalent to the typical jar use case), then ``zipapp`` should have you covered: https://docs.python.org/3/library/zipapp.html While the zipapp module itself is relatively new, the underlying interpreter and import system capabilities that it relies on have been around since Python 2.6.
For extension modules, you're facing a much harder problem than doing the same for pure Python code (where you can just use zipapp). However, engineering folks at Twitter put together pex, which gives you a full virtual environment to play with on the target system, and hence can handle extension modules as well: https://pex.readthedocs.io/en/stable/ The only runtime dependency pex places on the target system is having a Python runtime available.

More generally, one of the major problems we have in this area at the moment is that a lot of the relevant information is just plain hard for people to find, so if this is an area you're interested in, then https://github.com/pypa/python-packaging-user-guide/issues/267 is aiming to pull together some of the currently available information into a more readily consumable form and is mainly waiting on a draft PR that attempts to make the existing content more discoverable.

Regards,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
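A hedged sketch of how pex is typically invoked from the command line (the dependency name and output path below are illustrative assumptions, not something stated in the thread, and the commands need network access to PyPI):

```shell
# install the pex build tool itself (illustrative)
pip install pex

# bundle the hypothetical "requests" dependency into one runnable file
pex requests -o app.pex

# a .pex built without an entry point behaves like a Python interpreter
# with the bundled dependencies importable
./app.pex -c "import requests; print(requests.__version__)"
```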

Hi all,

Please let me thank you for sharing some of your insight and passion, and for your tolerance. I'm sorry, I thought this would be the best mailing list on which to go ahead and bluntly propose having something like jars in Python core. It's really great to see such a variety of solutions, and I've been trying to study some of them. The container-image-based solutions using LXD are definitely somewhere we're going in the long term, and conda seems like a good replacement for virtualenv in production.

Best regards,
James, from Angoulême B)

PS: FTR, pyinstaller seems widely used by a part of our community, but wasn't represented in this thread; perhaps this can give some good food for thought to our devops community too :) http://www.pyinstaller.org/

participants (7)
- Barry Warsaw
- Chris Angelico
- Chris Barker
- James Pic
- Nick Coghlan
- Steven D'Aprano
- Todd