Proposal: using /etc/os-release in the "platform tag" definition for wheel files
We've discussed the idea of changing the wheel file naming scheme to deal with Linux previously, but never put together a concrete proposal. The closest we've got is the idea of allowing the platform tag to be customised in pip and perhaps bdist_wheel, and while that's good from an "enabling experimentation" perspective, it may be overkill if the primary goal is just to better support handling of Linux distros.
For starters, here's the current definition of the platform tag in PEP 425:
=================
The platform tag is simply distutils.util.get_platform() with all hyphens - and periods . replaced with underscore _ .
* win32
* linux_i386
* linux_x86_64
=================
Here's my proposed change:
=================
The default platform tag is distutils.util.get_platform() with all hyphens - and periods . replaced with underscore _ .
If /etc/os-release [N] exists on the system, then the values in the 'ID' and 'VERSION_ID' fields are read, all hyphens - and periods . replaced with underscore _ , and the results appended to the default tag after a separating underscore.
Examples:
* win32
* macosx_10_6_intel
* linux_x86_64_fedora_20
* linux_x86_64_rhel_7_0
* linux_x86_64_debian_7_0
* linux_x86_64_ubuntu_14_04
=================
The [N] reference would then be a reference to http://www.freedesktop.org/software/systemd/man/os-release.html for the definition of the format of os-release. (Note that while the format originated with systemd, plenty of distros have also started providing it regardless of which init system they use)
Now, this slightly overspecifies on the *consumer* side. A binary wheel that works on "rhel_7_0" for example, should almost certainly work on "rhel_7_1". However, that can be addressed on the tooling side (e.g. permitting the specification of "additional compatible platforms" when invoking pip), rather than needing to be in the specification.
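To make the proposed rule concrete, here is a minimal sketch of the tag calculation. The function names (parse_os_release, normalise, platform_tag) are hypothetical, the base platform string is passed in rather than read from distutils.util.get_platform(), and the os-release parsing is deliberately simplified (it strips surrounding quotes but ignores shell escapes). Skipping a missing ID or VERSION_ID field is an assumption; the proposal text does not say what should happen in that case.

```python
import re


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict.

    Simplified: strips surrounding quotes but does not handle shell escapes.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip().strip("\"'")
    return fields


def normalise(value):
    """Replace hyphens and periods with underscores, as the tag rule requires."""
    return re.sub(r"[-.]", "_", value)


def platform_tag(base_platform, os_release_text=None):
    """Build the proposed tag from a base platform string and os-release text.

    base_platform stands in for the result of distutils.util.get_platform();
    os_release_text is None when /etc/os-release does not exist.
    """
    tag = normalise(base_platform)
    if os_release_text is None:
        # No os-release file: fall back to the default (generic) tag.
        return tag
    fields = parse_os_release(os_release_text)
    # Assumption: a missing field is skipped rather than treated as an error.
    extras = [normalise(fields[k]) for k in ("ID", "VERSION_ID") if fields.get(k)]
    return "_".join([tag] + extras)
```

Called with "linux-x86_64" and the contents of an os-release file that sets ID=fedora and VERSION_ID=20, this yields the "linux_x86_64_fedora_20" form from the examples; without os-release text it returns the plain default tag.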
This also won't help with older Linux distros that don't offer /etc/os-release, but I'm OK with that - those can just continue to show up as "linux_x86_64", and PyPI can continue to disallow those uploads.
Cheers, Nick.
-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Fri, 28 Nov 2014 16:03:59 +1000 Nick Coghlan wrote:
Here's my proposed change:
=================
The default platform tag is distutils.util.get_platform() with all hyphens - and periods . replaced with underscore _ .
If /etc/os-release [N] exists on the system, then the values in the 'ID' and 'VERSION_ID' fields are read, all hyphens - and periods . replaced with underscore _ , and the results appended to the default tag after a separating underscore.
Examples:
* win32
* macosx_10_6_intel
* linux_x86_64_fedora_20
* linux_x86_64_rhel_7_0
* linux_x86_64_debian_7_0
* linux_x86_64_ubuntu_14_04
=================
Is this not going to be a slippery slope?
Now, this slightly overspecifies on the *consumer* side. A binary wheel that works on "rhel_7_0" for example, should almost certainly work on "rhel_7_1". However, that can be addressed on the tooling side (e.g. permitting the specification of "additional compatible platforms" when invoking pip), rather than needing to be in the specification.
How about those lesser known distributions (e.g. Linux Mint or Mageia)? How many binary packages will package authors have to provide to cover people's needs? Windows + OS X + Linux multiplied by 32 / 64 multiplied by three or four Python versions is already a lot of binaries to build... While this would be a good technical solution, I think it's socially disastrous. Of course, you may point out that it has its roots in the failure of the GNU/Linux ecosystem to provide real binary compatibility. It's stunning that under Windows you can build a Windows XP-compatible shared library with a recent MSVC just by turning a switch in the options... Regards Antoine.
On 28 November 2014 at 18:19, Antoine Pitrou wrote:
On Fri, 28 Nov 2014 16:03:59 +1000 Nick Coghlan wrote:
Here's my proposed change:
=================
The default platform tag is distutils.util.get_platform() with all hyphens - and periods . replaced with underscore _ .
If /etc/os-release [N] exists on the system, then the values in the 'ID' and 'VERSION_ID' fields are read, all hyphens - and periods . replaced with underscore _ , and the results appended to the default tag after a separating underscore.
Examples:
* win32
* macosx_10_6_intel
* linux_x86_64_fedora_20
* linux_x86_64_rhel_7_0
* linux_x86_64_debian_7_0
* linux_x86_64_ubuntu_14_04
=================
Is this not going to be a slippery slope?
Only if folks publish Linux binaries themselves, and that's still a bad idea (for the same reason publishing distro binaries is already a rare thing for people to do).
Now, this slightly overspecifies on the *consumer* side. A binary wheel that works on "rhel_7_0" for example, should almost certainly work on "rhel_7_1". However, that can be addressed on the tooling side (e.g. permitting the specification of "additional compatible platforms" when invoking pip), rather than needing to be in the specification.
How about those lesser known distributions (e.g. Linux Mint or Mageia)?
They tend to publish /etc/os-release as well these days, and there's actually a mechanism built into that for clients to flag other distros to try.
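As a rough illustration of how a client could use that mechanism: the sketch below assumes the mechanism meant here is the ID_LIKE field of os-release (a space-separated list of related distro IDs), which is my reading rather than something stated above. The function name, the ordering of candidates, and the version-less "base_distro" tags are all hypothetical; the proposal itself only defines the "base_ID_VERSION_ID" form.

```python
import re


def _norm(value):
    """Replace hyphens and periods with underscores."""
    return re.sub(r"[-.]", "_", value)


def candidate_platform_tags(base, fields):
    """Return platform tags to try, most specific first.

    base   -- default platform string, e.g. "linux-x86_64"
    fields -- dict of already-parsed os-release fields
    """
    base = _norm(base)
    tags = []
    distro = fields.get("ID")
    version = fields.get("VERSION_ID")
    if distro and version:
        # The exact tag defined by the proposal.
        tags.append("_".join([base, _norm(distro), _norm(version)]))
    if distro:
        # Hypothetical version-less tag for the same distro.
        tags.append("_".join([base, _norm(distro)]))
    for like in fields.get("ID_LIKE", "").split():
        # Hypothetical tags for distros this one declares itself similar to.
        tags.append("_".join([base, _norm(like)]))
    # The generic tag is always the last resort.
    tags.append(base)
    return tags
```

For a derivative distro whose os-release lists a parent in ID_LIKE, the parent's tag would show up as a fallback candidate after the distro's own tags. The field values used to exercise this are illustrative, not copied from any real distro release.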
How many binary packages will package authors have to provide to cover people's needs? Windows + OS X + Linux multiplied by 32 / 64 multiplied by three or four Python versions is already a lot of binaries to build...
I'd still advise against folks posting Linux wheels on PyPI, just as they tend not to post RPM or deb files. This is so we can provide wheels at the distro level (or build them internally) without creating vast amounts of confusion.
While this would be a good technical solution, I think it's socially disastrous.
Only if you're expecting folks to regularly publish their own wheels to PyPI. This isn't really about that - it's about having a way to tackle it at the distro level, without introducing significant potential for confusion on end user systems (https://fedoraproject.org/wiki/Env_and_Stacks/Projects/LanguageSpecificRepos... describes the Fedora side of our current work in this area)
Of course, you may point out that it has its roots in the failure of the GNU/Linux ecosystem to provide real binary compatibility. It's stunning that under Windows you can build a Windows XP-compatible shared library with a recent MSVC just by turning a switch in the options...
The difference isn't really that surprising - both Microsoft and Apple have relied heavily on intellectual monopoly laws to retain control of their ecosystems. You can do a lot to constrain the choices of others when you have the full weight of the US government and copyright industry behind you. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sat, 29 Nov 2014 01:27:44 +1000 Nick Coghlan wrote:
Is this not going to be a slippery slope?
Only if folks publish Linux binaries themselves, and that's still a bad idea (for the same reason publishing distro binaries is already a rare thing for people to do).
Well, let's not make this a matter of ideology. Everyone knows it's a bad idea to publish binaries, yet it's often better than nothing, especially if the software is tedious to compile.
How many binary packages will package authors have to provide to cover people's needs? Windows + OS X + Linux multiplied by 32 / 64 multiplied by three or four Python versions is already a lot of binaries to build...
I'd still advise against folks posting Linux wheels on PyPI, just as they tend not to post RPM or deb files. This is so we can provide wheels at the distro level (or build them internally) without creating vast amounts of confusion.
So do we (software authors) have to wait for that mythical "we" who are going to build binaries in time for our packages? Case in point: can I ask you (the mythical "we") to build packages for all major distros (including supported LTS releases), and the four most recent Python versions, of the following piece of software: https://github.com/numba/llvmlite :-)
This isn't really about that - it's about having a way to tackle it at the distro level, without introducing significant potential for confusion on end user systems
I'm not sure I understand: distros provide their own packages, they don't (shouldn't) blindly pull binary wheels from PyPI. Why would they depend on the wheel tagging format?
The difference isn't really that surprising - both Microsoft and Apple have relied heavily on intellectual monopoly laws to retain control of their ecosystems. You can do a lot to constrain the choices of others when you have the full weight of the US government and copyright industry behind you.
That discussion is a bit off-topic, but I don't think it has anything to do with copyright (and from what I've seen in python-dev discussions, I'm not sure Apple is a good example). Regards Antoine.
On 29 November 2014 at 01:51, Antoine Pitrou wrote:
On Sat, 29 Nov 2014 01:27:44 +1000 Nick Coghlan wrote:
Is this not going to be a slippery slope?
Only if folks publish Linux binaries themselves, and that's still a bad idea (for the same reason publishing distro binaries is already a rare thing for people to do).
Well, let's not make this a matter of ideology. Everyone knows it's a bad idea to publish binaries, yet it's often better than nothing, especially if the software is tedious to compile.
It's not a matter of ideology, but a matter of practicality. Debian stable, RHEL/CentOS, Ubuntu LTS, SLES - distros like these move slow enough (and have strong enough ABI compatibility guarantees) to be practical for ISVs to target with prebuilt binaries. Beyond that, the rate of development and breadth of target environments in the Linux world tends to make providing prebuilt binaries in *any* format difficult (unless you statically link everything you need, which works fine until you decide you want a particular package to depend on binaries provided by another one by linking to them dynamically). Regardless of target environment, though, being able to prebuild wheel files in a way that clearly indicates the platform restricted nature of the end result is a useful tool for system integrators. That benefit applies regardless of whether the builds are happening in the context of Linux distro development, or in the context of maintaining a particular set of infrastructure services.
How many binary packages will package authors have to provide to cover people's needs? Windows + OS X + Linux multiplied by 32 / 64 multiplied by three or four Python versions is already a lot of binaries to build...
I'd still advise against folks posting Linux wheels on PyPI, just as they tend not to post RPM or deb files. This is so we can provide wheels at the distro level (or build them internally) without creating vast amounts of confusion.
So do we (software authors) have to wait for that mythical "we" who are going to build binaries in time for our packages?
If the compatibility tagging issue can be resolved, I don't think there should be any restrictions against uploading Linux wheels that avoid the generic Linux platform tags. I just doubt it will make sense for most developers to worry about it, just as most of them don't worry about providing RPMs or Debian packages.
Case in point: can I ask you (the mythical "we") to build packages for all major distros (including supported LTS releases), and the four most recent Python versions, of the following piece of software: https://github.com/numba/llvmlite
No, that would be a service provided by the as yet hypothetical PyPI build farm. If/when that happens, it will need to have a way of tagging Linux wheels appropriately, though. Nearer term (and what prompted me to start this thread), the Fedora Environments & Stacks working group is investigating providing prebuilt wheel files for the Fedora ecosystem, and potentially for EPEL as well (see https://fedoraproject.org/wiki/Env_and_Stacks/Projects/UserLevelPackageManag... for the broader context of that effort). For other ecosystems, you'll have to ask participants in those ecosystems.
This isn't really about that - it's about having a way to tackle it at the distro level, without introducing significant potential for confusion on end user systems
I'm not sure I understand: distros provide their own packages, they don't (shouldn't) blindly pull binary wheels from PyPI. Why would they depend on the wheel tagging format?
We don't plan to blindly pull anything from PyPI - we're looking at the feasibility of publishing already reviewed software in ecosystem native formats (with the two pilot projects focusing on Java JAR files and Python wheel files). When I last mentioned that idea here, Marcus pointed out that doing that with the generic "linux_x86_64" compatibility tag on the wheel filenames would be problematic, as there'd be nothing preventing anyone from pulling them down onto inappropriate systems, with no obvious trace of the Fedora or EPEL specific assumptions that went into building them. While that's a valid concern, I also don't want to go invent our own custom compatibility tagging convention just for Fedora & EPEL, but rather work within the limits of what upstream Python packaging natively supports. At the moment there is no such tagging system, which is why I'm interested in pursuing improvements to the definition of the platform tag. However, after thinking further about the situation with EPEL (where we'd likely want a single set of wheel files to cover not only an entire major release series, but also downstream RHEL derivatives like CentOS), the possibility of sharing wheel files between the Fedora & Software Collections Python builds, and the point Matthias raised about the limitations of the current platform tag when it comes to multiarch support on Debian, I'm back to considering the idea of making it possible to override the default platform tag with something more appropriate (I think Daniel may have been the first one to suggest that?). The task of defining what the appropriate platform tag overrides should be would then fall back on the Linux distros as part of declaring our ABI compatibility expectations. Slightly tangentially, I'm also now wondering if we might also be able to address the SOABI problem on Python 2 with an appropriate configuration system design. 
Consider if the following could be included in the "pyvenv.cfg" file in a virtual environment:
    [compatibility]
    python=cp27,cp2,py2
    abi=cp27mu
    platform=linux_x86_64_epel_6
Or for a Python 3 virtual environment:
    [compatibility]
    python=cp34,cp3,py3
    abi=cp34m,abi3
    platform=linux_x86_64_epel_6
If present, these pyvenv.cfg settings would override the normal PEP 425 compatibility tag calculations (I'm OK with the idea of *needing* to be in a virtual environment to gain the power to configure these tags). Note that this would *only* affect the tags used in filenames, and when searching for wheel files. It would still be up
I'm not sold on the technical details yet (consider the above to be just a sketch of the idea), but it seems to me that having those virtual environment specific overrides defined in a tool independent way, and updating bdist_wheel to read them when creating wheel files, and pip to read them when deciding which wheel files to consider as candidates for installation, would likely address the use case, without locking us in to anything too specific in the absence of suitable data.
Cheers, Nick.
-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
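For anyone who wants to experiment with the pyvenv.cfg idea from the message above, here is a minimal sketch of how a tool might read such a [compatibility] section. Nothing here is an existing pip or bdist_wheel feature; the function name is hypothetical. It also assumes the section sits after the ordinary section-less "key = value" lines of pyvenv.cfg, so a dummy "[pyvenv]" header is prepended before handing the text to configparser.

```python
import configparser


def read_compat_overrides(cfg_text):
    """Return the [compatibility] overrides from pyvenv.cfg text.

    The result maps each key (python, abi, platform) to a list of tags in
    the order given. An empty dict means no overrides are configured.
    """
    parser = configparser.ConfigParser(interpolation=None)
    # pyvenv.cfg normally starts with section-less lines, which configparser
    # rejects, so wrap them in a dummy section (an assumption of this sketch).
    parser.read_string("[pyvenv]\n" + cfg_text)
    if not parser.has_section("compatibility"):
        return {}
    return {
        key: [tag.strip() for tag in value.split(",") if tag.strip()]
        for key, value in parser.items("compatibility")
    }
```

A tool could then use the returned lists in place of its computed PEP 425 tags when the dict is non-empty, and fall back to the normal calculation otherwise.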
On Sun, 30 Nov 2014 01:47:16 +1000 Nick Coghlan wrote:
On 29 November 2014 at 01:51, Antoine Pitrou wrote:
On Sat, 29 Nov 2014 01:27:44 +1000 Nick Coghlan wrote:
Is this not going to be a slippery slope?
Only if folks publish Linux binaries themselves, and that's still a bad idea (for the same reason publishing distro binaries is already a rare thing for people to do).
Well, let's not make this a matter of ideology. Everyone knows it's a bad idea to publish binaries, yet it's often better than nothing, especially if the software is tedious to compile.
It's not a matter of ideology, but a matter of practicality. Debian stable, RHEL/CentOS, Ubuntu LTS, SLES - distros like these move slow enough (and have strong enough ABI compatibility guarantees) to be practical for ISVs to target with prebuilt binaries.
It seems we disagree on the notion of "practicality" :-) For me practicality means being able to build a single binary package for all recent Linux distros in a best effort approach. Building a different package for each distro version is far from practical for any reasonable-sized project (i.e. not something sponsored by a 1000+-employee entity, with a dedicated build team and infrastructure).
Case in point: can I ask you (the mythical "we") to build packages for all major distros (including supported LTS releases), and the four most recent Python versions, of the following piece of software: https://github.com/numba/llvmlite
No, that would be a service provided by the as yet hypothetical PyPI build farm. If/when that happens, it will need to have a way of tagging Linux wheels appropriately, though.
"If/when that happens" is not reassuring, especially in the light of how many pie-in-the-sky improvements in the packaging ecosystems have turned out :-/ (at Continuum we have started offering such a service, but it's "generic Linux": http://docs.binstar.org/build-config.html#BuildMatrix)
Nearer term (and what prompted me to start this thread), the Fedora Environments & Stacks working group is investigating providing prebuilt wheel files for the Fedora ecosystem, and potentially for EPEL as well (see https://fedoraproject.org/wiki/Env_and_Stacks/Projects/UserLevelPackageManag... for the broader context of that effort). For other ecosystems, you'll have to ask participants in those ecosystems.
That's asking software authors to complicate and slow down their development process a lot. Also, there's no guarantee that Fedora or Ubuntu or whatever would actually *accept* to help us, right?
I'm not sure I understand: distros provide their own packages, they don't (shouldn't) blindly pull binary wheels from PyPI. Why would they depend on the wheel tagging format?
We don't plan to blindly pull anything from PyPI - we're looking at the feasibility of publishing already reviewed software in ecosystem native formats (with the two pilot projects focusing on Java JAR files and Python wheel files).
When I last mentioned that idea here, Marcus pointed out that doing that with the generic "linux_x86_64" compatibility tag on the wheel filenames would be problematic, as there'd be nothing preventing anyone from pulling them down onto inappropriate systems, with no obvious trace of the Fedora or EPEL specific assumptions that went into building them.
Uh, there's a lot of hidden knowledge required to understand those two paragraphs that I don't master. I don't know what "inappropriate systems" are, what are "reviewed software", etc. ;-) Also I don't understand why you're not recompiling as you would normally do.
While that's a valid concern, I also don't want to go invent our own custom compatibility tagging convention just for Fedora & EPEL, but rather work within the limits of what upstream Python packaging natively supports.
Well, *allowing* distro tags in the platform tag is certainly ok. What I'm afraid of is if that's made mandatory.
Consider if the following could be included in the "pyvenv.cfg" file in a virtual environment:
    [compatibility]
    python=cp27,cp2,py2
    abi=cp27mu
    platform=linux_x86_64_epel_6
Or for a Python 3 virtual environment:
    [compatibility]
    python=cp34,cp3,py3
    abi=cp34m,abi3
    platform=linux_x86_64_epel_6
If present, these pyvenv.cfg settings would override the normal PEP 425 compatibility tag calculations (I'm OK with the idea of *needing* to be in a virtual environment to gain the power to configure these tags).
As long as it's on an opt-in basis, it certainly sounds ok. Regards Antoine.
On 30 November 2014 at 02:10, Antoine Pitrou wrote:
On Sun, 30 Nov 2014 01:47:16 +1000 Nick Coghlan wrote:
On 29 November 2014 at 01:51, Antoine Pitrou wrote:
On Sat, 29 Nov 2014 01:27:44 +1000 Nick Coghlan wrote:
Is this not going to be a slippery slope?
Only if folks publish Linux binaries themselves, and that's still a bad idea (for the same reason publishing distro binaries is already a rare thing for people to do).
Well, let's not make this a matter of ideology. Everyone knows it's a bad idea to publish binaries, yet it's often better than nothing, especially if the software is tedious to compile.
It's not a matter of ideology, but a matter of practicality. Debian stable, RHEL/CentOS, Ubuntu LTS, SLES - distros like these move slow enough (and have strong enough ABI compatibility guarantees) to be practical for ISVs to target with prebuilt binaries.
It seems we disagree on the notion of "practicality" :-)
Note I said "ISVs" there - folks that actually make money from targeting those platforms. I don't think providing integrated Linux binaries is currently practical at all for open source projects (note that even CPython just provides a tarball upstream, with the builds handled directly by the distros).
For me practicality means being able to build a single binary package for all recent Linux distros in a best effort approach.
That isn't realistically possible for anything that isn't completely statically linked, as distro level choices of build options vary too much. We could technically allow statically linked binaries on PyPI by dropping the restriction against generic Linux wheel files, but I'm wary of that in the absence of an automated server side scan for dynamically linked binaries. Many users (quite reasonably, if they're primarily Python developers) have problems working through build failures when attempting to install non-Python extensions from source. Such build failures are usually models of clarity compared to diagnosing dynamic linking failures.
Building a different package for each distro version is far from practical for any reasonable-sized project (i.e. not something sponsored by a 1000+-employee entity, with a dedicated build team and infrastructure).
This is why I'm saying people *shouldn't* try to provide prebuilt Linux binaries in the general case: it's simply not practical. Hence the rise of higher level, more self-contained formats like Docker, as well as the fact that I'm not even trying to solve the "any Linux" problem myself, but going after the vastly simpler problem of targeting the ecosystem I need to target (but hopefully in a sufficiently general purpose way that other ecosystems can adopt the same model if they so choose).
Case in point: can I ask you (the mythical "we") to build packages for all major distros (including supported LTS releases), and the four most recent Python versions, of the following piece of software: https://github.com/numba/llvmlite
No, that would be a service provided by the as yet hypothetical PyPI build farm. If/when that happens, it will need to have a way of tagging Linux wheels appropriately, though.
"If/when that happens" is not reassuring, especially in the light of how many pie-in-the-sky improvements in the packaging ecosystems have turned out :-/
Nobody is currently offering to provide paid staff to work on it, so it won't happen until it gets to the top of volunteers' personal priority lists.
(at Continuum we have started offering such a service, but it's "generic Linux": http://docs.binstar.org/build-config.html#BuildMatrix)
Yes, Continuum avoided the distro ABI compatibility problem by defining its own ABI. It's exactly the same model I'm proposing - in the general case, you can't take Continuum built packages, and use them with Fedora/Debian/etc built dynamically linked dependencies. Same with my suggestion - it's about labelling the compatibility requirements, rather than trying to avoid them by statically linking everything.
Nearer term (and what prompted me to start this thread), the Fedora Environments & Stacks working group is investigating providing prebuilt wheel files for the Fedora ecosystem, and potentially for EPEL as well (see https://fedoraproject.org/wiki/Env_and_Stacks/Projects/UserLevelPackageManag... for the broader context of that effort). For other ecosystems, you'll have to ask participants in those ecosystems.
That's asking software authors to complicate and slow down their development process a lot.
If the distro level service doesn't benefit them, there's no reason for software authors to use it. This is about adding a new option for consuming software, not taking anything away. In my case, I won't deploy something until it's been through at least a basic licensing review, and if we're doing that work anyway, I may as well see if we can find a way to do it upstream in Fedora rather than inside the Red Hat firewall. Same for prebuilding binary wheels - sure, we could do that just for our own internal infrastructure, but it makes more sense to me to at least investigate the idea of handling it upstream first.
Also, there's no guarantee that Fedora or Ubuntu or whatever would actually *accept* to help us, right?
For Fedora, myself and Slavek (the Fedora Python maintainer) are the two folks working on this (& we're both voting members of the Environments & Stacks working group). So yes, in Fedora's case, it's definitely a developer experience problem we want to solve properly, and we're already working on it - this is me asking upstream for help resolving a design problem, rather than coming up with a speculative idea to persuade Fedora to adopt later. I realise that's very different from the historical attitude of Linux distributions to upstream packaging ecosystems. Getting to this point of adopting a more user-focused mindset when asking what services a Linux distro should be offering has been a formidable political challenge involving a lot of work from a lot of different people (including the current Fedora Project Leader), so I'm not even going to speculate on what might be involved in attempting to replicate that in the context of a different distro ecosystem :)
I'm not sure I understand: distros provide their own packages, they don't (shouldn't) blindly pull binary wheels from PyPI. Why would they depend on the wheel tagging format?
We don't plan to blindly pull anything from PyPI - we're looking at the feasibility of publishing already reviewed software in ecosystem native formats (with the two pilot projects focusing on Java JAR files and Python wheel files).
When I last mentioned that idea here, Marcus pointed out that doing that with the generic "linux_x86_64" compatibility tag on the wheel filenames would be problematic, as there'd be nothing preventing anyone from pulling them down onto inappropriate systems, with no obvious trace of the Fedora or EPEL specific assumptions that went into building them.
Uh, there's a lot of hidden knowledge required to understand those two paragraphs that I don't master. I don't know what "inappropriate systems" are, what are "reviewed software", etc. ;-)
An example of inappropriate systems: trying to install a Fedora wheel on Ubuntu. That often won't work in the general case, but pip will currently allow it (because they share a platform tag, despite not sharing a platform ABI).
"Reviewed software" just refers to the Fedora package set. Currently we only make that available for installation as RPMs; the Environments & Stacks WG is looking at making it available in more developer friendly formats, rather than focusing solely on the operations use case. (We're not sure yet how we'll make that sustainable from a build infrastructure perspective, but that's a large part of what the pilot projects are aiming to figure out)
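The "shared platform tag" problem can be shown with a small sketch. It relies on the wheel filename layout from PEP 427 (the last three dash-separated fields are the Python, ABI and platform tags) and on PEP 425 compressed tag sets using "." as a separator. The function names and the example filenames are made up for illustration, and this is not how pip implements its check.

```python
def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("too few fields in wheel filename: %r" % filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def platform_matches(filename, supported_platforms):
    """True if any platform tag in the filename is in supported_platforms."""
    platform_tag = wheel_tags(filename)[2]
    # A compressed tag set such as "a.b" means "a or b".
    return any(tag in supported_platforms for tag in platform_tag.split("."))
```

With only the generic tag available, a wheel built on Fedora and an Ubuntu system both say "linux_x86_64", so the check passes. With distro-qualified tags, the same check would reject a "linux_x86_64_fedora_20" wheel on a system that only supports "linux_x86_64_ubuntu_14_04".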
Also I don't understand why you're not recompiling as you would normally do.
We'll still be recompiling. We just want to publish the results for Python packages as virtual-environment compatible wheel files, not just as RPMs.
While that's a valid concern, I also don't want to go invent our own custom compatibility tagging convention just for Fedora & EPEL, but rather work within the limits of what upstream Python packaging natively supports.
Well, *allowing* distro tags in the platform tag is certainly ok. What I'm afraid of is if that's made mandatory.
OK, that makes more sense. Yes, I agree we need to keep the ability to say "this is a prebuilt, self-contained, binary wheel that should run on any Linux system because it doesn't link to any system binaries". Chalk it up as yet another reason that the specific proposal I started the thread with wouldn't actually work :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Binary compatibility is not something that is my strong suit, however I think that whatever we do should be aimed at eventual support of uploading Linux binaries to PyPI. Currently we allow Windows and OS X, and from what I understand the same ABI problems can happen in those situations; it's just less likely because of the single vendor nature of those platforms. In that vein I don't believe we need to *solve* the Linux ABI problem generically, but that we need to get something that will most likely work. To this end the idea of using the distro name and version sounds like a solution that will get the Linux side of things closer to the Windows and OS X side. I do think we should continue to support the plain linux values, and possibly we should even allow them to be uploaded to PyPI.
I had also previously thought about an additional file in the metadata spec which would encode the SOABI of each thing that has been dynamically linked (it's my understanding that you can inspect this?), and then when pip downloads a Wheel file it can use that file as a final and certain check of compatibility. In this way the filename would only be used to select the file most likely to work on the system, and once we've downloaded the file we can detect whether it will actually work, falling back to other files (possibly even the sdist) if not.
I don't think we need to try and protect people from uploading a badly built Wheel. If a project uploads a Wheel that has incorrect platform tags then that is a bug in their project, as long as we provide reasonable differentiators that people can use to tag their Wheels with.
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
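A rough sketch of the "check after download, then fall back" flow described above. No such metadata file exists in the wheel spec today; the idea is modelled here as a plain list of required shared library names per wheel, and the function name, data shapes, filenames and library names are all hypothetical.

```python
def pick_install_candidate(wheels, available_libs, sdist=None):
    """Return the first wheel whose declared libraries are all available.

    wheels         -- list of (filename, required_libs) pairs, already ordered
                      by how well the filename tags match this system
    available_libs -- iterable of shared library names found on this system
    sdist          -- filename to fall back to when no wheel passes the check
    """
    available = set(available_libs)
    for filename, required_libs in wheels:
        # The filename only made this wheel a likely match; the declared
        # link requirements are the final check.
        if set(required_libs) <= available:
            return filename
    return sdist
```

The filename tags decide the order in which wheels are tried, while the declared link requirements decide whether a given wheel is actually accepted.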
On Sun, 30 Nov 2014 03:02:57 +1000 Nick Coghlan wrote:
Many users (quite reasonably, if they're primarily Python developers) have problems working through build failures when attempting to install non-Python extensions from source. Such build failures are usually models of clarity compared to diagnosing dynamic linking failures.
However, installing a binary doesn't imply a potential longish building step, or the installation of many build dependencies. LLVM can take 20 minutes to compile on a modern quad-core x86. I've been told it takes several hours on a Cortex A8 platform... By comparison, the failure of loading a precompiled dynamic library is instantaneous. And I don't think build failures are understandable by many users. You need to be a seasoned C developer for that.
(at Continuum we have started offering such a service, but it's "generic Linux": http://docs.binstar.org/build-config.html#BuildMatrix)
Yes, Continuum avoided the distro ABI compatibility problem by defining its own ABI.
Not exactly. Some ABI problems - for example the glibc-related ones - are still here. Conda and binstar-build are still a best effort (on the GNU/Linux side, that is), not an ideal solution.
Well, *allowing* distro tags in the platform tag is certainly ok. What I'm afraid of is if that's made mandatory.
OK, that makes more sense. Yes, I agree we need to keep the ability to say "this is a prebuilt, self-contained, binary wheel that should run on any Linux system because it doesn't link to any system binaries". Chalk it up as yet another reason that the specific proposal I started the thread with wouldn't actually work :)
Great! Regards Antoine.
On 11/29/2014 11:10 AM, Antoine Pitrou wrote:
For me practicality means being able to build a single binary package for all recent Linux distros in a best effort approach.
I can't imagine finding any such binary useful (assuming a non-pure-Python project): the chance that it *might* blow up makes it easier just to create my own from the sdist. Such binaries would tend to bitrot, even if they were OK for the "supported" set of distros at the time they were made. How can such wheels be feasible, when cross-distro RPMs / .debs are clearly not (in the general sense)? Tres. -- Tres Seaver +1 540-429-0999 tseaver@palladion.com Palladion Software "Excellence by Design" http://palladion.com
On Fri, Nov 28, 2014 at 04:03:59PM +1000, Nick Coghlan wrote:
Here's my proposed change:
================= The default platform tag is distutils.util.get_platform() with all hyphens - and periods . replaced with underscore _ . If /etc/os-release [N] exists on the system, then the values in the 'ID' and 'VERSION_ID' fields are read, all hyphens - and periods . replaced with underscore _ , and the results appended to the default tag after a separating underscore."
Examples:
* win32 * macosx_10_6_intel * linux_x86_64_fedora_20 * linux_x86_64_rhel_7_0 * linux_x86_64_debian_7_0 * linux_x86_64_ubuntu_14_04 =================
The [N] reference would then be a reference to http://www.freedesktop.org/software/systemd/man/os-release.html for the definition of the format of os-release. (Note that while the format originated with systemd, plenty of distros have also started providing it regardless of which init system they use) ... This also won't help with older Linux distros that don't offer /etc/os-release, but I'm OK with that - those can just continue to show up as "linux_x86_64", and PyPI can continue to disallow those uploads.
I was curious about what "older distros" meant in the context of Ubuntu, so I looked it up: /etc/os-release exists in Ubuntu 12.04 LTS (and newer) but didn't exist in Ubuntu 10.04 LTS. Support for Ubuntu 10.04 LTS ends in April 2015. Marius Gedminas -- Give a man a computer program and you give him a headache, but teach him to program computers and you give him the power to create headaches for others for the rest of his life... -- R. B. Forest
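The tag construction quoted in this message can be sketched as follows. This is an illustration only: the function names are made up, the os-release parser is deliberately simplified (it strips surrounding quotes but ignores shell escapes), and appending whichever of ID and VERSION_ID are present is an assumption, since the proposal does not say what to do when one field is missing. The default platform string is passed in rather than read from distutils so the sketch stays self-contained.

```python
import re


def parse_os_release(text):
    """Parse os-release style KEY=VALUE lines into a dict (simplified)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Values may be wrapped in matching single or double quotes.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        fields[key.strip()] = value
    return fields


def normalize(part):
    """Replace hyphens and periods with underscores, as PEP 425 does."""
    return re.sub(r"[-.]", "_", part)


def distro_platform_tag(default_platform, os_release_text=None):
    """Build the proposed tag; os_release_text is None when the file is absent."""
    tag = normalize(default_platform)
    if os_release_text is None:
        return tag  # e.g. older distros stay as plain "linux_x86_64"
    fields = parse_os_release(os_release_text)
    parts = [fields[k] for k in ("ID", "VERSION_ID") if fields.get(k)]
    return "_".join([tag] + [normalize(p) for p in parts])


ubuntu = 'NAME="Ubuntu"\nID=ubuntu\nVERSION_ID="14.04"\n'
tag = distro_platform_tag("linux-x86_64", ubuntu)
```

On a real system the first argument would come from `distutils.util.get_platform()` and the second from the contents of /etc/os-release if that file exists.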
On 11/28/2014 07:03 AM, Nick Coghlan wrote:
We've discussed the idea of changing the wheel file naming scheme to deal with Linux previously, but never put together a concrete proposal.
The closest we've got is the idea of allowing the platform tag to be customised in pip and perhaps bdist_wheel, and while that's good from an "enabling experimentation" perspective, it may be overkill if the primary goal is just to better support handling of Linux distros.
For starters, here's the current definition of the platform tag in PEP 425:
hmm, maybe you repeat the rationale here for starters?
================= The platform tag is simply distutils.util.get_platform() with all hyphens - and periods . replaced with underscore _ .
* win32 * linux_i386 * linux_x86_64 =================
this is already wrong for ARM32 soft-float and hard-float, and x86_64-linux-gnu and x86_64-linux-gnux32. If something is changed, then please change that as well, maybe using something already defined like the multiarch triplet.
On 29 November 2014 at 01:31, Matthias Klose
hmm, maybe you repeat the rationale here for starters?
To be able to publish wheel files for a particular ecosystem, without causing confusion if those wheel files somehow end up in an unsuitable environment (e.g. a Fedora specific wheel ending up on a Debian machine).
================= The platform tag is simply distutils.util.get_platform() with all hyphens - and periods . replaced with underscore _ .
* win32 * linux_i386 * linux_x86_64 =================
this is already wrong for ARM32 soft-float and hard-float, and x86_64-linux-gnu and x86_64-linux-gnux32. If something is changed, then please change that as well, maybe using something already defined like the multiarch triplet.
I'm open to completely redefining this in a distutils independent way, but it will need someone to define the precise algorithm. For the cases I'm personally worried about (i.e. Fedora & EPEL), the existing information extraction from os.uname() should be adequate. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
+1 to the idea in general.
Would this be an *edit* to PEP425/427/Wheel-1.0 OR new peps, and a new
wheel version?
As someone using cent6 daily (with no os-release file), I'm greedy for
another fallback technique, but the simplicity of just using os-release
makes sense.
Could a published "linux_x86_64_fedora_20" wheel ever become broken just
due to normal "yum update" activity on fedora_20? When? Why?
+1 to the idea in general, as well.

Here is another use case to motivate distro-aware platform tags for Linux wheels: speeding up continuous integration. Many Python projects in the SciPy ecosystem share the following Rackspace cloud storage container to pool build artifacts: http://wheels.scikit-image.org/

This makes it really easy to set up build chains that test libraries against recent (or even development) versions of their dependencies that cannot be apt-get installed on Travis hosts, for instance. Building from source is often not an option: for a project that depends on scipy, for instance, the scipy build alone would eat most of the time allowed for a Travis job and would not leave enough for the rest. It would also be a complete waste of resources to rebuild scipy over and over again.

Unfortunately, all the wheels you find on http://wheels.scikit-image.org with the "linux_x86_64" tag are only expected to work on the Ubuntu Precise platform (which is used by Travis). This ambiguous platform tag is really annoying:

- it prevents people who would like to set up CI for other Linux platforms (e.g. Docker-based CI engines that would test wheels on Red Hat or CentOS) from sharing the same distribution infrastructure,
- the day Travis changes its base distro, we will have no way to tell which wheels were built on the old distro and which were built on the newer one.

Note that those wheels are not meant to be used by our end users, only by CI tools for testing or by developers who would like to quickly reproduce such a setup without having to rebuild everything from scratch.

If the experiment shows that such distro-tagged wheels are actually as stable as they are supposed to be (which can be mostly tested via CI), then we could discuss later the opportunity to distribute Linux wheels on PyPI for end users, but to me this is not a priority at all.

-- Olivier
On 1 December 2014 at 11:41, Marcus Smith
+1 to the idea in general. Would this be an *edit* to PEP425/427/Wheel-1.0 OR new peps, and a new wheel version?
I'm currently thinking no change to the wheel spec, but potentially a PEP to define a standard way to override and/or supplement the default compatibility tags.
As someone using cent6 daily (with no os-release file), I'm greedy for another fallback technique, but the simplicity of just using os-release makes sense. Could a published "linux_x86_64_fedora_20" wheel ever become broken just due to normal "yum update" activity on fedora_20? When? Why?
It's technically possible to get an ABI break mid-cycle on Fedora, as it doesn't have the same level of restrictions against rebasing components that RHEL/CentOS do. Actually encountering an ABI break would still be pretty unlucky though. However, I realised my original idea doesn't quite work, since derivative distros may strive to be ABI compatible with their upstreams. While /etc/os-release sort of supports that concept via ID_LIKE, it's entirely vague about what that actually means, and the "ID_LIKE" distro references aren't versioned at all. There's also the fact that at least RHEL/CentOS aim to remain ABI compatible for the duration of an entire release series, so including the full version of /etc/os-release would overspecify things. That got me back to the idea of being able to customise the platform tag which various folks have brought up in the past. At that point, the definition in PEP 425 would become the specification for the *default* platform tag, and we'd look at adding ways to override it both when building wheels and when installing them. The main possibility I thought of there was to come up with a convention for overriding the platform tag at the *virtual environment* level. Then bdist_wheel, pip and other projects could check for the override before falling back to the default definition from PEP 425. If we designed the mechanisms correctly, it could potentially also be used to address the "no SOABI details" problem on Python 2.7. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
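As a loose illustration of the "check for an override, then fall back to the default" idea in this message: the sketch below looks for an override file inside a virtual environment directory. The file name `platform-tag.override` and the function name are purely hypothetical; no such convention has been defined, and the real mechanism (if one is ever specified) may look quite different.

```python
import os

# Hypothetical file name, invented for this sketch only.
OVERRIDE_FILENAME = "platform-tag.override"


def resolve_platform_tag(env_prefix, default_tag):
    """Return the environment's override tag if present, else the default tag."""
    path = os.path.join(env_prefix, OVERRIDE_FILENAME)
    try:
        with open(path, encoding="utf-8") as f:
            override = f.read().strip()
    except OSError:
        # No override file (or it is unreadable): use the PEP 425 default.
        return default_tag
    # An empty override file is treated the same as no override.
    return override or default_tag
```

A tool such as bdist_wheel or pip could call something like `resolve_platform_tag(sys.prefix, default_tag)` both when naming a built wheel and when deciding which wheels are acceptable to install. An override such as "linux_x86_64_rhel_7" would also sidestep the over-specific full VERSION_ID for distros that keep ABI compatibility across a release series.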
participants (8)
- Antoine Pitrou
- Donald Stufft
- Marcus Smith
- Marius Gedminas
- Matthias Klose
- Nick Coghlan
- Olivier Grisel
- Tres Seaver