On Tue, Feb 06, 2018 at 05:55:36PM +1000, Nick Coghlan wrote:
The CalVer idea first came up in the context of skipping ahead in the numbering sequence to go straight to a baseline that supported ppc64le and/or aarch64. Even 2014 would likely be too old for that, since CentOS 7 didn't support those at launch, and neither did Ubuntu 14.04. While such a PEP hasn't actually been written yet, the kinds of numbers we were looking at for a suitable baseline year were around 2015 or 2016, as that's when support for them started showing up in mainline Linux distros.
Given that `manylinux` PEP numbers determine their sequence number, I don't see how CalVer would change the situation.
It lets us deterministically skip numbers if we decide we want to enable access to things that older platforms just straight up don't support (like new instruction set architectures).
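As a rough illustration of why year-based tags make skipping painless, here is a minimal sketch that orders tags by the year embedded in their names. The `baseline_year` helper and the tag names `manylinux2008` and `manylinux2015` are hypothetical, used only for illustration; no such tags are defined in this thread.

```python
import re

# Hypothetical pattern: "manylinux" followed by a four-digit baseline year.
_CALVER_TAG = re.compile(r"^manylinux(\d{4})$")


def baseline_year(tag: str) -> int:
    """Return the baseline year encoded in a CalVer-style manylinux tag."""
    match = _CALVER_TAG.match(tag)
    if match is None:
        raise ValueError(f"not a CalVer-style manylinux tag: {tag!r}")
    return int(match.group(1))


# Tags can be defined in any order and with gaps between years;
# sorting by the embedded year still yields a sensible sequence.
tags = ["manylinux2015", "manylinux2010", "manylinux2008"]
ordered = sorted(tags, key=baseline_year)
```

With sequential numbering, a tag defined later would need either a higher number than its true position warrants or a renumbering of existing tags; with the year as the key, neither is necessary.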
A bigger issue is that `manylinux` isn't really one dimensional. Lots of things happened in 2014; for example, IBM shipped the first POWER8 systems and glibc 2.19 and 2.20 were released. But RHEL 7 and thus CentOS ship glibc 2.17. Why should `manylinux2014` support ppc64le but not glibc 2.19?
Mainly because we aim for "oldest version still used in new releases that year", but it's also why each version still needs a PEP that maps out the actual platform ABI as specific library versions.
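To make the "PEP maps out specific library versions" point concrete, the following sketch checks only the glibc part of such a definition. It assumes the glibc baselines from the published `manylinux1` and `manylinux2010` PEPs (2.5 and 2.12 respectively); the `GLIBC_BASELINE` table and `is_compatible` function are hypothetical names, and a real compatibility check would also have to cover the other libraries in the platform ABI.

```python
# Assumed glibc baselines (major, minor) per tag; glibc is only one
# component of the ABI that each manylinux PEP pins down.
GLIBC_BASELINE = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
}


def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '2.17' into (2, 17)."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(tag: str, system_glibc: str) -> bool:
    """Return True if the system glibc is at least the tag's baseline."""
    return parse_version(system_glibc) >= GLIBC_BASELINE[tag]
```

For example, `is_compatible("manylinux2010", "2.17")` is true for a system shipping glibc 2.17, while a system with glibc 2.11 fails the same check regardless of the year it was released.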
Since the definition of `manylinux` depends on the state of RHEL and CentOS, maybe we should change the sequence number to match the underlying major release of RHEL/CentOS. That would have `manylinux2` become `manylinux6`, and its successor `manylinux7`. If we require that each `manylinux` support all the platforms its RHEL/CentOS supports, implementers and users could simply refer to that release to know what they're in for.
We discussed that too, and one key reason for not doing it is that we only build off Red Hat's platform definitions as a matter of convenience, and because they currently have the longest support lifecycles.
In the future, we could instead decide that a particular version of Ubuntu LTS or Debian stable (or even some other LTS distro) was a more suitable baseline for a given manylinux version, depending on how the relative timing works out.
For non-RHEL/CentOS users, the RHEL/CentOS version is also just as arbitrary a sequence number as 1-based indexing.
By contrast, year-based CalVer maintains distro-neutrality, while also giving a good sense of the maximum age of compatible target platforms. (e.g. given "manylinux2010", it's a pretty safe guess that Ubuntu 12.04, 14.04 and 16.04 are all expected to be compatible, while that isn't as clear given "manylinux2" or "manylinux6")
I'm convinced we should use CalVer.

I'm still skeptical of CalVer's utility as a signal of compatibility, though. Debian 6.0 (squeeze), for example, was released in 2011 but is incompatible with `manylinux2010` wheels because it uses glibc 2.11. I'm concerned that the sooner `manylinux2015` is defined, the more likely it is to describe too fuzzy an ABI era for CalVer to convey meaningful information to the LTS audience.

What makes it worth it is the ability to skip and backfill versions. As you pointed out, it would be a strange version scheme in which an architecture that gained wide support in 2015 became `manylinux3` and one that gained wide support in 2014 became `manylinux4`.

In particular, Geoffrey Thomas pointed out that it should be possible to produce nearly-`manylinux1` compliant wheels with a much newer toolchain:

https://mail.python.org/pipermail/wheel-builders/2017-July/000283.html

We may decide that an update to `manylinux1` is worthwhile, and by switching to CalVer, backfilling that version as `manylinux2008` would be straightforward.

-- 
Mark Williams
mrw@twistedmatrix.com