Re: [Distutils] draft PEP: manylinux2

10 Feb 2018

      On Tue, Feb 06, 2018 at 05:55:36PM +1000, Nick Coghlan wrote:
...
The CalVer idea first came up in the context of skipping ahead in the
numbering sequence to go straight to a baseline that supported ppc64le
and/or aarch64. Even 2014 would likely be too old for that, since
CentOS 7 didn't support those at launch, and neither did Ubuntu 14.04.
While such a PEP hasn't actually been written yet, the kinds of
numbers we were looking at for a suitable baseline year were around
2015 or 2016, as that's when support for them started showing up in
mainline Linux distros.
...
Given that `manylinux` PEP numbers determine their sequence number, I
don't see how CalVer would change the situation.
It lets us deterministically skip numbers if we decide we want to
enable access to things that older platforms just straight up don't
support (like new instruction set architectures).
...
A bigger issue is that `manylinux` isn't really one dimensional.  Lots
of things happened in 2014; for example, IBM shipped the first POWER8
systems and glibc 2.19 and 2.20 were released.  But RHEL 7 and thus
CentOS ship glibc 2.17.  Why should `manylinux2014` support ppc64le
but not glibc 2.19?
Mainly because we aim for "oldest version still used in new releases
that year", but it's also why each version still needs a PEP that maps
out the actual platform ABI as specific library versions.
...
Since the definition of `manylinux` depends on the state of RHEL and
CentOS, maybe we should change the sequence number to match the
underlying major release of RHEL/CentOS.  That would have `manylinux2`
become `manylinux6`, and its successor `manylinux7`.  If we require
that each `manylinux` support all the platforms its RHEL/CentOS
supports, implementers and users could simply refer to that release to
know what they're in for.
We discussed that too, and one key reason for not doing it is that we
only build off Red Hat's platform definitions as a matter of
convenience, and because they currently have the longest support
lifecycles.
In the future, we could instead decide that a particular version of
Ubuntu LTS or Debian stable (or even some other LTS distro) was a more
suitable baseline for a given manylinux version, depending on how the
relative timing works out.
For non-RHEL/CentOS users, the RHEL/CentOS version is also just as
arbitrary a sequence number as 1-based indexing.
By contrast, year-based CalVer maintains distro-neutrality, while also
giving a good sense of the maximum age of compatible target platforms.
(e.g. given "manylinux2010", it's a pretty safe guess that Ubuntu
12.04, 14.04 and 16.04 are all expected to be compatible, while that
isn't as clear given "manylinux2" or "manylinux6")

I'm convinced we should use CalVer.

I'm still skeptical of the utility of CalVer here.  Debian 6.0
(squeeze), for example, was released in 2011 but is incompatible with
`manylinux2010` wheels because it uses glibc 2.11.  I'm concerned that
the sooner `manylinux2015` is defined, the more likely it is to
describe too fuzzy an ABI era for CalVer to convey meaningful
information to the LTS audience.
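
To make the squeeze example concrete, here is a minimal sketch of the
check a year alone can't express.  The table and helper names are
assumptions made up for illustration, not taken from any PEP or from
pip; the glibc floors are the CentOS 5 and CentOS 6 baselines.

```python
# Illustrative only: MIN_GLIBC, parse_glibc and is_compatible are
# hypothetical names, not part of pip or of any manylinux PEP.
MIN_GLIBC = {
    "manylinux1": (2, 5),      # CentOS 5 baseline
    "manylinux2010": (2, 12),  # CentOS 6 baseline
}


def parse_glibc(version):
    """Turn a string such as '2.11' into a comparable (major, minor) tuple."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_compatible(tag, system_glibc):
    """True if the system glibc is at least as new as the tag's baseline."""
    return parse_glibc(system_glibc) >= MIN_GLIBC[tag]


# Debian 6.0 (squeeze): released in 2011, but ships glibc 2.11.
print(is_compatible("manylinux2010", "2.11"))
# Ubuntu 12.04 ships (e)glibc 2.15.
print(is_compatible("manylinux2010", "2.15"))
```

The release year of the distro never enters the comparison; only the
library versions spelled out in the PEP do.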

What makes it worth it is the ability to skip and backfill versions.
As you pointed out, it would be a strange version scheme that had
an architecture that gained wide support in 2015 become `manylinux3`
and one that gained wide support in 2014 become `manylinux4`.

In particular, Geoffrey Thomas pointed out that it should be possible
to produce nearly-`manylinux1` compliant wheels with a much newer
toolchain:

https://mail.python.org/pipermail/wheel-builders/2017-July/000283.html

We may decide that an update to `manylinux1` is worthwhile, and by
switching to CalVer, backfilling that version as `manylinux2008` would
be straightforward.
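
As a rough sketch of why skipping and backfilling work with CalVer:
the tag itself carries the ordering, so the order in which PEPs get
written stops mattering.  Apart from `manylinux2010`, the tags below
are hypothetical, and `baseline_year` is a helper made up for this
example.

```python
import re


def baseline_year(tag):
    """Extract the numeric suffix of a manylinux tag as an integer."""
    match = re.fullmatch(r"manylinux(\d+)", tag)
    if match is None:
        raise ValueError(f"not a manylinux tag: {tag!r}")
    return int(match.group(1))


# Tags listed in the order their (hypothetical) PEPs might be written:
# 2015 defined before 2014, and 2008 backfilled last.
defined_in_order = [
    "manylinux2010",
    "manylinux2015",
    "manylinux2014",
    "manylinux2008",
]

# Sorting by the embedded year recovers the baseline ordering.
print(sorted(defined_in_order, key=baseline_year))
# → ['manylinux2008', 'manylinux2010', 'manylinux2014', 'manylinux2015']
```

With 1-based sequence numbers the same four definitions would be
numbered in the order they were written, which is exactly the strange
ordering described above.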

--
  Mark Williams
  mrw@twistedmatrix.com