[Distutils] Working toward Linux wheel support

Nate Coraor nate at bx.psu.edu
Thu Aug 20 20:26:44 CEST 2015

On Fri, Aug 14, 2015 at 3:38 AM, Nathaniel Smith <njs at pobox.com> wrote:

> On Thu, Aug 13, 2015 at 7:27 PM, Robert Collins
> <robertc at robertcollins.net> wrote:
> > On 14 August 2015 at 14:14, Nathaniel Smith <njs at pobox.com> wrote:
> > ...
> >> Of course if you have an alternative proposal then I'm all ears :-).
> >
> > Yeah :)
> >
> > So, I want to dedicate some time to contributing to this discussion
> > meaningfully, but I can't for the next few weeks - Jury duty, Kiwi
> > PyCon and polishing up the PEPs I'm already committed to...
> Totally hear that... it's not super urgent anyway. We should make it
> clear to Nate -- hi Nate! -- that there's no reason that solving this
> problem should block putting together the basic
> binary-compatibility.cfg infrastructure.


I've been working on bits of this while also building out, as a test case,
psycopg2 wheels for lots of different popular distros on i386 and x86_64,
UCS2 and UCS4, under Docker. As a result, it's clear that
my Linux distro tagging work in wheel's pep425tags has some issues. I've
been adding to this list of distributions but it's going to need a lot more


So I need a bit of guidance here. I've arbitrarily chosen some tags -
`rhel` for example - and wonder if, like PEP 425's mapping of Python
implementations to tags, a defined mapping of Linux distributions to
shorthand tags is necessary (of course this would be difficult to keep up
to date, but binary-compatibility.cfg would make it less relevant in the
long run).
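
A minimal sketch of what such a defined mapping could look like, assuming a
hand-maintained table with a normalized fallback for anything not listed.
The table contents, the long-form name used as a key, and the `distro_tag`
helper are all hypothetical and only for illustration; nothing here is
taken from wheel's pep425tags.

```python
import re

# Hypothetical shorthand table. The key is an assumed example of a
# long-form distribution name; only the `rhel` tag comes from this thread.
DISTRO_TAGS = {
    "red hat enterprise linux server": "rhel",
    "debian": "debian",
}


def distro_tag(name):
    """Return the shorthand tag for a distro name, or a normalized fallback."""
    key = name.strip().lower()
    if key in DISTRO_TAGS:
        return DISTRO_TAGS[key]
    # Fallback: collapse each run of non-alphanumerics into one underscore.
    return re.sub(r"[^a-z0-9]+", "_", key).strip("_")
```

The fallback keeps unknown distributions usable, at the cost of longer and
less predictable tags than a curated entry would give.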

Alternatively, I could simply trust and normalize
platform.linux_distribution()[0], but this means that the platform tag on
RHEL would be something like

Finally, by *default*, the built platform tag will include whatever version
information is provided in platform.linux_distribution()[1], but the
"major-only" version is also included in the list of platforms, so a
default debian tag might look like `linux_x86_64_debian_7_8`, but it would
be possible to build (and install) `linux_x86_64_debian_7`. However, it may
be the case that the default (at least for building, maybe not for
installing) ought to be the major-only tag since it should really be ABI
compatible with any minor release of that distro.
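
The full-version and major-only tags described above can be sketched as
follows. This is an illustration under assumptions, not the code in wheel:
`platform_tags` is a hypothetical helper, it takes the architecture, distro
tag and version as plain strings rather than querying the platform, and
returning the bare distro tag when no version is known is a guess.

```python
def platform_tags(arch, distro, version):
    """Return candidate platform tags, most specific first."""
    base = "linux_{}_{}".format(arch, distro)
    parts = [p for p in version.split(".") if p]
    if not parts:
        # Assumption: with no version information, use the bare distro tag.
        return [base]
    full = "_".join([base] + parts)       # e.g. linux_x86_64_debian_7_8
    major = "_".join([base, parts[0]])    # e.g. linux_x86_64_debian_7
    # Avoid listing the same tag twice when the version is major-only.
    return [full] if full == major else [full, major]
```

Switching the build default to the major-only tag would then amount to
picking the last element of the list instead of the first.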


> > I think the approach of being able to ask the *platform* for things
> > needed to build-or-use known artifacts is going to enable a bunch of
> > different answers in this space. I'm much more enthusiastic about that
> > than doing anything that ends up putting PyPI in competition with the
> > distribution space.
> >
> > My criteria for success are:
> >
> > - there's *a* migration path from what we have today to what we
> > propose. Doesn't have to be good, just exist.
> >
> >  - authors of scipy, numpy, cryptography etc can upload binary wheels
> > for *linux, Mac OSX and Windows 32/64 in a safe and sane way
> So the problem is that, IMO, "sane" here means "not building a
> separate wheel for every version of distro on distrowatch". So I can
> see two ways to do that:
> - my suggestion that we just pick a particular highly-compatible
> distro like centos 5 to build against, and make a standard list of
> which libraries can be assumed to be provided
> - the PEP-497-or-number-to-be-determined approach, in which we still
> have to pick a highly-compatible distro like centos 5 to build
> against, but each wheel has a list of which libraries from that distro
> it is counting on being provided
> I can see the appeal of the latter approach, since if you want to do
> the former approach right you need to be careful about exactly which
> libraries you're assuming are present, etc. They both could work. But
> in practice, you still have to pick which distro you are going to use
> to build, and you still have to say "when I say I need libblas.so.1,
> what I mean is that I need a file that is ABI-compatible with the
> version of libblas.so.1 that existed in centos 5 exactly, not any
> other libblas.so.1". And then in practice not every distro will have
> such a thing, so for a project like numpy that wants to make things
> easy for a wide variety of users, we'll still only be able to take
> advantage of external dependencies for libraries that are effectively
> universally available and compatible anyway and end up vendoring the
> rest... so in the end basically we'd be distributing exactly the same
> wheels under either of these proposals, just the latter requires a
> much much more complicated scheme for metadata and installation.
> And in practice I think the main alternative possibility if we don't
> come up with some solid guidance for how packages can build
> works-everywhere-wheels is that we'll see wheels for
> latest-version-of-Ubuntu-only, plus the occasional smattering of other
> distros, varying randomly on a project-by-project basis. Which would
> suck.
> >  - we don't need to do things like uploading wheels containing
> > non-Python shared libraries, nor upload statically linked modules
> >
> >
> > In fact, I think uploading regular .so files is just a huge heartache
> > waiting to happen, so I'm almost inclined to add:
> >
> >  -  we don't support uploading external non-Python libraries [ without
> > prejudice for changing our minds in the future]
> Windows and OS X don't (reliably) have any package manager. So PyPI
> *is* inevitably going to contain non-Python shared libraries or
> statically linked modules or something like that. (And in fact it
> already contains such things today.) I'm not sure what the alternative
> would even be.
> This also means that projects like numpy are already forced to accept
> that we're on the hook for security updates in our dependencies etc.,
> so doing it on Linux too is not really that scary.
> Oh, I just thought of another issue: an extremely important
> requirement for numpy/scipy/etc. wheels is that they be reliably
> installable without root access. People *really* care about this:
> missing your grant deadline b/c you can't upgrade some package to fix
> some showstopper bug b/c university IT support is not answering calls
> at midnight on Sunday = rather poor UX.
> Given that, the only situation I can see where we would ever
> distribute wheels that require system blas on Linux, is if we were
> able to do it alongside wheels that do not require system blas, and
> pip were clever enough to reliably always pick the latter except in
> cases where the system blas was actually present and working.
> > There was a post that referenced a numpy ABI, dunno if it was in this
> > thread - I need to drill down into that, because I don't understand
> > why that's not a regular version resolution problem, unlike the Python
> > ABI, which pip can't install [and shouldn't be able to!]
> The problem is that numpy is very unusual among Python packages in
> that it exposes a large and widely-used *C* API/ABI:
>     http://docs.scipy.org/doc/numpy/reference/c-api.html
> This means that when you build, e.g., scipy, then you get a binary
> that depends on things like the in-memory layout of numpy's internal
> objects. We'd like it to be the case that when we release a new
> version of numpy, pip could realize "hey, this new version says it has
> an incompatible ABI that will break your currently installed version
> of scipy -- I'd better fetch a new version of scipy as well, or at
> least rebuild the same version I already have". Notice that at the
> time scipy is built, it is not known which future version of numpy
> will require a rebuild. There are a lot of ways this might work on
> both the numpy and pip sides -- definitely fodder for a separate
> thread -- but that's the basic problem.
> -n
> --
> Nathaniel J. Smith -- http://vorpus.org