[Distutils] draft PEP: manylinux1

M.-A. Lemburg mal at egenix.com
Fri Jan 22 07:07:25 EST 2016

On 22.01.2016 12:25, Donald Stufft wrote:
>> On Jan 22, 2016, at 5:48 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>> Embedding additional libraries in the wheels files to overcome
>> deficiencies in the PEP design simply doesn't feel right
>> to me.
>>
>> People who rely on Linux distributions want to continue
>> to do so and get regular updates for system packages from
>> their system vendor. Having wheel files override these
>> system packages by including libs directly in the wheel
>> silently breaks this expectation, potentially opening
>> up the installations for security holes, difficult to
>> track bugs and possible version conflicts with already
>> loaded versions of the shared libs.
>>
>> IMO, that's much worse than having to install additional
>> system packages to make a Python wheel work.
>>
>> The embedding approach also creates licensing problems,
>> since those libs may be under different licenses than the
>> package itself. And of course, it increases the size of the
>> wheel files, causing more bandwidth to be necessary,
>> more disk space to be used for wheel caches, etc.
> I think there are a few things here, but this is not my area of expertise so I
> could be wrong. As I understand it, the manylinux platform definition is
> largely going to be a documentation effort and there isn't going to be much in
> the way of enforcement. That means that people who build wheels against the
> manylinux platform tag are free to really do whatever they want even if it
> doesn't strictly match the definition of the manylinux platform. The difference
> here is that if you link against something that isn't included in the set of
> libraries, and that subsequently breaks due to an ABI incompatibility, that's
> not a pip bug or a manylinux bug, that's a packaging bug with that particular
> library and they'll have to decide how they want to resolve it (if they want
> to resolve it). So you'll be free to link to anything you want, but you get to
> keep both pieces if it breaks and it's outside this defined set of libraries.

Hmm, if that were the reading, things would look a lot brighter,
but if PyPI starts to accept only manylinux wheels for Linux
platforms, the PEP effectively defines the set of allowed external
libraries and forces package authors to embed any other external
libraries into the wheel file - or not be able to upload wheel
files for Linux at all.

This can hardly be in the interest of Python users who don't want
to use wheel-embedded system libraries on their Linux system and
most likely also don't expect wheel files to ship alternative
versions of those libraries in the first place.

If we lifted the ban on "linux_*" tagged wheels on PyPI at
the same time as we allow "manylinux" wheels, that'd remove a lot
of my concerns.

In that case, I'd just like to see a way to tell pip not to install
manylinux wheels with embedded system libraries, or simply outright
reject embedded system libraries in manylinux wheel files.
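A rough sketch of how such a check could work, since a wheel is just a
zip archive: list the archive members and flag anything that looks like
a bundled shared library. The function name and the file name pattern
are assumptions made for illustration; this is not an existing pip
option, and the "lib*.so*" heuristic will miss libraries with unusual
names and may flag an extension module that happens to be named that way.

```python
import re
import zipfile

# Heuristic (assumption): bundled shared libraries are named like
# "libfoo.so", "libfoo.so.1" or "libfoo.so.1.0.2", while extension
# modules such as "_speedups.so" do not start with "lib".
_SHARED_LIB_RE = re.compile(r"^lib[^/]*\.so(\.\d+)*$")


def embedded_shared_libs(wheel_path):
    """Return the archive members that look like bundled shared libraries."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name
            for name in wheel.namelist()
            # Only the file name itself is matched, not the directory part.
            if _SHARED_LIB_RE.match(name.rsplit("/", 1)[-1])
        )
```

An installer or an upload check could then refuse, or at least warn
about, any wheel for which this returns a non-empty list.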

> I also agree that it's OK for users to have to ``apt-get`` (or whatever) a
> particular library to use something and we don't have to *only* rely on items
> that are installed as part of a "standard" linux base system. However, what is
> not OK (IMO) is for the PEP to bless something that has a high chance of ending
> up with ABI issues rather than "need to apt-get install" issues. For instance,
> even if you compile against a sufficiently old copy of OpenSSL, OpenSSL (to my
> understanding) does not have a stable ABI and you cannot take something
> compiled against OpenSSL on CentOS 5.reallyold and expect it to work on say
> Arch Linux.

True. There will always be incompatibilities out there which
cannot be addressed with a one-size-fits-all approach. For those
cases, vendor-specific wheels would need to be created.

> So I think there's an explicit list of packages that we know will generally
> work as long as you build against a sufficiently old copy of them and outside
> of that it's really a big unknown in general if a particular library can be
> used in this way or not. We obviously can't enumerate the list of every
> possible C library that has a stable ABI that can sanely be used cross distro
> but I think it's reasonable to list some sort of base minimum here, and if
> people experiment with stepping outside the standard list and can come to us
> and show "hey, I tried it with xyz library, we've gotten X installs and no
> complaints" we can then possibly expand the definition of the manylinux
> platform to include that library and move that project from depending on
> undefined behavior to defined behavior.
>
> Thinking of it in terms of a C like "undefined behavior" is probably a
> reasonable way of doing it. Linking against a system provided library that is
> on this list is a defined behavior of the manylinux "platform", linking against
> something else is undefined and may or may not work. At some level, once you've
> gotten to the point you're using pip to manage some set of your packages it
> doesn't really matter if that set of things you're pulling from PyPI includes
> a C library or not. If you're relying on say psycopg2 it's not clear to me that
> libpq *needs* to be getting security updates any more than psycopg2 itself does and so
> you'll need some method of solving that problem for your Python level
> dependencies anyways.

You need both: solving issues at the Python level and at the
system level.

However, system vendors will often be a lot faster with updates
than package authors, simply because it's their business model.
As a user, you will want to benefit from those updates and
not have to rely on the package author to ship new wheel files.
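
Coming back to the "defined set of libraries" idea quoted above, the
defined/undefined split could be sketched roughly as follows. The
allowed set below is an illustrative placeholder, not the list from
the draft PEP, and the function name is hypothetical. The needed
sonames are assumed to have been collected beforehand, e.g. with
ldd or readelf -d.

```python
# Illustrative placeholder only (assumption): NOT the list from the draft PEP.
ALLOWED_SONAMES = frozenset({
    "libc.so.6",
    "libm.so.6",
    "libdl.so.2",
    "libpthread.so.0",
})


def outside_defined_set(needed_sonames, allowed=ALLOWED_SONAMES):
    """Return, sorted, the needed sonames that are not in the allowed set.

    Anything returned here is what the quoted text calls "undefined
    behavior": it may or may not work on a given system.
    """
    return sorted(set(needed_sonames) - allowed)
```

An empty result means the extension only links against the defined
set; anything else is a library that would either have to be embedded
in the wheel or be provided by the system.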

Marc-Andre Lemburg

