Opinions on requiring younger glibc in manylinux1 wheel?
Hi,

According to recent messages, it seems manylinux2010 won't be ready soon. However, the baseline software in manylinux1 is becoming very old. As an example, a popular C++ library (Abseil - https://abseil.io/) requires a more recent glibc (see https://github.com/abseil/abseil-cpp/commit/add89fd0e4bfd7d874bb55b67f4e13bf...).

What do you think of publishing manylinux1 wheels that would require a more recent glibc? This is being discussed currently for the pyarrow package.

Regards
Antoine.
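[Editor's note: for concreteness, the incompatibility under discussion is the minimum glibc version a wheel's binaries require. A minimal sketch of checking the running system's glibc, assuming a glibc-based Linux; the "2.5" baseline is what manylinux1 (built on CentOS 5) guarantees, so a wheel needing anything newer silently breaks the manylinux1 contract:]

```python
import platform

def glibc_at_least(required):
    """Best-effort check that the running system's glibc is at least
    `required` (a "major.minor" string).

    Illustrative only: manylinux1 assumes glibc 2.5 (CentOS 5), so a wheel
    whose binaries need a newer glibc than the host provides will fail at
    import time with unresolved GLIBC_* symbol versions.
    """
    libc, version = platform.libc_ver()
    if libc != "glibc" or not version:
        return False  # non-glibc system (musl, macOS, ...) can't satisfy it
    parse = lambda v: tuple(int(x) for x in v.split("."))
    return parse(version) >= parse(required)

# e.g. glibc_at_least("2.5") is True on any manylinux1-compatible system,
# while glibc_at_least("999.0") is False everywhere.
```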
On Mon, Sep 17, 2018 at 11:24 AM Antoine Pitrou <antoine@python.org> wrote:
According to recent messages, it seems manylinux2010 won't be ready soon. However, the baseline software in manylinux1 is becoming very old. As an example, a popular C++ library (Abseil - https://abseil.io/) requires a more recent glibc (see https://github.com/abseil/abseil-cpp/commit/add89fd0e4bfd7d874bb55b67f4e13bf... ).
What do you think of publishing manylinux1 wheels that would require a more recent glibc? This is being discussed currently for the pyarrow package.
I think this will require updating the PEP, at the very least: https://www.python.org/dev/peps/pep-0513/#the-manylinux1-policy

I think we should focus our efforts on releasing manylinux2010 ASAP: https://www.python.org/dev/peps/pep-0571/

We need the community's help to review the PR here: https://github.com/pypa/manylinux/pull/182

Thanks,
Trishank
Trishank Kuppusamy wrote:
I think this will require updating the PEP, at the very least:
Sorry, there was a misunderstanding. Maybe I should have been clearer. My question was about publishing deliberately incompatible manylinux1 wheels (without changing the PEP). Regards Antoine.
On Mon, Sep 17, 2018 at 11:37 AM Antoine Pitrou <antoine@python.org> wrote:
Sorry, there was a misunderstanding. Maybe I should have been clearer. My question was about publishing deliberately incompatible manylinux1 wheels (without changing the PEP).
Ah, I see. Hmm, well, I guess this is all right on a private index, not PyPI, but that's just my 0.02 BTC. We are looking for help to review manylinux2010, though: https://github.com/pypa/manylinux/pull/182
On Mon, 17 Sep 2018 at 16:48, Trishank Kuppusamy <trishank.kuppusamy@datadoghq.com> wrote:
On Mon, Sep 17, 2018 at 11:37 AM Antoine Pitrou <antoine@python.org> wrote:
Sorry, there was a misunderstanding. Maybe I should have been clearer. My question was about publishing deliberately incompatible manylinux1 wheels (without changing the PEP).
Ah, I see. Hmm, well, I guess this is all right on a private index, not PyPI, but that's just my 0.02 BTC.
I'm not really familiar with manylinux1, but I'd be concerned if we started getting bug reports on pip because we installed a library that claimed to be manylinux1 and was failing because it wasn't. (And yes, packaging errors like this are a common source of pip bug reports). It seems to me that it's defeating the purpose of having standards if people aren't willing to follow them... Paul
On Mon, Sep 17, 2018 at 11:59 AM Paul Moore <p.f.moore@gmail.com> wrote:
I'm not really familiar with manylinux1, but I'd be concerned if we started getting bug reports on pip because we installed a library that claimed to be manylinux1 and was failing because it wasn't. (And yes, packaging errors like this are a common source of pip bug reports).
It seems to me that it's defeating the purpose of having standards if people aren't willing to follow them...
I agree with you. However, if people are using this on a private index, then I guess the onus is on them. I'm not married to this idea, though!
Paul Moore wrote:
I'm not really familiar with manylinux1, but I'd be concerned if we started getting bug reports on pip because we installed a library that claimed to be manylinux1 and was failing because it wasn't. (And yes, packaging errors like this are a common source of pip bug reports).
It seems to me that it's defeating the purpose of having standards if people aren't willing to follow them...
I agree with that. OTOH it seems providing binary wheels is generally a strong demand from the community. I would be fine with only providing conda packages myself.

By the way other packages are already doing worse: https://github.com/tensorflow/tensorflow/issues/8802

Regards
Antoine.
On Mon, Sep 17, 2018 at 6:07 PM Antoine Pitrou <antoine@python.org> wrote:
Paul Moore wrote:
I'm not really familiar with manylinux1, but I'd be concerned if we started getting bug reports on pip because we installed a library that claimed to be manylinux1 and was failing because it wasn't. (And yes, packaging errors like this are a common source of pip bug reports).
It seems to me that it's defeating the purpose of having standards if people aren't willing to follow them...
I agree with that. OTOH it seems providing binary wheels is generally a strong demand from the community. I would be fine with only providing conda packages myself.
The biggest demand seems to be for developer convenience of quick downloads / installs, and comes from people who have not delved very deep into the gnarly black arts of cross compilation and forwards / backwards compatibility maintenance.

Deployment bandwidth costs and install times are a second-tier use, but still a real concern to any parties who should consider sponsoring any effort going towards solving anything within the scope, as solving their gripes would save them money.

By the way other packages are already doing worse:

Domain-specific packages with real industry needs will need to deviate from any standard put forth, as the world of the bleeding edge moves faster than the standards can.

What a lot of packages would actually need is per-operating-system, per-distro, per-distro-version wheels, but that'd get quite insane quickly and put a lot of effort onto the package maintainers or the maintainers of the manylinux-esque build containers.

And even something like that will still spectacularly fall apart on macOS by stuff like building against 3rd party libraries from macports vs. fink vs. homebrew installed into /usr/local/ vs. homebrew installed into $HOME/.homebrew varying between the unsuspecting package maintainer / wheel builder and the end users of the wheel. Oddly enough this seems to be by far the least problematic on Windows.

--
Joni Orponen
On Mon, Sep 17, 2018, 18:51 Joni Orponen <j.orponen@4teamwork.ch> wrote:
On Mon, Sep 17, 2018 at 6:07 PM Antoine Pitrou <antoine@python.org> wrote:
Paul Moore wrote:
I'm not really familiar with manylinux1, but I'd be concerned if we started getting bug reports on pip because we installed a library that claimed to be manylinux1 and was failing because it wasn't. (And yes, packaging errors like this are a common source of pip bug reports).
It seems to me that it's defeating the purpose of having standards if people aren't willing to follow them...
I agree with that. OTOH it seems providing binary wheels is generally a strong demand from the community. I would be fine with only providing conda packages myself.
The biggest demand seems to be for developer convenience of quick downloads / installs, and comes from people who have not delved very deep into the gnarly black arts of cross compilation and forwards / backwards compatibility maintenance.
Deployment bandwidth costs and install times are a second-tier use, but still a real concern to any parties who should consider sponsoring any effort going towards solving anything within the scope, as solving their gripes would save them money.
By the way other packages are already doing worse:
Domain specific packages with real industry needs will need to deviate from any standard put forth as the world of the bleeding edge moves faster than the standards can.
What a lot of packages would actually need is per-operating-system, per-distro, per-distro-version wheels, but that'd get quite insane quickly and put a lot of effort onto the package maintainers or the maintainers of the manylinux-esque build containers.
I'm doubtful that there are many packages that "need" this. People don't do this on Windows or macOS, and those platforms seem to do ok. Still, we should have some way to describe such packages, so tensorflow can at least have accurate metadata, and for a variety of other use cases (Alpine, arm, conda, etc.).
And even something like that will still spectacularly fall apart on macOS by stuff like building against 3rd party libraries from macports vs. fink vs. homebrew installed into /usr/local/ vs. homebrew installed into $HOME/.homebrew varying between the unsuspecting package maintainer / wheel builder and the end users of the wheel.
This isn't really an issue. Whatever libraries you need should be vendored into the wheel with a tool like 'delocate', and then it doesn't matter what third-party package manager your end users do or don't use.
Oddly enough this seems to be by far the least problematic on Windows.
There's no real difference between Windows/macOS/Linux in terms of binary compatibility. On Windows people are more used to shipping everything with their package, that's all. If you do the same thing on macOS and Linux, it works great. -n
On Tue, 18 Sep 2018 at 11:51, Joni Orponen <j.orponen@4teamwork.ch> wrote:
On Mon, Sep 17, 2018 at 6:07 PM Antoine Pitrou <antoine@python.org> wrote:
Paul Moore wrote:
I'm not really familiar with manylinux1, but I'd be concerned if we started getting bug reports on pip because we installed a library that claimed to be manylinux1 and was failing because it wasn't. (And yes, packaging errors like this are a common source of pip bug reports).
It seems to me that it's defeating the purpose of having standards if people aren't willing to follow them...
I agree with that. OTOH it seems providing binary wheels is generally a strong demand from the community. I would be fine with only providing conda packages myself.
The biggest demand seems to be for developer convenience of quick downloads / installs, and comes from people who have not delved very deep into the gnarly black arts of cross compilation and forwards / backwards compatibility maintenance.
Deployment bandwidth costs and install times are a second-tier use, but still a real concern to any parties who should consider sponsoring any effort going towards solving anything within the scope, as solving their gripes would save them money.
By the way other packages are already doing worse: https://github.com/tensorflow/tensorflow/issues/8802
Domain-specific packages with real industry needs will need to deviate from any standard put forth, as the world of the bleeding edge moves faster than the standards can.
Implementing the necessary changes [1] to support manylinux2010 across the various tools and components isn't hard (so nobody is going to do it for the intellectual challenge), it's just tedious (so nobody is likely to do it for fun, either).

Unfortunately, even large companies like Google are mostly sitting back and passively consuming Python packaging projects maintained by folks in their spare time (see https://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.h...). Google could likely address the bulk of Python's near-term Linux packaging maintainability problems simply by advertising for a full-time "Python project packaging at Google" maintainer role, and setting whoever they hire loose on the problems of getting manylinux2010 rolled out, defining manylinux2014 (the RHEL/CentOS 7/Ubuntu 14.04 baseline), and defining official support for distro-specific wheels (such that they can publish their Ubuntu-only tensorflow packages properly, without messing up the package installation experience for users of other distros).

Google as a company knows how that works, since they do things properly for other aspects of Python development (such as gradual typing). That's why their deliberate violation of the manylinux1 spec with their tensorflow packages is so egregious: for whatever reason, it hasn't occurred to them that the only reason their tensorflow workflows are currently working without massive build times is because we're not actively enforcing spec compliance at the PyPI level. If PyPI were ever to start running auditwheel on upload, and subsequently hide manylinux wheels that failed the check (such that installers were able to properly meet the binary compatibility assurances nominally offered by the specifications), those non-compliant tensorflow wheels would stop being installable.

Cheers,
Nick.

[1] https://github.com/pypa/manylinux/issues/179

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Mon, 17 Sep 2018 at 17:08, Antoine Pitrou <antoine@python.org> wrote:
Paul Moore wrote:
I'm not really familiar with manylinux1, but I'd be concerned if we started getting bug reports on pip because we installed a library that claimed to be manylinux1 and was failing because it wasn't. (And yes, packaging errors like this are a common source of pip bug reports).
It seems to me that it's defeating the purpose of having standards if people aren't willing to follow them...
I agree with that. OTOH it seems providing binary wheels is generally a strong demand from the community. I would be fine with only providing conda packages myself.
I'm not going to argue conda vs wheels. As you've noted there's strong community interest in having wheels, but it's up to you whether you prefer to satisfy that interest or not.

I will say that if there are problems with the existing standards that cause you issues in distributing your packages as wheels, then working with the community to address those issues with the standards is what I'd recommend. It's hard to see how anyone other than you could be better placed to explain the issues you face and propose solutions. On the other hand, no-one's forcing you to participate in the standards process, it's fine (as I said) if you prefer not to distribute manylinux1 wheels.

What's not fine (IMO) is to effectively sabotage the standards process by distributing wheels that claim to conform to it but which actually don't. That makes life harder for everyone trying to make the standards work. I really appreciate the fact that you raised the question rather than simply doing so - once such wheels are published, it'd be hard to track what's going on, much less address the issue.
By the way other packages are already doing worse: https://github.com/tensorflow/tensorflow/issues/8802
I know tensorflow have issues that the current standards don't really address. I believe they are discussing solutions in various places like the packaging issues and the wheel tracker. I wasn't aware that in the meantime they were distributing non-conformant wheels. That's not good, as you yourself state in https://github.com/tensorflow/tensorflow/issues/8802#issuecomment-401703703.

FWIW, my understanding is that the libc restriction is to ensure compatibility with one of the older but supported RHEL versions (6?). I can appreciate that this might not be an important use case for some projects.

Some final thoughts:

1. It sounds like manylinux2010 may be what you want. If you (the general "you" here - any project for which manylinux1 isn't sufficient) can't help move that effort forward, then you'll probably have to wait until those who can get it finalised.

2. Maybe there's value in a tag that emphasises "current hardware" more than backward compatibility? I don't know how such a thing could be usefully defined - it may not even be possible in any real sense - but that's a whole other standard proposal.

Paul

PS I'm assuming we're talking about publishing *on PyPI* - what people do on a private index is their own concern.
On Mon, Sep 17, 2018, 08:25 Antoine Pitrou <antoine@python.org> wrote:
Hi,
According to recent messages, it seems manylinux2010 won't be ready soon. However, the baseline software in manylinux1 is becoming very old. As an example, a popular C++ library (Abseil - https://abseil.io/) requires a more recent glibc (see https://github.com/abseil/abseil-cpp/commit/add89fd0e4bfd7d874bb55b67f4e13bf... ).
What do you think of publishing manylinux1 wheels that would require a more recent glibc? This is being discussed currently for the pyarrow package.
It's naughty, you shouldn't do it, and the energy you put into making pseudo-manylinux1 wheels could probably be better put into finishing up the manylinux2010 work – there's not that much to do.

That said, if you do do it, then probably it'll work fine and no one will notice.

-n
Nathaniel Smith wrote:
It's naughty, you shouldn't do it, and the energy you put into making pseudo-manylinux1 wheels could probably be better put into finishing up the manylinux2010 work – there's not that much to do.
Can you explain what's missing? Paul Moore wrote:
1. It sounds like manylinux2010 may be what you want.
Definitely.
2. Maybe there's value in a tag that emphasises "current hardware" more than backward compatibility?
I would say there's value in having two official manylinux flavors at once, for example manylinux2010 for maximum compatibility (it's already 8 years old as far as requirements go!) and manylinux2016 for recent systems compatibility. Later, manylinux2022 gets released as the "recent systems compatibility" standard and manylinux2016 becomes the "maximum compatibility" flavor.

Regards
Antoine.
I would say there's value in having two official manylinux flavors at once, for example manylinux2010 for maximum compatibility (it's already 8 years old as far as requirements go!) and manylinux2016 for recent systems compatibility. Later, manylinux2022 gets released as the "recent systems compatibility" standard and manylinux2016 becomes the "maximum compatibility" flavor.
That's an interesting proposition. Would pip be able to automatically select the most recent compatible wheel when two are available on PyPI? -- Olivier
On Tue, 18 Sep 2018 at 13:45, Olivier Grisel <olivier.grisel@ensta.org> wrote:
I would say there's value in having two official manylinux flavors at once, for example manylinux2010 for maximum compatibility (it's already 8 years old as far as requirements go!) and manylinux2016 for recent systems compatibility. Later, manylinux2022 gets released as the "recent systems compatibility" standard and manylinux2016 becomes the "maximum compatibility" flavor.
That's an interesting proposition.
Would pip be able to automatically select the most recent compatible wheel when two are available on PyPI?
Pip determines a list of supported tag combinations and takes the first one that matches. So it's not "automatic" in the sense of needing nothing to be done; rather, someone would have to contribute code to pip that determined whether (for example) the platform supported manylinux2016, and if so added it to the list of supported tags ahead of manylinux2010. That code would need updating as newer standards like manylinux2022 were finalised.

Paul
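[Editor's note: the "ordered list, first match wins" selection Paul describes can be sketched with a toy priority list. The tag strings and the `pick_wheel` helper are hypothetical illustrations, not pip's actual data structures:]

```python
# Hypothetical priority order on a system new enough for manylinux2016:
# earlier entries are preferred over later ones.
SUPPORTED = [
    "cp37-cp37m-manylinux2016_x86_64",
    "cp37-cp37m-manylinux2010_x86_64",
    "cp37-cp37m-manylinux1_x86_64",
    "py3-none-any",
]

def pick_wheel(available):
    """Return the first supported tag that a published wheel matches."""
    for tag in SUPPORTED:
        if tag in available:
            return tag
    return None

# A project that only publishes manylinux1 + pure-Python wheels:
print(pick_wheel({"cp37-cp37m-manylinux1_x86_64", "py3-none-any"}))
# -> cp37-cp37m-manylinux1_x86_64
```

Adding manylinux2016 support would then just mean inserting its tag ahead of manylinux2010 in the supported list, exactly as Paul describes.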
On Sep 18, 2018, at 8:44 AM, Olivier Grisel <olivier.grisel@ensta.org> wrote:
That's an interesting proposition.
Would pip be able to automatically select the most recent compatible wheel when two are available on PyPI?
Yes. Well, “recent” isn't the right way to describe it. Basically, when pip is looking through the list of wheels for a given version that are all compatible with the current platform, it tries to select the most “specific” wheel for that platform (with the idea that the more specific the wheel is, the more likely it is to work and be performant etc). For the hypothetical of manylinux1, manylinux2010, manylinux2016, and manylinux2020, on a system that supported all of those, I don't see any reason why we wouldn't consider manylinux2020 to be more specific than manylinux2016, which is again more specific than 2010, and 1.

One nice side effect of this is that if we ever enable something like an “ubuntu X.Y” wheel, it doesn't mean that a project would have to target every possible platform. They could do something like publish a manylinux wheel to cover a wide range of Linux, but if a significant number of their users are on Ubuntu 18.04, they could *also* publish an ubuntu_18_4 wheel that would take precedence.

This even works correctly for pure Python fallbacks: for instance, if you have a C module that speeds up a pure Python module, you could publish a pure Python wheel, and then compiled wheels for whatever platforms you targeted, and pip would prefer the wheel with compiled code over the pure Python one.
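[Editor's note: the "most specific wheel wins" behaviour, including the distro-specific tag and the pure-Python fallback, can be sketched like this. The tags and the `best_wheel` helper are hypothetical, not pip's real implementation:]

```python
# Hypothetical specificity ranking: lower index = more specific = preferred.
PREFERENCE = [
    "cp37-cp37m-ubuntu_18_04_x86_64",   # hypothetical distro-specific tag
    "cp37-cp37m-manylinux2010_x86_64",  # broad Linux compatibility
    "cp37-cp37m-manylinux1_x86_64",
    "py3-none-any",                     # pure-Python fallback, least specific
]

def best_wheel(published):
    """Pick the most specific published wheel the platform supports."""
    for tag in PREFERENCE:
        if tag in published:
            return tag
    return None

# A project publishing a pure-Python wheel plus a compiled manylinux one:
# the compiled wheel wins over the pure-Python fallback.
print(best_wheel({"py3-none-any", "cp37-cp37m-manylinux1_x86_64"}))
# -> cp37-cp37m-manylinux1_x86_64
```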
On Tue, Sep 18, 2018, at 10:02 AM, Antoine Pitrou wrote:
Nathaniel Smith wrote:
It's naughty, you shouldn't do it, and the energy you put into making pseudo-manylinux1 wheels could probably be better put into finishing up the manylinux2010 work – there's not that much to do.
Can you explain what's missing?
There's a meta-issue here for the topic: https://github.com/pypa/manylinux/issues/179 Various bits of code need to be told that manylinux2010 exists, and where necessary, have some understanding of what it means.
participants (9)
- Antoine Pitrou
- Donald Stufft
- Joni Orponen
- Nathaniel Smith
- Nick Coghlan
- Olivier Grisel
- Paul Moore
- Thomas Kluyver
- Trishank Kuppusamy