
Hi all,

Here's a first draft of a PEP for the manylinux1 platform tag mentioned earlier, posted for feedback. Really Robert McGibbon should get the main credit for this, since he wrote it, and also the docker image and the amazing auditwheel tool linked below, but he asked me to do the honors of posting it :-).

BTW, if anyone wants to try this out, there are some test "manylinux1-compatible" wheels at https://vorpus.org/~njs/tmp/manylinux-test-wheels/repaired for PySide (i.e. Qt) and numpy (using openblas). They should be installable on any ordinary linux system with:

    pip install --no-index -f https://vorpus.org/~njs/tmp/manylinux-test-wheels/repaired $PKG

(Note that this may require a reasonably up-to-date pip -- e.g. the one in Debian is too old, which confused me for a bit.)

(How they were created: docker run -it quay.io/manylinux/manylinux bash; install conda to get builds of Qt and OpenBLAS, because I was too lazy to build them myself; pip wheel PySide / pip wheel numpy; auditwheel repair <the resulting wheels>, which copies in all the dependencies to make the wheels self-contained. Just proof-of-concept for now, but they seem to work.)

----

PEP: XXXX
Title: A Platform Tag for Portable Linux Built Distributions
Version: $Revision$
Last-Modified: $Date$
Author: Robert T. McGibbon <rmcgibbo@gmail.com>, Nathaniel J. Smith <njs@pobox.com>
Status: Draft
Type: Process
Content-Type: text/x-rst
Created: 19-Jan-2016
Post-History: 19-Jan-2016

Abstract
========

This PEP proposes the creation of a new platform tag for Python package built distributions, such as wheels, called ``manylinux1_{x86_64,i386}``, with external dependencies restricted to a standardized subset of the Linux kernel and core userspace ABI. It proposes that PyPI support uploading and distributing wheels with this platform tag, and that ``pip`` support downloading and installing these packages on compatible platforms.

Rationale
=========

Currently, distribution of binary Python extensions for Windows and OS X is straightforward. Developers and packagers build wheels, which are assigned platform tags such as ``win32`` or ``macosx_10_6_intel``, and upload these wheels to PyPI. Users can download and install these wheels using tools such as ``pip``.

For Linux, the situation is much more delicate. In general, compiled Python extension modules built on one Linux distribution will not work on other Linux distributions, or even on the same Linux distribution with different system libraries installed.

Build tools using PEP 425 platform tags [1]_ do not track information about the particular Linux distribution or installed system libraries, and instead assign all wheels the too-vague ``linux_i386`` or ``linux_x86_64`` tags. Because of this ambiguity, there is no expectation that ``linux``-tagged built distributions compiled on one machine will work properly on another, and for this reason, PyPI has not permitted the uploading of wheels for Linux.

It would be ideal if wheel packages could be compiled that would work on *any* linux system. But, because of the incredible diversity of Linux systems -- from PCs to Android to embedded systems with custom libcs -- this cannot be guaranteed in general.

Instead, we define a standard subset of the kernel+core userspace ABI that, in practice, is compatible enough that packages conforming to this standard will work on *many* linux systems, including essentially all of the desktop and server distributions in common use.
We know this because there are companies who have been distributing such widely-portable pre-compiled Python extension modules for Linux -- e.g. Enthought with Canopy [2]_ and Continuum Analytics with Anaconda [3]_.

Building on the compatibility lessons learned from these companies, we thus define a baseline ``manylinux1`` platform tag for use by binary Python wheels, and introduce the implementation of preliminary tools to aid in the construction of these ``manylinux1`` wheels.

Key Causes of Inter-Linux Binary Incompatibility
================================================

To properly define a standard that will guarantee that wheel packages meeting this specification will operate on *many* linux platforms, it is necessary to understand the root causes which often prevent portability of pre-compiled binaries on Linux. The two key causes are dependencies on shared libraries which are not present on users' systems, and dependencies on particular versions of certain core libraries like ``glibc``.

External Shared Libraries
-------------------------

Most desktop and server linux distributions come with a system package manager (examples include ``APT`` on Debian-based systems, ``yum`` on ``RPM``-based systems, and ``pacman`` on Arch linux) that manages, among other responsibilities, the installation of shared libraries installed to system directories such as ``/usr/lib``. Most non-trivial Python extensions will depend on one or more of these shared libraries, and thus function properly only on systems where the user has the proper libraries (and the proper versions thereof), either installed using their package manager, or installed manually by setting certain environment variables such as ``LD_LIBRARY_PATH`` to notify the runtime linker of the location of the depended-upon shared libraries.

Versioning of Core Shared Libraries
-----------------------------------

Even if the authors or maintainers of a Python extension module wish to use no external shared libraries, the module will generally have a dynamic runtime dependency on the GNU C library, ``glibc``. While it is possible to statically link ``glibc``, doing so is usually a bad idea because of bloat, and because certain important C functions like ``dlopen()`` cannot be called from code that statically links ``glibc``. A runtime shared library dependency on a system-provided ``glibc`` is unavoidable in practice.

The maintainers of the GNU C library follow a strict symbol versioning scheme for backward compatibility. This ensures that binaries compiled against an older version of ``glibc`` can run on systems that have a newer ``glibc``. The opposite is generally not true -- binaries compiled on newer Linux distributions tend to rely upon versioned functions in glibc that are not available on older systems.

This generally prevents built distributions compiled on the latest Linux distributions from being portable.

The ``manylinux1`` policy
=========================

For these reasons, to achieve broad portability, Python wheels

* should depend only on an extremely limited set of external shared libraries; and
* should depend only on ``old`` symbol versions in those external shared libraries.

The ``manylinux1`` policy thus encompasses a standard for which external shared libraries a wheel may depend on, and the maximum depended-upon symbol versions therein.
The permitted external shared libraries are: ::

    libpanelw.so.5
    libncursesw.so.5
    libgcc_s.so.1
    libstdc++.so.6
    libm.so.6
    libdl.so.2
    librt.so.1
    libcrypt.so.1
    libc.so.6
    libnsl.so.1
    libutil.so.1
    libpthread.so.0
    libX11.so.6
    libXext.so.6
    libXrender.so.1
    libICE.so.6
    libSM.so.6
    libGL.so.1
    libgobject-2.0.so.0
    libgthread-2.0.so.0
    libglib-2.0.so.0

On Debian-based systems, these libraries are provided by the packages: ::

    libncurses5 libgcc1 libstdc++6 libc6 libx11-6 libxext6
    libxrender1 libice6 libsm6 libgl1-mesa-glx libglib2.0-0

On RPM-based systems, these libraries are provided by the packages: ::

    ncurses libgcc libstdc++ glibc libXext libXrender
    libICE libSM mesa-libGL glib2

This list was compiled by checking the external shared library dependencies of the Canopy [2]_ and Anaconda [3]_ distributions, which both include a wide array of the most popular Python modules and have been confirmed in practice to work across a wide swath of Linux systems in the wild.

For dependencies on externally-provided versioned symbols in the above shared libraries, the following symbol versions are permitted: ::

    GLIBC <= 2.5
    CXXABI <= 3.4.8
    GLIBCXX <= 3.4.9
    GCC <= 4.2.0

These symbol versions were determined by inspecting the latest symbol version provided in the libraries distributed with CentOS 5, a Linux distribution released in April 2007. In practice, this means that Python wheels which conform to this policy should function on almost any linux distribution released after this date.

Compilation and Tooling
=======================

To support the compilation of wheels meeting the ``manylinux1`` standard, we provide initial drafts of two tools.

The first is a Docker image based on CentOS 5.11, which is recommended as an easy-to-use, self-contained build box for compiling ``manylinux1`` wheels [4]_. Compiling on a more recently-released linux distribution will generally introduce dependencies on too-new versioned symbols. The image comes with a full compiler suite installed (``gcc``, ``g++``, and ``gfortran`` 4.8.2) as well as the latest releases of Python and pip.

The second tool is a command line executable called ``auditwheel`` [5]_. First, it inspects all of the ELF files inside a wheel to check for dependencies on versioned symbols or external shared libraries, and verifies conformance with the ``manylinux1`` policy. This includes the ability to add the new platform tag to conforming wheels.

In addition, ``auditwheel`` has the ability to automatically modify wheels that depend on external shared libraries by copying those shared libraries from the system into the wheel itself, and modifying the appropriate RPATH entries such that these libraries will be picked up at runtime. This accomplishes a result similar to static linking, without requiring changes to the build system.

Neither of these tools is necessary to build wheels which conform with the ``manylinux1`` policy. Similar results can usually be achieved by statically linking external dependencies and/or using certain inline assembly constructs to instruct the linker to prefer older symbol versions; however, these tricks can be quite esoteric.

Platform Detection for Installers
=================================

Because the ``manylinux1`` profile is already known to work for the many thousands of users of popular commercial Python distributions, we suggest that installation tools like ``pip`` should err on the side of assuming that a system *is* compatible, unless there is specific reason to think otherwise.
We know of three main sources of potential incompatibility that are likely to arise in practice:

* A linux distribution that is too old (e.g. RHEL 4)
* A linux distribution that does not use glibc (e.g. Alpine Linux, which is based on musl libc, or Android)
* Eventually, in the future, there may exist distributions that break compatibility with this profile

To handle the first two cases, we propose the following simple and reliable check: ::

    def have_glibc_version(major, minimum_minor):
        import ctypes

        process_namespace = ctypes.CDLL(None)
        try:
            gnu_get_libc_version = process_namespace.gnu_get_libc_version
        except AttributeError:
            # We are not linked to glibc.
            return False

        gnu_get_libc_version.restype = ctypes.c_char_p
        version_str = gnu_get_libc_version()
        # py2 / py3 compatibility:
        if not isinstance(version_str, str):
            version_str = version_str.decode("ascii")

        version = [int(piece) for piece in version_str.split(".")]
        assert len(version) == 2
        if major != version[0]:
            return False
        if minimum_minor > version[1]:
            return False
        return True

    # CentOS 5 uses glibc 2.5.
    is_manylinux1_compatible = have_glibc_version(2, 5)

To handle the third case, we propose the creation of a file ``/etc/python/compatibility.cfg`` in ConfigParser format, with sample contents: ::

    [manylinux1]
    compatible = true

where the supported values for the ``manylinux1.compatible`` entry are the same as those supported by the ConfigParser ``getboolean`` method.

The proposed logic for ``pip`` or related tools, then, is:

0) If ``distutils.util.get_platform()`` does not start with the string ``"linux"``, then assume the current system is not ``manylinux1`` compatible.
1) If ``/etc/python/compatibility.cfg`` exists and contains a ``manylinux1`` key, then trust that.
2) Otherwise, if ``have_glibc_version(2, 5)`` returns true, then assume the current system can handle ``manylinux1`` wheels.
3) Otherwise, assume that the current system cannot handle ``manylinux1`` wheels.
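To make the procedure above concrete, here is a rough, illustrative sketch of how an installer might combine these steps (it is not part of the proposed specification; the helper name is arbitrary, and it reuses the ``have_glibc_version()`` function defined above): ::

    # Illustrative sketch only: one way an installer could implement the
    # decision procedure above, reusing have_glibc_version().
    import distutils.util

    try:
        import configparser                 # Python 3
    except ImportError:
        import ConfigParser as configparser # Python 2

    def is_manylinux1_compatible():
        # 0) Only Linux systems can be manylinux1 compatible.
        if not distutils.util.get_platform().startswith("linux"):
            return False
        # 1) An explicit override in /etc/python/compatibility.cfg wins.
        config = configparser.ConfigParser()
        if config.read("/etc/python/compatibility.cfg"):
            try:
                return config.getboolean("manylinux1", "compatible")
            except (configparser.NoSectionError,
                    configparser.NoOptionError, ValueError):
                pass
        # 2) / 3) Otherwise fall back to the glibc check
        #         (CentOS 5 ships glibc 2.5).
        return have_glibc_version(2, 5)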
Security Implications
=====================

One of the advantages of dependencies on centralized libraries in Linux is that bugfixes and security updates can be deployed system-wide, and applications which depend on these libraries will automatically feel the effects of these patches when the underlying libraries are updated. This can be particularly important for security updates in packages that communicate across the network or perform cryptography.

``manylinux1`` wheels distributed through PyPI that bundle security-critical libraries like OpenSSL will thus assume responsibility for prompt updates in response to disclosed vulnerabilities and patches. This closely parallels the security implications of the distribution of binary wheels on Windows, which, because the platform lacks a system package manager, generally bundle their dependencies. In particular, because it lacks a stable ABI, OpenSSL cannot be included in the ``manylinux1`` profile.

Rejected Alternatives
=====================

One alternative would be to provide separate platform tags for each Linux distribution (and each version thereof), e.g. ``RHEL6``, ``ubuntu14_10``, ``debian_jessie``, etc. Nothing in this proposal rules out the possibility of adding such platform tags in the future, or of further extensions to wheel metadata that would allow wheels to declare dependencies on external system-installed packages. However, such extensions would require substantially more work than this proposal, and still might not be appreciated by package developers who would prefer not to have to maintain multiple build environments and build multiple wheels in order to cover all the common Linux distributions. Therefore we consider such proposals to be out-of-scope for this PEP.

References
==========

.. [1] PEP 425 -- Compatibility Tags for Built Distributions
   (https://www.python.org/dev/peps/pep-0425/)
.. [2] Enthought Canopy Python Distribution
   (https://store.enthought.com/downloads/)
.. [3] Continuum Analytics Anaconda Python Distribution
   (https://www.continuum.io/downloads)
.. [4] manylinux1 docker image
   (https://quay.io/repository/manylinux/manylinux)
.. [5] auditwheel
   (https://pypi.python.org/pypi/auditwheel)

Copyright
=========

This document has been placed into the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

-- 
Nathaniel J. Smith -- https://vorpus.org

FWIW, really excited to be seeing progress on this!

*Randy Syring*
Husband | Father | Redeemed Sinner
/"For what does it profit a man to gain the whole world and forfeit his soul?" (Mark 8:36 ESV)/

On 21 January 2016 at 13:55, Nathaniel Smith <njs@pobox.com> wrote:
Hi all,
Here's a first draft of a PEP for the manylinux1 platform tag mentioned earlier, posted for feedback. Really Robert McGibbon should get the main credit for this, since he wrote it, and also the docker image and the amazing auditwheel tool linked below, but he asked me to do the honors of posting it :-).
Very nice! Do you have a link to the text file version of that? I can assign it a number and add it to the PEPs repo. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Hi Nick, The text version is here: https://raw.githubusercontent.com/manylinux/manylinux/master/pep-XXXX.rst -Robert On Wed, Jan 20, 2016 at 10:17 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 21 January 2016 at 13:55, Nathaniel Smith <njs@pobox.com> wrote:
Hi all,
Here's a first draft of a PEP for the manylinux1 platform tag mentioned earlier, posted for feedback. Really Robert McGibbon should get the main credit for this, since he wrote it, and also the docker image and the amazing auditwheel tool linked below, but he asked me to do the honors of posting it :-).
Very nice!
Do you have a link to the text file version of that? I can assign it a number and add it to the PEPs repo.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 21 January 2016 at 16:39, Robert McGibbon <rmcgibbo@gmail.com> wrote:
Hi Nick,
The text version is here: https://raw.githubusercontent.com/manylinux/manylinux/master/pep-XXXX.rst
Thanks - this is now PEP 513: https://www.python.org/dev/peps/pep-0513/ (404 response caching on the site is being its usual annoying self, but that will sort itself out before too long)

In addition to assigning the PEP number, I also set the BDFL-Delegate field (to me), fixed the PEP type (Process -> Informational) and set the Discussions-To header (distutils-sig), so if you could copy those changes back into your working copy, that would be great.

In terms of the content, it all looks reasonable to me personally, but we'll wait and see what everyone else has to say (especially the PyPI and pip developers).

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 21 January 2016 at 16:50, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 21 January 2016 at 16:39, Robert McGibbon <rmcgibbo@gmail.com> wrote:
Hi Nick,
The text version is here: https://raw.githubusercontent.com/manylinux/manylinux/master/pep-XXXX.rst
Thanks - this is now PEP 513: https://www.python.org/dev/peps/pep-0513/ (404 response caching on the site is being its usual annoying self, but that will sort itself out before too long)
Apparently this is the "between writing that and hitting send" variant of "before too long" :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Robert McGibbon: Thanks for writing up this PEP :-) Some comments below... On 21.01.2016 04:55, Nathaniel Smith wrote:
The ``manylinux1`` policy =========================
For these reasons, to achieve broad portability, Python wheels
* should depend only on an extremely limited set of external shared libraries; and * should depend only on ``old`` symbol versions in those external shared libraries.
The ``manylinux1`` policy thus encompasses a standard for which external shared libraries a wheel may depend on, and the maximum depended-upon symbol versions therein.
The permitted external shared libraries are: ::
libpanelw.so.5 libncursesw.so.5 libgcc_s.so.1 libstdc++.so.6 libm.so.6 libdl.so.2 librt.so.1 libcrypt.so.1 libc.so.6 libnsl.so.1 libutil.so.1 libpthread.so.0 libX11.so.6 libXext.so.6 libXrender.so.1 libICE.so.6 libSM.so.6 libGL.so.1 libgobject-2.0.so.0 libgthread-2.0.so.0 libglib-2.0.so.0
The list is a good start, but limiting the possible external references to only these libraries will make it impossible to have manylinux1 wheels which link against other, similarly old, but very common libraries, or alternatively rather new ones, which are then optionally included via subpackage instead of being mandatory.

At eGenix we have been tackling this problem for years with our extensions and the approach that's been the most successful was to simply use Linux build systems which are at least 5 years old. In our case, that's openSUSE 11.3.

I think a better approach is to use the above list to test for used library *versions* and then apply the tag based on the findings.

If a package includes binaries which link to e.g. later libc.so versions, it would be rejected. If it includes other libraries not listed in the above listing, that's fine, as long as these libraries also comply with the version limitation.

What I'm getting at here is that incompatibilities are not caused by libraries being absent on the system (the package simply won't load, but that's not due to the package being incompatible with the platform, only due to the system lacking a few packages), but instead by having the packages use more recent versions of these system libraries.
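For concreteness, one rough way to check which glibc symbol versions a built extension actually requires -- roughly the kind of inspection ``auditwheel`` automates -- is to parse the dynamic symbol table, e.g. using binutils' ``objdump``. This is an illustrative sketch only, not code from the PEP or the tools it references:

    # Illustrative sketch: list the glibc symbol versions an ELF file requires,
    # by parsing the dynamic symbol table printed by binutils' objdump.
    import re
    import subprocess

    def required_glibc_versions(path):
        output = subprocess.check_output(["objdump", "-T", path])
        output = output.decode("utf-8", "replace")
        # Undefined dynamic symbols reference versions like "GLIBC_2.14".
        versions = set(re.findall(r"GLIBC_(\d+\.\d+(?:\.\d+)?)", output))
        return sorted(versions, key=lambda v: [int(x) for x in v.split(".")])

    # e.g. required_glibc_versions("foo.cpython-35m-x86_64-linux-gnu.so")
    # -> ['2.2.5', '2.4', '2.14']  (a 2.14 requirement would violate manylinux1)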
Compilation and Tooling =======================
To support the compilation of wheels meeting the ``manylinux1`` standard, we provide initial drafts of two tools.
The first is a Docker image based on CentOS 5.11, which is recommended as an easy to use self-contained build box for compiling ``manylinux1`` wheels [4]_. Compiling on a more recently-released linux distribution will generally introduce dependencies on too-new versioned symbols. The image comes with a full compiler suite installed (``gcc``, ``g++``, and ``gfortran`` 4.8.2) as well as the latest releases of Python and pip.
The second tool is a command line executable called ``auditwheel`` [5]_. First, it inspects all of the ELF files inside a wheel to check for dependencies on versioned symbols or external shared libraries, and verifies conformance with the ``manylinux1`` policy. This includes the ability to add the new platform tag to conforming wheels.
In addition, ``auditwheel`` has the ability to automatically modify wheels that depend on external shared libraries by copying those shared libraries from the system into the wheel itself, and modifying the appropriate RPATH entries such that these libraries will be picked up at runtime. This accomplishes a similar result as if the libraries had been statically linked without requiring changes to the build system.
This approach has a few problems:

* Libraries typically depend on a lot more context than just the code that is provided in the library file itself, e.g. config files, external resources, other libraries which are loaded on demand, etc.

* By including the libraries in the wheel you are distributing the binary, which can lead to licensing problems, esp. with GPLed or LGPLed code.
Neither of these tools is necessary to build wheels which conform with the ``manylinux1`` policy. Similar results can usually be achieved by statically linking external dependencies and/or using certain inline assembly constructs to instruct the linker to prefer older symbol versions; however, these tricks can be quite esoteric.
Static linking only helps in very few cases, where the context needed for the external library to work is minimal.
Platform Detection for Installers =================================
Because the ``manylinux1`` profile is already known to work for the many thousands of users of popular commercial Python distributions, we suggest that installation tools like ``pip`` should err on the side of assuming that a system *is* compatible, unless there is specific reason to think otherwise.
We know of three main sources of potential incompatibility that are likely to arise in practice:
* A linux distribution that is too old (e.g. RHEL 4) * A linux distribution that does not use glibc (e.g. Alpine Linux, which is based on musl libc, or Android) * Eventually, in the future, there may exist distributions that break compatibility with this profile
To handle the first two cases, we propose the following simple and reliable check: :: def have_glibc_version(major, minimum_minor): [...]
It would be better to use platform.libc_ver() for this.
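For readers unfamiliar with it, a minimal example of what that call returns (it scans the Python executable for embedded libc version strings; whether that heuristic is reliable enough here is a separate question):

    # Minimal example of platform.libc_ver(): returns a (lib, version) tuple.
    import platform

    lib, version = platform.libc_ver()
    print(lib, version)   # e.g. "glibc 2.17" on a CentOS 7 system;
                          # ("", "") if nothing could be detected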
``manylinux1`` wheels distributed through PyPI that bundle security-critical libraries like OpenSSL will thus assume responsibility for prompt updates in response to disclosed vulnerabilities and patches. This closely parallels the security implications of the distribution of binary wheels on Windows, which, because the platform lacks a system package manager, generally bundle their dependencies. In particular, because it lacks a stable ABI, OpenSSL cannot be included in the ``manylinux1`` profile.
The OpenSSL ABI has been quite stable in recent years (unlike in the days of 0.9.7 and earlier).

Since many libraries do link against OpenSSL (basically everything that uses network connections nowadays), using the fixed scheme outlined in the PEP would severely limit its usefulness.

By using the version based approach, we'd not run into this problem and gain a lot more.

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 21 2016)
Python Projects, Coaching and Consulting ... http://www.egenix.com/ Python Database Interfaces ... http://products.egenix.com/ Plone/Zope Database Interfaces ... http://zope.egenix.com/
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/

After I posted the PEP link on social media, a friend of mine, Kyle Beauchamp, asked: "I wonder if there are speed and correctness implications to always reverting to the lowest common denominator of glibc."

For speed, I believe there is some consequence, in that the maximum gcc version you can realistically use with glibc <= 2.5 is gcc 4.8.2, which is presumably somewhat (although not much) slower than the latest gcc release. As to correctness, it seems like a reasonable concern, and I don't know one way or the other. I thought maybe someone on the list would know.

There are also potential compatibility issues in wheels that use certain instruction set extensions (SSE 4, AVX, etc) that might not be available on certain platforms. I think policies on this issue are better left up to individual packages than set at a PEP-wide level, but it also may be worth talking about.

-Robert
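As a rough illustration (not part of the PEP) of the kind of runtime check an individual package could perform before dispatching to SSE/AVX code paths, CPU feature flags can be read from /proc/cpuinfo on Linux:

    # Rough illustration: check x86 CPU feature flags on Linux via /proc/cpuinfo.
    def cpu_flags():
        flags = set()
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    flags.update(line.split(":", 1)[1].split())
        return flags

    flags = cpu_flags()
    has_avx = "avx" in flags          # AVX support
    has_sse4_2 = "sse4_2" in flags    # SSE 4.2 support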

On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.

The key is that we only have one chance to make a good first impression with binary Linux wheel support on PyPI, and we want that to be positive for everyone:

* for publishers, going from "no Linux wheels" to "Linux wheels if you have few external dependencies beyond glibc" is a big step up (it's enough for a Cython accelerator module, for example, or a cffi wrapper around a bundled library)
* for end users, we need to be nigh certain that wheels built this way will *just work*

Even with a small starting list of libraries defined, we're going to end up with cases where the installed extension module will fail to load, and end users will have to figure out what dependencies are missing. The "external dependency specification" at https://github.com/pypa/interoperability-peps/pull/30 would let pip detect that at install time (rather than the user finding out at runtime when the module fails to load), but that will still leave the end user to figure out how to get the external dependencies installed.

If Donald can provide the list of "most downloaded wheel files" for other platforms, that could also be a useful guide as to how many source builds may potentially already be avoided through the draft "manylinux1" definition.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, Jan 21, 2016 at 1:31 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
The key is that we only have one chance to make a good first impression with binary Linux wheel support on PyPI, and we want that to be positive for everyone:
* for publishers, going from "no Linux wheels" to "Linux wheels if you have few external dependencies beyond glibc" is a big step up (it's enough for a Cython accelerator module, for example, or a cffi wrapper around a bundled library)
* for end users, we need to be nigh certain that wheels built this way will *just work*
In general, I see a tension here in how permissive the policy should be, with good arguments on both sides. A restrictive policy (like the one we propose) will keep some wheels off PyPI that would work just fine on most Linux boxes. But it will also ensure that fewer broken packages are uploaded.

In my opinion, the packaging system we have currently works pretty well. Adopting a loose policy could therefore be experienced as a regression for users who type ``pip install <package>`` and receive a broken binary wheel. This is one of the reasons we thought that it would be safest to start small and work incrementally.

-Robert

Another wrinkle: it should also be possible, I think, for wheels to include binaries that are linked against shared libraries that are provided by other wheels (that are part of the install_requires), but I haven't thought much about this. I know Nathaniel has been thinking about this type of thing for a separate BLAS wheel package that, for example, numpy could depend on.

I assume that this type of thing should be kosher within the manylinux1 policy, although the text of the draft PEP does not explicitly address it.

-Robert
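One technique already used in the wild for this (illustrative only, with hypothetical package and file names -- not something the PEP specifies) is for the dependent package to load the other wheel's library with RTLD_GLOBAL before importing its own extension modules, so the dynamic linker can resolve the shared symbols:

    # Hypothetical sketch: load a shared library shipped inside a separate
    # "blas_wheel" dependency before importing our compiled extension.
    import ctypes
    import os.path

    import blas_wheel  # hypothetical wheel that ships libopenblas in its package dir

    _libdir = os.path.dirname(blas_wheel.__file__)
    _lib = ctypes.CDLL(os.path.join(_libdir, "libopenblas.so.0"),
                       mode=ctypes.RTLD_GLOBAL)  # make symbols globally visible

    # now importing the compiled extension can resolve the BLAS symbols, e.g.:
    # from _mypkg import _linalg   (hypothetical extension module)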

On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.

It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).

IMO, testing the versions of a set of libraries is a safer approach. It's perfectly fine to have a few dependencies not work in a module because an optional system package is not installed, e.g. say a package comes with UIs written in Qt and one in GTK. pip could then warn about missing dependencies in the installed packages.

Another detail we have found when dealing with external dependencies is that some platforms use different names for the libraries, e.g. RedHat has a tendency to use non-standard OpenSSL library names (/lib64/libssl.so.10 instead of the more common libssl.so.1.0.0).
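(As an aside, an illustrative way to see which soname a particular system actually exposes is ctypes.util.find_library, which consults the ldconfig cache on Linux; the example values shown are typical but not guaranteed:)

    # Illustration: discover the actual soname of a library on this system.
    from ctypes import util

    print(util.find_library("ssl"))   # e.g. 'libssl.so.10' on RHEL/CentOS,
                                      # 'libssl.so.1.0.0' on Debian/Ubuntu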
The key is that we only have one chance to make a good first impression with binary Linux wheel support on PyPI, and we want that to be positive for everyone:
Sure, but if we get the concept wrong, it'll be difficult to switch later on, and since there will always be libs not in the set, we'll need to address this in some way.
* for publishers, going from "no Linux wheels" to "Linux wheels if you have few external dependencies beyond glibc" is a big step up (it's enough for a Cython accelerator module, for example, or a cffi wrapper around a bundled library)
* for end users, we need to be nigh certain that wheels built this way will *just work*
Even with a small starting list of libraries defined, we're going to end up with cases where the installed extension module will fail to load, and end users will have to figure out what dependencies are missing. The "external dependency specification" at https://github.com/pypa/interoperability-peps/pull/30 would let pip detect that at install time (rather the user finding out at runtime when the module fails to load), but that will still leave the end user to figure out how to get the external dependencies installed.
If Donald can provide the list of "most downloaded wheel files" for other platforms, that could also be a useful guide as to how many source builds may potentially already be avoided through the draft "manylinux1" definition.
I still believe that installers are in a better position to decide which binary files to install, based on what they find on the installation system. This is why we are using a more flexible tag system in our prebuilt format: http://www.egenix.com/library/presentations/PyCon-UK-2014-Python-Web-Install...

In essence, the installer knows which files are available and can then analyze the installation system to pick the right binary. Tags can be added as necessary to address all the different dimensions that need testing, e.g. whether the binary runs on a Raspi2 or only a Raspi1.

-- Marc-Andre Lemburg, eGenix.com

On 21 January 2016 at 20:05, M.-A. Lemburg <mal@egenix.com> wrote:
On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.
It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).
IMO, testing the versions of a set of libraries is a safer approach.
I still don't really understand what you mean by "testing the versions of a set of libraries", but if you have the time available to propose a competing PEP, that always leads to a stronger result than when we have only one proposed approach to consider.

Regards, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 21.01.2016 11:11, Nick Coghlan wrote:
On 21 January 2016 at 20:05, M.-A. Lemburg <mal@egenix.com> wrote:
On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.
It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).
IMO, testing the versions of a set of libraries is a safer approach.
I still don't really understand what you mean by "testing the versions of a set of libraries", but if you have the time available to propose a competing PEP, that always leads to a stronger result than when we have only one proposed approach to consider.
I think the PEP is fine as it is; just the restriction to testing for library file names is something that would need to be changed to implement the different approach: """ For these reasons, we define a set of library versions that are supported by a wide range of Linux distributions. We therefore pick library versions which have been around for at least 5 years. When using these external libraries, Python wheels should only depend on library versions listed in the section below. Python wheels are free to depend on additional libraries not included in this set; however, care should be taken that these additional libraries do not depend on later versions of the listed libraries, e.g. OpenSSL libraries compiled against the C library versions listed below. The ``manylinux1`` policy thus encompasses a standard for which versions of these external shared libraries a wheel may depend on, and the maximum depended-upon symbol versions therein. Future versions of the manylinux policy may include more libraries, or move on to later versions. The permitted external shared libraries and versions for ``manylinux1`` are: :: libpanelw.so.5 libncursesw.so.5 libgcc_s.so.1 libstdc++.so.6 ... """ This will still lead to cases where a package doesn't work because of missing system packages, but at least they won't fail due to a mismatch in basic C library versions, which is the most problematic case for users. The PEP will also have to address the problems introduced by versioned symbols in more recent Linux shared libs: even though the library file names have not changed, they may well include different support levels for the various APIs, e.g. glibc 2.1, 2.2, 2.3, etc. For our binaries, we have chosen to use a system where this versioning has not yet been enabled for system libs. We did this because we found that using a library compiled against a versioned lib on a system which comes with an unversioned lib causes warnings to be issued. For openSUSE the change was applied between the 11.3 and 11.4 releases. Some references which showcase the problem: - http://stackoverflow.com/questions/137773/what-does-the-no-version-informati... - http://superuser.com/questions/305055/how-to-diagnosis-and-resolve-usr-lib64... - http://forums.opensuse.org/english/get-technical-help-here/applications/4665... -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 21 2016)
Python Projects, Coaching and Consulting ... http://www.egenix.com/ Python Database Interfaces ... http://products.egenix.com/ Plone/Zope Database Interfaces ... http://zope.egenix.com/
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/
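As a side note, the "testing the versions of a set of libraries" idea is easy to prototype for the most important case, glibc. The sketch below assumes a glibc-based system (gnu_get_libc_version() is glibc-specific) and uses an illustrative minimum of 2.5; a real policy would pin versions for each library in the set::

    # Sketch of an installer-side version check against a pinned minimum.
    import ctypes

    def glibc_version():
        libc = ctypes.CDLL("libc.so.6")
        libc.gnu_get_libc_version.restype = ctypes.c_char_p
        return libc.gnu_get_libc_version().decode("ascii")

    MIN_GLIBC = (2, 5)  # illustrative pin only, not part of any proposal

    def glibc_compatible():
        major, minor = (int(p) for p in glibc_version().split(".")[:2])
        return (major, minor) >= MIN_GLIBC

    print(glibc_version(), glibc_compatible())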

On Jan 21, 2016 2:07 AM, "M.-A. Lemburg" <mal@egenix.com> wrote:
On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.
It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).
The list of allowed libraries is exactly the same list of libraries as are required by the Anaconda python distribution, so we *know* that it works for about a hundred different python packages, including lots of tricky ones (the whole scientific stack), and has been tested by tens or hundreds of thousands of users. (I posted a link above to some actual working, compliant pyside and numpy packages, which we picked for testing because they're particularly interesting/difficult packages that need to interface to external libraries.) Yes, various extra tricks are needed to get things working on top of this base, including strategies for shipping libraries that are not in the baseline set, but these are problems that can be solved on a project-by-project basis, and don't need a PEP. [...]
Another detail we have found when dealing with external dependencies is that some platforms use different names for the libraries, e.g. RedHat has a tendency to use non-standard OpenSSL library names (/lib64/libssl.so.10 instead of the more common libssl.so.1.0.0).
The key is that we only have one chance to make a good first impression with binary Linux wheel support on PyPI, and we want that to be positive for everyone:
Sure, but if we get the concept wrong, it'll be difficult to switch later on and since there will always be libs not in the set, we'll need to address this in some way.
There's no lock-in here -- any alternative approach just needs its own platform tag. PyPI and pip can easily support multiple such tags at the same time, if more sophisticated proposals arise in the future. In the meantime, for packagers, targeting manylinux is at least as easy as targeting Windows (which also provides very few libraries "for free"). -n
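For what it's worth, the additive nature of platform tags is easy to see using the present-day ``packaging`` library (which exposes the same tag machinery pip uses, and postdates this thread): every tag an interpreter can install is just another entry in its supported-tag list, so several coexisting Linux tags are unremarkable::

    # Requires the third-party "packaging" distribution.
    from packaging import tags

    # Each supported tag is one row; new tag families simply add rows.
    for tag in list(tags.sys_tags())[:15]:
        print(tag)  # e.g. cp39-cp39-manylinux_2_17_x86_64, cp39-cp39-linux_x86_64, ...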

On 21.01.2016 17:13, Nathaniel Smith wrote:
On Jan 21, 2016 2:07 AM, "M.-A. Lemburg" <mal@egenix.com> wrote:
On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.
It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).
The list of allowed libraries is exactly the same list of libraries as are required by the Anaconda python distribution, so we *know* that it works for about a hundred different python packages, including lots of tricky ones (the whole scientific stack), and has been tested by tens or hundreds of thousands of users. (I posted a link above to some actual working, compliant pyside and numpy packages, which we picked for testing because they're particularly interesting/difficult packages that need to interface to external libraries.) Yes, various extra tricks are needed to get things working on top of this base, including strategies for shipping libraries that are not in the baseline set, but these are problems that can be solved on a project-by-project basis, and don't need a PEP.
And that's the problem: The set is limited to the needs of the scientific community and there to the users of one or two distributions only. It doesn't address needs of others that e.g. use Qt or GTK as basis for GUIs, people using OpenSSL for networking, people using ImageMagick for processing images, or type libs for typesetting, or sound libs for doing sound processing, codec libs for video processing, etc. The idea to include the needed shared libs in the wheel goes completely against the idea of relying on a system vendor to provide updates and security fixes. In some cases, this may be reasonable, but as a design approach, it's not a good idea.
[...]
Another detail we have found when dealing with external dependencies is that some platforms use different names for the libraries, e.g. RedHat has a tendency to use non-standard OpenSSL library names (/lib64/libssl.so.10 instead of the more common libssl.so.1.0.0).
The key is that we only have one chance to make a good first impression with binary Linux wheel support on PyPI, and we want that to be positive for everyone:
Sure, but if we get the concept wrong, it'll be difficult to switch later on and since there will always be libs not in the set, we'll need to address this in some way.
There's no lock-in here -- any alternative approach just needs its own platform tag. PyPI and pip can easily support multiple such tags at the same time, if more sophisticated proposals arise in the future. In the meantime, for packagers, targeting manylinux is at least as easy as targeting Windows (which also provides very few libraries "for free").
Yes, there's no lock-in, there's lock-out :-) We'd simply not allow people who have other requirements to upload Linux wheels to PyPI and that's not really acceptable. Right now, no one can upload Linux wheels, so that's a fair setup. Using an approach where every single group first has to write a PEP, get it accepted and have PyPI and pip patched before they can upload wheels to PyPI does not read like a community-friendly approach. I also can't imagine that we really want proliferation of "linux" tags for various purposes or even for various versions of library catalogs. What we need is a system that provides a few dimensions for various system specific differences (e.g. bitness, architecture) and a recommendation for library versions of a few very basic libraries to use when compiling for those systems. I believe the PEP is almost there, it just needs to use a slightly different concept, since limiting the set of allowed libraries does not provide the basis for the open system that PyPI/pip/wheels need to be. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 21 2016)
Python Projects, Coaching and Consulting ... http://www.egenix.com/ Python Database Interfaces ... http://products.egenix.com/ Plone/Zope Database Interfaces ... http://zope.egenix.com/
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/

On Thu, Jan 21, 2016 at 8:53 AM, M.-A. Lemburg <mal@egenix.com> wrote:
[...] What we need is a system that provides a few dimensions for various system specific differences (e.g. bitness, architecture) and a recommendation for library versions of a few very basic libraries to use when compiling for those systems.
I believe the PEP is almost there, it just needs to use a slightly different concept, since limiting the set of allowed libraries does not provide the basis of an open system which PyPI/pip/wheels need to be.
Sorry, I still don't think I quite understand. Is your position essentially that we should allow wheels to link against any system library included in (for example) stock openSUSE 11.5? Or that we should allow wheels to link against any library that can be installed into openSUSE 11.5 using `yum install <library>`? -Robert

On Jan 21, 2016 8:53 AM, "M.-A. Lemburg" <mal@egenix.com> wrote:
[...]
And that's the problem: The set is limited to the needs of the scientific community and there to the users of one or two distributions only.
It doesn't address needs of others that e.g. use Qt or GTK as basis for GUIs, people using OpenSSL for networking, people using ImageMagick for processing images, or type libs for type setting, or sound libs for doing sound processing, codec libs for video processing, etc.
I've pointed out several times now that our first test package was Qt bindings, and Glyph told us last week that this proposal is exactly how the cryptography package wants to handle their openssl dependency: https://www.mail-archive.com/distutils-sig@python.org/msg23506.html So this paragraph is just you making stuff up. Is manylinux1 the perfect panacea for every package? Probably not. In particular it's great for popular cross platform packages, because it works now and means they can basically reuse the work that they're already doing to make static windows and OSX wheels; it's less optimal for smaller Linux-specific packages that might prefer to take more advantage of Linux's unique package management functionality and only care about targeting one or two distros.
The idea to include the needed shared libs in the wheel goes completely against the idea of relying on a system vendor to provide updates and security fixes. In some cases, this may be reasonable, but as a design approach, it's not a good idea.
[...]
Another detail we have found when dealing with external dependencies is that some platforms use different names for the libraries, e.g. RedHat has a tendency to use non-standard OpenSSL library names (/lib64/libssl.so.10 instead of the more common libssl.so.1.0.0).
The key is that we only have one chance to make a good first impression with binary Linux wheel support on PyPI, and we want that to be positive for everyone:
Sure, but if we get the concept wrong, it'll be difficult to switch later on and since there will always be libs not in the set, we'll need to address this in some way.
There's no lock-in here -- any alternative approach just needs its own platform tag. PyPI and pip can easily support multiple such tags at the same time, if more sophisticated proposals arise in the future. In the meantime, for packagers, targeting manylinux is at least as easy as targeting Windows (which also provides very few libraries "for free").
Yes, there's no lock-in, there's lock-out :-)
We'd simply not allow people who have other requirements to upload Linux wheels to PyPI and that's not really acceptable.
Right now, no one can upload Linux wheels, so that a fair setup.
The fairness that I'm more worried about is that right now Windows and OSX users get wheels, and Linux users don't. Feature parity across these platforms isn't everything, but it's a good start.
Using an approach where every single group first has to write a PEP, get it accepted and have PyPI and pip patched before they can upload wheels to PyPI does not read like a community friendly approach.
I also can't imagine that we really want proliferation of "linux" tags for various purposes or even for various versions of library catalogs.
What we need is a system that provides a few dimensions for various system specific differences (e.g. bitness, architecture) and a recommendation for library versions of a few very basic libraries to use when compiling for those systems.
I believe the PEP is almost there, it just needs to use a slightly different concept, since limiting the set of allowed libraries does not provide the basis of an open system which PyPI/pip/wheels need to be.
Like Nick, I'm still not sure what this "slightly different concept" you keep referring to is. -n

On Thu, Jan 21, 2016 at 11:37 AM, Nathaniel Smith <njs@pobox.com> wrote:
Glyph told us last week that this proposal is exactly how the cryptography package wants to handle their openssl dependency: https://www.mail-archive.com/distutils-sig@python.org/msg23506.html
well, SSL is a pretty unique case -- there's one where controlling the version of the lib, and having it be recent, is critical.
We will have issues with all sorts of other "Pretty common, but can't count on it" libs.
Is manylinux1 the perfect panacea for every package? Probably not. In particular it's great for popular cross platform packages, because it works now and means they can basically reuse the work that
they're already doing to make static windows and OSX wheels;
except that static linking is a pain on Linux -- the toolchain really doesn't want you to do that :-) It's also not part of the culture. Windows is "working" because of Christoph Gohlke's heroic efforts. OS-X is kind-of sort of working, because of Matthew Brett's also heroic efforts. But Anaconda and Canopy exist, and are popular, for a reason -- they solve a very real problem, and manylinux is only solving a very small part of that problem -- the easy part. Maybe there will be a Gohlke-like figure that will step up and build statically linked wheels for all sorts of stuff -- but is that the end-game we want anyway? everything statically linked? One plus -- with Docker and CI systems, it's getting pretty easy to set up a build sandbox that only has the manylinux libs on it -- so not too hard to automate and test your builds....
The idea to include the needed share libs in the wheel
goes completely against the idea of relying on a system vendor to provide updates and security fixes. In some cases, this may be reasonable, but as design approach, it's not a good idea.
Is this any different than static linking -- probably not. And that's pretty much what I mean by the culture of dynamic linking on Linux. On Windows, dll hell is such that we've all accepted that we're going to need to statically link and provide dlls. On OS-X, at least the base system is pretty well defined -- though I sure wish they'd supply more of what really should be basic libs. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Jan 21, 2016 11:55 AM, "Chris Barker" <chris.barker@noaa.gov> wrote:
On Thu, Jan 21, 2016 at 11:37 AM, Nathaniel Smith <njs@pobox.com> wrote:
Glyph told us last week that this proposal is exactly how the cryptography package wants to handle their openssl dependency: https://www.mail-archive.com/distutils-sig@python.org/msg23506.html
well, SSL is a pretty unique case -- there's one where controlling the version of the lib, and having it be recent, is critical.
We will have issues with all sorts of other "Pretty common, but can't count on it" libs.
Is manylinux1 the perfect panacea for every package? Probably not. In particular it's great for popular cross platform packages, because it works now and means they can basically reuse the work that they're already doing to make static windows and OSX wheels;
except that static linking is a pain on Linux -- the toolchain really doesn't want you to do that :-) It's also not part of the culture.
Windows is "working" because of Christoph Gohlke's heroic efforts.
OS-X is kind-of sort of working, because of Matthew Brett's also heroic efforts.
But Anaconda and Canopy exist, and are popular, for a reason -- they solve a very real problem, and manylinux is only solving a very small part of that problem -- the easy part.
Maybe there will be a Gohlke-like figure that will step up and build statically linked wheels for all sorts of stuff -- but is that the end-game we want anyway? everything statically linked?
One plus -- with Docker and CI systems, it's getting pretty easy to set up a build sandbox that only has the manylinux libs on it -- so not too hard to automate and test your builds....
The idea to include the needed shared libs in the wheel goes completely against the idea of relying on a system vendor to provide updates and security fixes. In some cases, this may be reasonable, but as a design approach, it's not a good idea.
Is this any different than static linking -- probably not. And that's pretty much what I mean by the culture of dynamic linking on Linux.
The difference between static linking and vendoring the shared libraries into the wheel using ``auditwheel repair`` is essentially just that the second option is much easier because it doesn't require modifications to the build system. Other than that, they're about the same. This is why Nathaniel was able to make a PySide wheel in 5 minutes. I don't think heroic efforts are really required.

On 21 January 2016 at 19:37, Nathaniel Smith <njs@pobox.com> wrote:
Right now, no one can upload Linux wheels, so that a fair setup.
The fairness that I'm more worried about is that right now Windows and OSX users get wheels, and Linux users don't. Feature parity across these platforms isn't everything, but it's a good start.
100% agreed. I don't use Linux, but as a Windows user, the fact that Linux users are excluded from using wheels saddens me because it means that without support for the large base of Linux users we still don't have a good binary distribution story. The story isn't perfect on Windows or OSX either. Let's not let a quest for perfection stand in the way of helping people get stuff done.
Using an approach where every single group first has to write a PEP, get it accepted and have PyPI and pip patched before they can upload wheels to PyPI does not read like a community friendly approach.
The valid Linux tags are defined in a PEP, and coded in pip. But experience has shown that those tags are not a usable solution. Fixing that *requires* a PEP and a change to pip. There's no way round that. But equally, there's no reason we can't have more than one PEP/change. Just because someone has done the work and offered a PEP doesn't preclude anyone else doing so too. Contrariwise, if Nathaniel's PEP weren't around, anyone with an alternative proposal would still have to write a PEP of their own. So I'm not sure what you're saying here. That the PEP process in general is not community friendly? That writing a PEP that turns out to be insufficient in one small area is somehow a huge issue for the community? That the wheel spec should never have been subject to the PEP process? That we should let people upload wheels with the insufficiently precise "linux" tag to PyPI and let the users sort out the resulting mess? OTOH, if you're suggesting that people might be put off proposing alternatives to the manylinux suggestion because of the grief Nathaniel is getting over what is (IMO) a practical, well-researched, and field-tested suggestion, then you're possibly right. But the easiest way to fix that is to accept that it's a good proposal (possibly only an initial step, but that's fine) and approve it, rather than requiring it to be perfect (as opposed to merely significantly better than what we now have). FWIW, the proposal has a solid +1 from me. Paul

On 22 January 2016 at 02:53, M.-A. Lemburg <mal@egenix.com> wrote:
On 21.01.2016 17:13, Nathaniel Smith wrote:
On Jan 21, 2016 2:07 AM, "M.-A. Lemburg" <mal@egenix.com> wrote:
On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.
It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).
The list of allowed libraries is exactly the same list of libraries as are required by the Anaconda python distribution, so we *know* that it works for about a hundred different python packages, including lots of tricky ones (the whole scientific stack), and has been tested by tens or hundreds of thousands of users. (I posted a link above to some actual working, compliant pyside and numpy packages, which we picked for testing because they're particularly interesting/difficult packages that need to interface to external libraries.) Yes, various extra tricks are needed to get things working on top of this base, including strategies for shipping libraries that are not in the baseline set, but these are problems that can be solved on a project-by-project basis, and don't need a PEP.
And that's the problem: The set is limited to the needs of the scientific community and there to the users of one or two distributions only.
It doesn't address needs of others that e.g. use Qt or GTK as basis for GUIs, people using OpenSSL for networking, people using ImageMagick for processing images, or type libs for type setting, or sound libs for doing sound processing, codec libs for video processing, etc.
This is fine - at the moment *everyone* is locked out from publishing Linux wheels to PyPI, so I'm entirely OK with biting off incremental chunks that meet the needs of different sections of the community, rather than trying to maintain an ever-expanding one-size-fits-all platform definition. However, it does suggest a possible alternative approach to naming these compatibility subsets: what if the name of this particular platform compatibility tag was something like "linux-sciabi1", rather than "manylinux1"? That way, if someone later wanted to propose "linux-guiabi1" or "linux-audioabi1" or "linux-videoabi1", that could be done. The auditwheel utility is already designed to support this "multiple compatibility policy" approach, and the idea of using Docker-based build environments also lends itself well to that model. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, Jan 21, 2016 at 7:32 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
However, it does suggest a possible alternative approach to naming these compatibility subsets: what if the name of this particular platform compatibility tag was something like "linux-sciabi1", rather than "manylinux1"?
That's an interesting idea, but I personally don't see the manylinux1 list as particularly "scientific". If anything, I'd call it "minimal".
That way, if someone later wanted to propose "linux-guiabi1" or "linux-audioabi1" or "linux-videoabi1", that could be done.
This would be something, but if we want to have Linux binary wheels that tightly integrate with system libraries for certain use cases, the *really* valuable thing would be https://github.com/pypa/interoperability-peps/pull/30/files, more so than specific ABI tags, IMO. -Robert

Hi, On Thu, Jan 21, 2016 at 7:45 PM, Robert T. McGibbon <rmcgibbo@gmail.com> wrote:
On Thu, Jan 21, 2016 at 7:32 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
However, it does suggest a possible alternative approach to naming these compatibility subsets: what if the name of this particular platform compatibility tag was something like "linux-sciabi1", rather than "manylinux1"?
That's an interesting idea, but I personally don't see the manylinux1 list as particularly "scientific". If anything, I'd call it "minimal".
Yes, I agree, I don't think "linux-sciabi1" would differentiate this from other ways of building wheels. For example, I can't see why this wouldn't be a perfectly reasonable way to proceed for someone doing audio or video. The difference that "manylinux" was designed to capture is the idea of having a single set of wheels for many versions of Linux, rather than wheels specific to particular distributions or packaged versions of external libraries. Cheers, Matthew

On 22 January 2016 at 17:04, Matthew Brett <matthew.brett@gmail.com> wrote:
Hi, On Thu, Jan 21, 2016 at 7:45 PM, Robert T. McGibbon <rmcgibbo@gmail.com> wrote:
On Thu, Jan 21, 2016 at 7:32 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
However, it does suggest a possible alternative approach to naming these compatibility subsets: what if the name of this particular platform compatibility tag was something like "linux-sciabi1", rather than "manylinux1"?
That's an interesting idea, but I personally don't see the manylinux1 list as particularly "scientific". If anything, I'd call it "minimal".
Yes, I agree, I don't think 'linux-sciabi1" would differentiate this from other ways of building wheels. For example, I can't see why this wouldn't be a perfectly reasonable way to proceed for someone doing audio or video. The difference that "manylinux" was designed to capture is the idea of having a single set of wheels for many versions of Linux, rather than wheels specific to particular distributions or packaged versions of external libraries.
Yeah, it was just an idea to potentially address MAL's concerns regarding scope. However, I think the other replies to the thread have adequately addressed that, and we can continue deferring the question of scope increases to manylinux2 after seeing how far the current list and "auditwheel repair" can get us. The PEP should also be explicit that this does reintroduce the bundling problem that distro unbundling policies were designed to address, but: 1. In these days of automated continuous integration & deployment pipelines, publishing new versions and updating dependencies is easier than it was when those policies were defined 2. Folks remain free to use "--no-binary" if they want to force local builds rather than using pre-built wheel files 3. The popularity of container based deployment and "immutable infrastructure" models involves substantial bundling at the application layer anyway 4. This PEP doesn't rule out the idea of offering more targeted binaries for particular Linux distributions Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 22 January 2016 at 07:04, Matthew Brett <matthew.brett@gmail.com> wrote:
That's an interesting idea, but I personally don't see the manylinux1 list as particularly "scientific". If anything, I'd call it "minimal".
Yes, I agree, I don't think 'linux-sciabi1" would differentiate this from other ways of building wheels. For example, I can't see why this wouldn't be a perfectly reasonable way to proceed for someone doing audio or video. The difference that "manylinux" was designed to capture is the idea of having a single set of wheels for many versions of Linux, rather than wheels specific to particular distributions or packaged versions of external libraries.
Experience with Christoph Gohlke's binary distributions on Windows suggests that a significant majority of non-scientific uses are perfectly well served by the sort of package list that scientific users would need. And I suspect that not all Enthought/Anaconda users are scientists, either. So I'd rather that the tag was based on capability rather than community / intended use. On that basis, "linux-minimal1" sounds fine to me. Paul

On 21.01.2016 17:13, Nathaniel Smith wrote:
On Jan 21, 2016 2:07 AM, "M.-A. Lemburg" <mal@egenix.com> wrote:
On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.
It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).
The list of allowed libraries is exactly the same list of libraries as are required by the Anaconda python distribution, so we *know* that it works for about a hundred different python packages, including lots of tricky ones (the whole scientific stack), and had been tested by tens or hundreds of thousands of users.
so this is x86_64-linux-gnu. Any other architectures? Any reason to choose gcc 4.8.2 which is known for its defects? This whole thing looks like an Anaconda marketing PEP ... Matthias

On Thu, Jan 21, 2016 at 11:54 AM, Matthias Klose <doko@ubuntu.com> wrote:
On 21.01.2016 17:13, Nathaniel Smith wrote:
On Jan 21, 2016 2:07 AM, "M.-A. Lemburg" <mal@egenix.com> wrote:
On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.
It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).
The list of allowed libraries is exactly the same list of libraries as are required by the Anaconda python distribution, so we *know* that it works for about a hundred different python packages, including lots of tricky ones (the whole scientific stack), and had been tested by tens or hundreds of thousands of users.
so this is x86_64-linux-gnu. Any other architectures?
That's by far the dominant architecture for Linux workstations and servers, so that's what we've focused on, yeah. I assume that mainstream glibc-using distros on x86-32 and ARM are similar; unless someone wants to go do the research somehow then I think the simplest way forward is to proceed on the assumption that the same spec will work, and then fix it up if/when it turns out to break.
Any reason to choose gcc 4.8.2 which is known for it's defects?
It's the most recent version of gcc that's able to target CentOS 5 (thanks to RH's backporting efforts in their "devtoolset" releases), and CentOS 5 is the target baseline that's currently used by approximately everyone who distributes universal linux binaries (google e.g. "holy build box").
This whole thing looks like an Anaconda marketing PEP ...
Anaconda's main selling point is that they provide binaries that "just work", and pip doesn't. Not sure how improving pip to be a stronger competitor to Anaconda is Anaconda marketing. -n -- Nathaniel J. Smith -- https://vorpus.org

On Jan 21, 2016, at 3:14 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Thu, Jan 21, 2016 at 11:54 AM, Matthias Klose <doko@ubuntu.com> wrote:
On 21.01.2016 17:13, Nathaniel Smith wrote:
On Jan 21, 2016 2:07 AM, "M.-A. Lemburg" <mal@egenix.com> wrote:
On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.
It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).
The list of allowed libraries is exactly the same list of libraries as are required by the Anaconda python distribution, so we *know* that it works for about a hundred different python packages, including lots of tricky ones (the whole scientific stack), and had been tested by tens or hundreds of thousands of users.
so this is x86_64-linux-gnu. Any other architectures?
That's by far the dominant architecture for Linux workstations and servers, so that's what we've focused on, yeah. I assume that mainstream glibc-using distros on x86-32 and ARM are similar; unless someone wants to go do the research somehow then I think the simplest way forward is to proceed on the assumption that the same spec will work, and then fix it up if/when it turns out to break.
Numbers! Here’s the # of downloads from PyPI in the last week or so that we can identify as a linux machine using pip and what the value of their platform.machine() is: https://caremad.io/s/SIxbkCB82C/ ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
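For reference, the value being counted there is simply what the interpreter reports for the machine; the platform half of the existing ``linux_*`` tags is derived from the same information::

    import platform
    import sysconfig

    print(platform.machine())        # e.g. 'x86_64', 'i686', 'armv7l'
    print(sysconfig.get_platform())  # e.g. 'linux-x86_64'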

Hi, On Thu, Jan 21, 2016 at 2:05 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.
It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).
IMO, testing the versions of a set of libraries is a safer approach. It's perfectly fine to have a few dependencies not work in a module because an optional system package is not installed, e.g. say a package comes with UIs written in Qt and one in GTK.
Please forgive my slowness, but I don't understand exactly what you mean. Can you give a specific example? Say my package depends on libpng. Call the machine I'm installing on the client machine. Are you saying that, when I build a wheel, I should specify to the wheel what versions of libpng I can tolerate on the the client machine, and if if the client does have a compatible version, then pip should raise an error, perhaps with a useful message about how to get libpng? If you do mean that, how do you want the PEP changed? Best, Matthew

On Thu, Jan 21, 2016 at 11:05 AM, Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Thu, Jan 21, 2016 at 2:05 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.
It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).
IMO, testing the versions of a set of libraries is a safer approach. It's perfectly fine to have a few dependencies not work in a module because an optional system package is not installed, e.g. say a package comes with UIs written in Qt and one in GTK.
Please forgive my slowness, but I don't understand exactly what you mean. Can you give a specific example?
Say my package depends on libpng.
Call the machine I'm installing on the client machine.
Are you saying that, when I build a wheel, I should specify to the wheel what versions of libpng I can tolerate on the the client machine, and if if the client does have a compatible version, then pip should raise an error, perhaps with a useful message about how to get libpng?
Sorry, slowness and typos - corrected: Are you saying that, when I build a wheel, I should specify to the wheel what versions of libpng I can tolerate on the client machine, and if the client does _not_ have a compatible version, then pip should raise an error, perhaps with a useful message about how to get libpng? Best again, Matthew

On 21.01.2016 20:05, Matthew Brett wrote:
Hi,
On Thu, Jan 21, 2016 at 2:05 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.
It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).
IMO, testing the versions of a set of libraries is a safer approach. It's perfectly fine to have a few dependencies not work in a module because an optional system package is not installed, e.g. say a package comes with UIs written in Qt and one in GTK.
Please forgive my slowness, but I don't understand exactly what you mean. Can you give a specific example?
Say my package depends on libpng.
Call the machine I'm installing on the client machine.
Are you saying that, when I build a wheel, I should specify to the wheel what versions of libpng I can tolerate on the the client machine, and if if the client does have a compatible version, then pip should raise an error, perhaps with a useful message about how to get libpng?
If you do mean that, how do you want the PEP changed?
I already posted a change proposal earlier on in the thread. I'll repeat it here (with minor enhancements): """ The ``manylinux1`` policy ========================= For these reasons, we define a set of library versions that are supported by a wide range of Linux distributions. We therefore pick library versions which have been around for at least 5 years. When using these external libraries, Python wheels should only depend on library versions listed in the section below. Python wheels are free to depend on additional libraries not included in this set; however, care should be taken that these additional libraries do not depend on later versions of the listed libraries, e.g. OpenSSL libraries compiled against the C library versions listed below. The ``manylinux1`` policy thus encompasses a standard for which versions of these external shared libraries a wheel may depend on, and the maximum depended-upon symbol versions therein. Future versions of the manylinux policy may include more libraries, or move on to later versions. The permitted external shared library versions for ``manylinux1`` are: :: libgcc_s.so.1 libstdc++.so.6 ... only include the basic set of libs, no GUI or curses ... """ The idea is to not pin down the set of usable external libraries, but instead pin down a set of versions for the most important libraries wheels will depend on and then let the wheels use other external libraries as necessary without version checks. In more detail: If you want a wheel to run on many Linux distributions, you have to make sure that the basic C library and a few other utility libraries are available and compatible with the ones you used to build the wheel. This can be addressed by defining a set of important libraries and corresponding versions. You do not have to limit the overall set of usable libraries for this, since less commonly used libraries will usually have to be installed separately anyway. For example, if a package needs a specific version of libpng, the package author can document this and the user can then make sure to install that particular version. The PEP should only be concerned with the basic set of libraries you typically need for a wheel, not any of the less common ones. The X11 libs for example do not have to be version pinned for the manylinux tag, since they are not essential for the vast majority of Python packages (and here I'm talking about the thousands of packages on PyPI, not the few hundred mentioned earlier in the thread, which are covered by Anaconda and Canopy). By defining "manylinux1" in such a way you get: * broad compatibility of wheel files on Linux * full flexibility of wheels interfacing or wrapping to other external libraries not covered in the PEP * no lock-out of package authors who would like to push wheel files for their packages to PyPI, but happen to use libraries not in the predefined list of the original draft PEP I left out the other details I mentioned (symbol versioning and dealing with different architectures) to focus on pinning libraries vs. pinning versions for now. Later on, we'd have to apply a similar strategy to other platforms as well, e.g. *BSD, AIX, Solaris, etc.
Since we're targeting Linux, it may be helpful to base the list of libraries and versions on the Linux Standard Base (LSB), since one of the main ideas behind the LSB is binary compatibility between distributions: http://refspecs.linuxfoundation.org/lsb.shtml If people on this list still can't see the benefit of just pinning down versions of specific library files for "manylinux1" over limiting the set of allowed libraries and forcing people to embed any other libs in the wheel files, I guess I'll just have to write up a competing or additional tag PEP to enable all package authors to publish wheels for Linux on PyPI, but this won't happen until February. Cheers, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 22 2016)
Python Projects, Coaching and Consulting ... http://www.egenix.com/ Python Database Interfaces ... http://products.egenix.com/ Plone/Zope Database Interfaces ... http://zope.egenix.com/
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/
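The "maximum depended-upon symbol versions" part of either proposal is straightforward to prototype: scan a built extension for the versioned glibc symbols it references. A rough sketch (assuming binutils' ``objdump`` is available; auditwheel performs an equivalent but far more thorough analysis)::

    # List the glibc symbol versions a built extension module depends on --
    # the quantity a "maximum symbol version" policy would have to bound.
    import re
    import subprocess

    def glibc_versions_required(path):
        out = subprocess.check_output(["objdump", "-T", path], universal_newlines=True)
        versions = set(re.findall(r"GLIBC_([0-9.]+)", out))
        return sorted(versions, key=lambda v: tuple(int(p) for p in v.split(".")))

    # hypothetical usage:
    # print(glibc_versions_required("mypkg/_ext.cpython-35m-x86_64-linux-gnu.so"))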

On Fri, Jan 22, 2016 at 1:33 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 21.01.2016 20:05, Matthew Brett wrote:
Hi,
On Thu, Jan 21, 2016 at 2:05 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.
It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).
IMO, testing the versions of a set of libraries is a safer approach. It's perfectly fine to have a few dependencies not work in a module because an optional system package is not installed, e.g. say a package comes with UIs written in Qt and one in GTK.
Please forgive my slowness, but I don't understand exactly what you mean. Can you give a specific example?
Say my package depends on libpng.
Call the machine I'm installing on the client machine.
Are you saying that, when I build a wheel, I should specify to the wheel what versions of libpng I can tolerate on the the client machine, and if if the client does have a compatible version, then pip should raise an error, perhaps with a useful message about how to get libpng?
If you do mean that, how do you want the PEP changed?
I already posted a change proposal earlier on in the thread. I'll repeat it here (with a minor enhancements):
Okay, I think I get it now. I'll try to repeat back to summarize and see if I have understood your proposal correctly: In the PEP 513 "manylinux1" approach, when users do 'pip install foo', then one of three things happens: 1) they get a working foo and are immediately good-to-go, or 2) pip says "I'm sorry, there's no compatible wheel", or 3) something else happens, in which case this is a bug, and the spec provides some framework to help us determine whether this is a bug in the wheel, a bug in pip, or a bug in the spec. In your approach, users do 'pip install foo', and then pip installs the wheel, and then when they try to use the wheel they get an error message from the dynamic linker about missing libraries, and then the user has to read the docs or squint at these error messages in order to figure out what set of apt-get / yum / pacman / ... commands they need to run in order to make foo work. (And possibly there is no such combination of commands that will actually work, because e.g. the wheel was linked against Debian's version of libbar.so.7 and Fedora's version of libbar.so.7 turns out to have an incompatible ABI, or Fedora simply doesn't provide a libbar.so.7 package at all.) I won't express any opinion on your alternative PEP with its own platform tag without reading it, but we're not going to change PEP 513 to work this way.
* no lock-out of package authors who would like to push wheel files for their packages to PyPI, but happen to use libraries not in the predefined list of the original draft PEP
https://mail.python.org/pipermail/distutils-sig/2016-January/028050.html -n -- Nathaniel J. Smith -- https://vorpus.org

On 22.01.2016 11:03, Nathaniel Smith wrote:
On Fri, Jan 22, 2016 at 1:33 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 21.01.2016 20:05, Matthew Brett wrote:
Hi,
On Thu, Jan 21, 2016 at 2:05 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 21.01.2016 10:31, Nick Coghlan wrote:
On 21 January 2016 at 19:03, M.-A. Lemburg <mal@egenix.com> wrote:
By using the version based approach, we'd not run into this problem and gain a lot more.
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
My argument is that the file based approach taken by the PEP is too limiting to actually make things work for a large set of Python packages.
It will basically only work for packages that do not interface to other external libraries (except for the few cases listed in the PEP, e.g. X11, GL, which aren't always installed or available either).
IMO, testing the versions of a set of libraries is a safer approach. It's perfectly fine to have a few dependencies not work in a module because an optional system package is not installed, e.g. say a package comes with UIs written in Qt and one in GTK.
Please forgive my slowness, but I don't understand exactly what you mean. Can you give a specific example?
Say my package depends on libpng.
Call the machine I'm installing on the client machine.
Are you saying that, when I build a wheel, I should specify to the wheel what versions of libpng I can tolerate on the the client machine, and if if the client does have a compatible version, then pip should raise an error, perhaps with a useful message about how to get libpng?
If you do mean that, how do you want the PEP changed?
I already posted a change proposal earlier on in the thread. I'll repeat it here (with a minor enhancements):
Okay, I think I get it now. I'll try to repeat back to summarize and see if I have understood your proposal correctly:
In the PEP 513 "manylinux1" approach, when users do 'pip install foo', then one of three things happens: 1) they get a working foo and are immediately good-to-go, or 2) pip says "I'm sorry, there's no compatible wheel", or 3) something else happens, in which case this is a bug, and the spec provides some framework to help us determine whether this is a bug in the wheel, a bug in pip, or a bug in the spec.
In your approach, users do 'pip install foo', and then pip installs the wheel, and then when they try to use the wheel they get an error message from the dynamic linker about missing libraries, and then the user has to read the docs or squint at these error messages in order to figure out what set of apt-get / yum / pacman / ... commands they need to run in order to make foo work. (And possibly there is no such combination of commands that will actually work, because e.g. the wheel was linked against Debian's version of libbar.so.7 and Fedora's version of libbar.so.7 turns out to have an incompatible ABI, or Fedora simply doesn't provide a libbar.so.7 package at all.)
pip could be made to check the wheel for missing library dependencies in order to provide help with cases where additional packages are needed, but overall, yes, that's the way it should work, IMO. It's better to have wheels than not to have them, since installing an additional system package is by far easier than trying to compile packages from source (this will usually also require additional -dev packages to be installed).
* no lock-out of package authors who would like to push wheel files for their packages to PyPI, but happen to use libraries not in the predefined list of the original draft PEP
https://mail.python.org/pipermail/distutils-sig/2016-January/028050.html
Embedding additional libraries in the wheel files to overcome deficiencies in the PEP design simply doesn't feel right to me. People who rely on Linux distributions want to continue to do so and get regular updates for system packages from their system vendor. Having wheel files override these system packages by including libs directly in the wheel silently breaks this expectation, potentially opening up installations to security holes, difficult-to-track bugs and possible version conflicts with already loaded versions of the shared libs. IMO, that's much worse than having to install additional system packages to make a Python wheel work. The embedding approach also creates licensing problems, since those libs may be under different licenses than the package itself. And of course, it increases the size of the wheel files, causing more bandwidth to be necessary, more disk space to be used for wheel caches, etc. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 22 2016)
Python Projects, Coaching and Consulting ... http://www.egenix.com/ Python Database Interfaces ... http://products.egenix.com/ Plone/Zope Database Interfaces ... http://zope.egenix.com/
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/

On Jan 22, 2016, at 5:48 AM, M.-A. Lemburg <mal@egenix.com> wrote:
Embedding additional libraries in the wheels files to overcome deficiencies in the PEP design simply doesn't feel right to me.
People who rely on Linux distributions want to continue to do so and get regular updates for system packages from their system vendor. Having wheel files override these system packages by including libs directly in the wheel silently breaks this expectation, potentially opening up the installations for security holes, difficult to track bugs and possible version conflicts with already loaded versions of the shared libs.
IMO, that's much worse than having to install additional system packages to make a Python wheel work.
The embedding approach also creates licensing problems, since those libs may be under different licenses than the package itself. And of course, it increases the size of the wheel files, causing more bandwidth to be necessary, more disk space to be used for wheel caches, etc.
I think there are a few things here, but this is not my area of expertise so I could be wrong. As I understand it, the manylinux platform definition is largely going to be a documentation effort and there isn't going to be much in the way of enforcement. That means that people who build wheels against the manylinux platform tag are free to really do whatever they want even if it doesn't strictly match the definition of the manylinux platform. The difference here is that if you link against something that isn't included in the set of libraries, and that subsequently breaks due to an ABI incompatibility, that's not a pip bug or a manylinux bug, that's a packaging bug with that particular library and they'll have to decide how they want to resolve it (if they want to resolve it). So you'll be free to link to anything you want, but you get to keep both pieces if it breaks and it's outside this defined set of libraries.
I also agree that it's OK for users to have to ``apt-get`` (or whatever) a particular library to use something and we don't have to *only* rely on items that are installed as part of a "standard" linux base system. However, what is not OK (IMO) is for the PEP to bless something that has a high chance of ending up with ABI issues rather than "need to apt-get install" issues. For instance, even if you compile against a sufficiently old copy of OpenSSL, OpenSSL (to my understanding) does not have a stable ABI and you cannot take something compiled against OpenSSL on CentOS 5.reallyold and expect it to work on say Arch Linux.
So I think there's an explicit list of packages that we know will generally work as long as you build against a sufficiently old copy of them and outside of that it's really a big unknown in general if a particular library can be used in this way or not. We obviously can't enumerate the list of every possible C library that has a stable ABI that can sanely be used cross distro but I think it's reasonable to list some sort of base minimum here, and if people experiment with stepping outside the standard list and can come to us and show "hey, I tried it with xyz library, we've gotten X installs and no complaints" we can then possibly expand the definition of the manylinux platform to include that library and move that project from depending on undefined behavior to defined behavior.
Thinking of it in terms of a C-like "undefined behavior" is probably a reasonable way of doing it. Linking against a system provided library that is on this list is a defined behavior of the manylinux "platform", linking against something else is undefined and may or may not work. At some level, once you've gotten to the point you're using pip to manage some set of your packages it doesn't really matter if that set of things you're pulling from PyPI includes a C library or not. If you're relying on say psycopg2 it's not clear to me that libpq *needs* to be getting security updates any more than psycopg2 itself does and so you'll need some method of solving that problem for your Python level dependencies anyways.
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On 22 January 2016 at 21:25, Donald Stufft <donald@stufft.io> wrote:
Thinking of it in terms of a C-like "undefined behavior" is probably a reasonable way of doing it. Linking against a system provided library that is on this list is a defined behavior of the manylinux "platform", linking against something else is undefined and may or may not work. At some level, once you've gotten to the point you're using pip to manage some set of your packages it doesn't really matter if that set of things you're pulling from PyPI includes a C library or not. If you're relying on say psycopg2 it's not clear to me that libpq *needs* to be getting security updates any more than psycopg2 itself does and so you'll need some method of solving that problem for your Python level dependencies anyways.
It also wouldn't surprise me if CVE trackers like requires.io and versioneye.com gained the ability to search wheel files for embedded dependencies and flag outdated and vulnerable ones. However, it's a good point that PyPI won't be running auditwheel to *force* compliance with the "no external dependencies outside the defined set" guideline - while a service like pythonwheels.com could potentially be set up independently of PyPI to run auditwheel on manylinux wheels, PyPI itself wouldn't do it. An external scan like that could actually be a useful way of defining manylinux2 in the future - scanning popular manylinux wheel downloads for both embedded libraries and for external dependencies outside the defined set. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 22.01.2016 12:25, Donald Stufft wrote:
On Jan 22, 2016, at 5:48 AM, M.-A. Lemburg <mal@egenix.com> wrote:
Embedding additional libraries in the wheel files to overcome deficiencies in the PEP design simply doesn't feel right to me.
People who rely on Linux distributions want to continue to do so and get regular updates for system packages from their system vendor. Having wheel files override these system packages by including libs directly in the wheel silently breaks this expectation, potentially opening up the installations for security holes, difficult to track bugs and possible version conflicts with already loaded versions of the shared libs.
IMO, that's much worse than having to install additional system packages to make a Python wheel work.
The embedding approach also creates licensing problems, since those libs may be under different licenses than the package itself. And of course, it increases the size of the wheel files, causing more bandwidth to be necessary, more disk space to be used for wheel caches, etc.
I think there are a few things here, but this is not my area of expertise so I could be wrong. As I understand it, the manylinux platform definition is largely going to be a documentation effort and there isn't going to be much in the way of enforcement. That means that people who build wheels against the manylinux platform tag are free to really do whatever they want even if it doesn't strictly match the definition of the manylinux platform. The difference here is that if you link against something that isn't included in the set of libraries, and that subsequently breaks due to an ABI incompatibility, that's not a pip bug or a manylinux bug, that's a packaging bug with that particular library and they'll have to decide how they want to resolve it (if they want to resolve it). So you'll be free to link to anything you want, but you get to keep both pieces if it breaks and it's outside this defined set of libraries.
Hmm, if that were the reading, things would look a lot brighter, but if PyPI starts to support uploading only manylinux wheels for Linux platforms, you essentially have the effect that the PEP ends up defining the set of allowed external libraries and forces package authors to embed any other external libraries into the wheel file - or not be able to upload wheel files for Linux at all. This can hardly be in the interest of Python users who don't want to use wheel-embedded system libraries on their Linux system and most likely also don't expect wheel files to ship alternative versions with them in the first place. If we'd lift the ban on "linux_*"-tagged wheels on PyPI at the same time we allow "manylinux" wheels, that'd remove a lot of my concerns. In that case, I'd just like to see a way to tell pip not to install manylinux wheels with embedded system libraries, or simply outright reject embedded system libraries in manylinux wheel files.
I also agree that it's OK for users to have to ``apt-get`` (or whatever) a particular library to use something and we don't have to *only* rely on items that are installed as part of a "standard" linux base system. However, what is not OK (IMO) is for the PEP to bless something that has a high chance of ending up with ABI issues rather than "need to apt-get install" issues. For instance, even if you compile against a sufficiently old copy of OpenSSL, OpenSSL (to my understanding) does not have a stable ABI and you cannot take something compiled against OpenSSL on CentOS 5.reallyold and expect it to work on say Arch Linux.
True. There will always be incompatibilities out there which cannot be addressed with a one-size-fits-all approach. For those cases, vendor-specific wheels would need to be created.
So I think there's an explicit list of packages that we know will generally work as long as you build against a sufficiently old copy of them and outside of that it's really a big unknown in general if a particular library can be used in this way or not. We obviously can't enumerate the list of every possible C library that has a stable ABI that can sanely be used cross distro but I think it's reasonable to list some sort of base minimum here, and if people experiment with stepping outside the standard list and can come to us and show "hey, I tried it with xyz library, we've gotten X installs and no complaints" we can then possibly expand the definition of the manylinux platform to include that library and move that project from depending on undefined behavior to defined behavior.
Thinking of it in terms of a C-like "undefined behavior" is probably a reasonable way of doing it. Linking against a system provided library that is on this list is a defined behavior of the manylinux "platform", linking against something else is undefined and may or may not work. At some level, once you've gotten to the point you're using pip to manage some set of your packages it doesn't really matter if that set of things you're pulling from PyPI includes a C library or not. If you're relying on say psycopg2 it's not clear to me that libpq *needs* to be getting security updates any more than psycopg2 itself does and so you'll need some method of solving that problem for your Python level dependencies anyways.
You need both: solving issues at the Python level and at the system level. However, system vendors will often be a lot faster with updates than package authors, simply because it's their business model, so as a user you will want to benefit from those updates and not have to rely on the package author to ship new wheel files. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 22 2016)

On 22 January 2016 at 22:07, M.-A. Lemburg <mal@egenix.com> wrote:
However, system vendors will often be a lot faster with updates than package authors, simply because it's their business model, so as a user you will want to benefit from those updates and not have to rely on the package author to ship new wheel files.
This is true for the subset of packages monitored by distro security response teams, but there's a *lot* of software not currently packaged for Linux distros that never will be as more attention is given to the "rebuild the world on demand" model that elastic cloud computing and fast internet connections enable. My fundamental concern is that if a package author publishes a distro dependent wheel file, pip attempts to install it, and it doesn't work, the reaction for many users is going to be "Python packaging is broken", not "the packaging of this particular package is broken". However, moving the "generic linux wheels are ignored by default" behaviour to pip-the-client, rather than enforcing it as a restriction on PyPI uploads could definitely be a reasonable alternative way of addressing that concern. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 23.01.2016 04:26, Nick Coghlan wrote:
On 22 January 2016 at 22:07, M.-A. Lemburg <mal@egenix.com> wrote:
However, system vendors will often be a lot faster with updates than package authors, simply because it's their business model, so as a user you will want to benefit from those updates and not have to rely on the package author to ship new wheel files.
This is true for the subset of packages monitored by distro security response teams, but there's a *lot* of software not currently packaged for Linux distros that never will be as more attention is given to the "rebuild the world on demand" model that elastic cloud computing and fast internet connections enable.
My fundamental concern is that if a package author publishes a distro dependent wheel file, pip attempts to install it, and it doesn't work, the reaction for many users is going to be "Python packaging is broken", not "the packaging of this particular package is broken".
I think helping the user to identify where the problem originates is certainly possible. This can be done in a generic way by e.g. having pip or the wheel package scan the wheel file for shared libraries using ldd and finding potentially missing libs. Or we could define a special package post-install entry point which pip calls to have the wheel itself check the system it was installed on for missing system packages, or perform any other post-install actions that need to take place before the wheel file can be used, e.g. setting up initial config files.
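Nothing like the second idea exists in pip today, so purely as a hypothetical sketch (the entry point group name "pip.post_install_check" and the hook signature are invented here for illustration), a package might declare a post-install check along these lines:

    # Hypothetical post-install hook; the entry point group name and the
    # calling convention below are invented for illustration only.
    #
    # setup(
    #     ...,
    #     entry_points={
    #         "pip.post_install_check": [
    #             "check = mypkg._postinstall:check_system",
    #         ],
    #     },
    # )
    #
    # mypkg/_postinstall.py
    import ctypes.util

    def check_system(report):
        # Warn about system libraries this package expects but that the
        # dynamic linker cannot find on this machine (names illustrative).
        for lib in ("png", "jpeg"):
            if ctypes.util.find_library(lib) is None:
                report("lib%s was not found; please install it with your "
                       "system package manager" % lib)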
However, moving the "generic linux wheels are ignored by default" behaviour to pip-the-client, rather than enforcing it as a restriction on PyPI uploads could definitely be a reasonable alternative way of addressing that concern.
I don't think that's the right strategy. There are certainly ways to improve error reporting for Python packaging (see above), but outright rejecting generic wheels is not a good approach, IMO. The wheel system is not yet complete, but until it is, using a "we want to protect the user from failing wheels" approach is not going to help much, since we're just replacing this with a "we'll let the user handle failing source installations" approach instead - with the main reason apparently being that we want to avoid having users blame PyPI, pip or wheels for the failures. This ignores the fact that generic wheels have a much better chance of success than source installations of the same package on the same system (it's rather unlikely that the user will have the foo-dev package installed alongside the corresponding foo binary package). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 23 2016)

On Fri, Jan 22, 2016 at 10:26 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 22 January 2016 at 22:07, M.-A. Lemburg <mal@egenix.com> wrote:
However, system vendors will often be a lot faster with updates than package authors, simply because it's their business model, so as a user you will want to benefit from those updates and not have to rely on the package author to ship new wheel files.
This is true for the subset of packages monitored by distro security response teams, but there's a *lot* of software not currently packaged for Linux distros that never will be as more attention is given to the "rebuild the world on demand" model that elastic cloud computing and fast internet connections enable.
My fundamental concern is that if a package author publishes a distro dependent wheel file, pip attempts to install it, and it doesn't work, the reaction for many users is going to be "Python packaging is broken", not "the packaging of this particular package is broken".
This is already broken with source dists if you don't have the appropriate -dev packages (or a compiler) installed. Some package authors provide more useful feedback explaining what the problem is and how one might resolve it, rather than dying on a compiler error due to a missing header, but many do not. One solution to this for both source and binary distributions is package manager awareness in the build/install tools, along with having packages declare their dependencies in structured metadata. A translation layer would make this easier on package authors: if they only had to say they depend on "OpenSSL headers" and that was translated to the correct package for the OS, this could be relayed to the user at build time ("install these packages using this command"), or the package manager could be directly invoked, if the user has chosen to allow the build/install tool to do that. --nate
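A minimal sketch of what such a translation layer might look like (the table entries, distro names, and helper function are all invented for illustration; a real mechanism would need an agreed metadata format and much broader coverage):

    # Sketch: map abstract external requirements to distro package names
    # so a failed build can print an actionable hint. Illustrative only.
    ABSTRACT_TO_DISTRO = {
        "openssl-headers": {"debian": "libssl-dev", "fedora": "openssl-devel"},
        "libpng-headers":  {"debian": "libpng-dev", "fedora": "libpng-devel"},
    }

    INSTALL_COMMAND = {"debian": "apt-get install", "fedora": "dnf install"}

    def build_hint(abstract_requirements, distro_family):
        packages = []
        for req in abstract_requirements:
            mapping = ABSTRACT_TO_DISTRO.get(req, {})
            if distro_family in mapping:
                packages.append(mapping[distro_family])
        if not packages:
            return None
        return "install these packages using: %s %s" % (
            INSTALL_COMMAND[distro_family], " ".join(packages))

    # build_hint(["openssl-headers"], "debian")
    # -> 'install these packages using: apt-get install libssl-dev'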

On 22 January 2016 at 20:48, M.-A. Lemburg <mal@egenix.com> wrote:
People who rely on Linux distributions want to continue to do so and get regular updates for system packages from their system vendor. Having wheel files override these system packages by including libs directly in the wheel silently breaks this expectation, potentially opening up the installations for security holes, difficult to track bugs and possible version conflicts with already loaded versions of the shared libs.
For the time being, these users should either pass the "--no-binary" option to pip, ask their distro to provide an index of pre-built wheel files for that distro (once we have the distro-specific wheel tagging PEP sorted out), or else ask their distro to update system Python packages in a more timely fashion (or all of the above).
IMO, that's much worse than having to install additional system packages to make a Python wheel work.
The embedding approach also creates licensing problems, since those libs may be under different licenses than the package itself. And of course, it increases the size of the wheel files, causing more bandwidth to be necessary, more disk space to be used for wheel caches, etc.
Then *don't publish manylinux wheel files*. Manylinux is, by design, a platform+publisher-silo model, very similar to the way smart phone operating systems work, and the way Windows applications and (I believe) Mac App Bundles work. It is antithetical to the traditional tightly coupled shared everything model adopted by Linux distributions (where all the binaries are also generally built by a common central build system).
There is a different model, which could be tagged as (for example) "integratedlinux1", which is the one you propose. That wouldn't be viable from a UX perspective without an external dependency description system like the one Tennessee proposed in https://github.com/pypa/interoperability-peps/pull/30, but that approach requires a lot more development work before it could be adopted.
From the point of view of future-proofing PEP 513 against having such an alternative available in the future, the main question that would need to be considered is how tools would decide download priority between a distro-specific wheel, an integrated linux wheel, and a linux wheel with bundled dependencies.
Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Jan 22, 2016, at 6:29 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 22 January 2016 at 20:48, M.-A. Lemburg <mal@egenix.com> wrote:
People who rely on Linux distributions want to continue to do so and get regular updates for system packages from their system vendor. Having wheel files override these system packages by including libs directly in the wheel silently breaks this expectation, potentially opening up the installations for security holes, difficult to track bugs and possible version conflicts with already loaded versions of the shared libs.
For the time being, these users should either pass the "--no-binary" option to pip, ask their distro to provide an index of pre-built wheel files for that distro (once we have the distro-specific wheel tagging PEP sorted out), or else ask their distro to update system Python packages in a more timely fashion (or all of the above).
I should note that once we have some solution to the fact that "linux, 64bit" is way too generic a platform tag for the general case, I plan to allow the current super-generic platform tags to be uploaded to PyPI as well. We don't try to prevent you from ever being able to release a broken package; we just want to make it reasonable for you to do the right thing. In other words, as long as the tooling makes it possible to do the right thing, the fact that you can also generate packaging bugs in your project (as opposed to pip doing the wrong thing) isn't a problem. So if people want to do something that isn't manylinux and don't want to claim to be a manylinux wheel, they'll be free to use the current linux tags as well.
IMO, that's much worse than having to install additional system packages to make a Python wheel work.
The embedding approach also creates licensing problems, since those libs may be under different licenses than the package itself. And of course, it increases the size of the wheel files, causing more bandwidth to be necessary, more disk space to be used for wheel caches, etc.
Then *don't publish manylinux wheel files*. Manylinux is, by design, a platform+publisher-silo model, very similar to the way smart phone operating systems work, and the way Windows applications and (I believe) Mac App Bundles work. It is antithetical to the traditional tightly coupled shared everything model adopted by Linux distributions (where all the binaries are also generally built by a common central build system).
There is a different model, which could be tagged as (for example) "integratedlinux1", which is the one you propose. That wouldn't be viable from a UX perspective without an external dependency description system like the one Tennessee proposed in https://github.com/pypa/interoperability-peps/pull/30, but that approach requires a lot more development work before it could be adopted.
From the point of view of future-proofing PEP 513 against having such an alternative available in the future, the main question that would need to be considered is how tools would decide download priority between a distro-specific wheel, an integrated linux wheel, and a linux wheel with bundled dependencies.
I think that this could probably just be left up to the individual tools? We already have to decide between multiple candidate wheels, this is just another thing to look at.
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On 22 January 2016 at 11:55, Donald Stufft <donald@stufft.io> wrote:
From the point of view of future-proofing PEP 513 against having such an alternative available in the future, the main question that would need to be considered is how tools would decide download priority between a distro-specific wheel, an integrated linux wheel, and a linux wheel with bundled dependencies.
I think that this could probably just be left up to the individual tools? We already have to decide between multiple candidate wheels, this is just another thing to look at.
The compatibility tag spec (PEP 425) does note that choosing between candidate implementations is a tool-specific implementation detail (at https://www.python.org/dev/peps/pep-0425/#id14). It seems to me that this simply comes under the point of "It is recommended that installers try to choose the most feature complete built distribution available (the one most specific to the installation environment)..." At the moment, the process is effective but rudimentary (we don't really expect more than one compatible wheel that isn't pure-python). If people start seeing multiple potential wheels and want to have the option to control the priorities, it's probably better to develop a solution via pip feature requests, and when things stabilise, document the solution in a standard if it's appropriate (i.e., it's *not* just pip configuration options). Paul
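As a toy illustration of the "most specific wins" idea (the distro-specific tag below is hypothetical; only the generic linux and proposed manylinux1 tags exist in the current discussion, and real resolution in pip involves full compatibility tag handling):

    # Sketch: prefer the most specific compatible platform tag when
    # several candidate wheels are available. "debian_8_x86_64" is a
    # hypothetical distro-specific tag used only for illustration.
    TAG_PREFERENCE = [
        "debian_8_x86_64",    # hypothetical distro-specific tag: most specific
        "manylinux1_x86_64",  # broad but well-defined baseline
        "linux_x86_64",       # generic tag: least information
    ]

    def preferred_tag(candidate_tags):
        for tag in TAG_PREFERENCE:
            if tag in candidate_tags:
                return tag
        return None

    # preferred_tag({"manylinux1_x86_64", "linux_x86_64"})
    # -> 'manylinux1_x86_64'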

On Fri, Jan 22, 2016 at 6:29 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 22 January 2016 at 20:48, M.-A. Lemburg <mal@egenix.com> wrote:
People who rely on Linux distributions want to continue to do so and get regular updates for system packages from their system vendor. Having wheel files override these system packages by including libs directly in the wheel silently breaks this expectation, potentially opening up the installations for security holes, difficult to track bugs and possible version conflicts with already loaded versions of the shared libs.
For the time being, these users should either pass the "--no-binary" option to pip, ask their distro to provide an index of pre-built wheel files for that distro (once we have the distro-specific wheel tagging PEP sorted out), or else ask their distro to update system Python packages in a more timely fashion (or all of the above).
Is there a distro-specific wheel tagging PEP in development somewhere that I missed? If not, I will get the ball rolling on it. --nate

On Fri, Jan 29, 2016 at 11:35 AM, Nate Coraor <nate@bx.psu.edu> wrote:
On Fri, Jan 22, 2016 at 6:29 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 22 January 2016 at 20:48, M.-A. Lemburg <mal@egenix.com> wrote:
People who rely on Linux distributions want to continue to do so and get regular updates for system packages from their system vendor. Having wheel files override these system packages by including libs directly in the wheel silently breaks this expectation, potentially opening up the installations for security holes, difficult to track bugs and possible version conflicts with already loaded versions of the shared libs.
For the time being, these users should either pass the "--no-binary" option to pip, ask their distro to provide an index of pre-built wheel files for that distro (once we have the distro-specific wheel tagging PEP sorted out), or else ask their distro to update system Python packages in a more timely fashion (or all of the above).
Is there a distro-specific wheel tagging PEP in development somewhere that I missed? If not, I will get the ball rolling on it.
It's all yours, I think :-). Some unsolicited advice that you can take or leave...: I think there are two separable questions that are easy to conflate here: (1) the use of a distro tag to specify an ABI ("when I say libssl.so.1, I mean one that exports the same symbols and semantics as the one that Fedora shipped"), and (2) the use of a distro tag for packages that want to depend on more system-supplied libraries and let the distro worry about updating them.
I also think there are two compelling use cases for these: (a) folks who would be happy with manylinux, except for whatever reason they can't use it, e.g. because they're on ARM. I bet a platform tag for Raspbian wheels would be quite popular, even if it still required people to vendor dependencies. (b) folks who really want integration between pip and the distro package manager.
So my suggestion would be to start with one PEP that just tries to define distro-specific platform tags to answer question (1) and target use case (a), and then a second PEP that adds new metadata for specifying external system requirements to answer question (2) and target use case (b). The advantage of doing things in this order is that once you have a platform tag saying "this is a Debian wheel" or "this is a Fedora wheel" scoping you to a particular distribution, then your external package metadata can say "I need the package 'libssl1.0.2' version 1.0.2e-1 or better" or "I need the package 'openssl' version 1.0.2e-5.fc24 or better", and avoid the tarpit of trying to define some cross-distro standard for package naming.
-n -- Nathaniel J. Smith -- https://vorpus.org
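For what it's worth, a distro-specific tag could plausibly be derived from /etc/os-release; the tag format in this sketch is invented and would have to be pinned down by the eventual PEP:

    # Sketch: build a hypothetical distro-specific platform tag such as
    # "debian_8_x86_64" from /etc/os-release. The tag format is invented
    # for illustration and is not part of any accepted spec.
    import platform
    import re

    def distro_platform_tag(os_release_path="/etc/os-release"):
        info = {}
        with open(os_release_path) as f:
            for line in f:
                key, sep, value = line.strip().partition("=")
                if sep:
                    info[key] = value.strip('"')
        distro = re.sub(r"[^a-z0-9]", "_", info.get("ID", "unknown").lower())
        version = re.sub(r"[^0-9.]", "", info.get("VERSION_ID", "0")).replace(".", "_")
        arch = platform.machine()  # e.g. "x86_64", "armv7l"
        return "%s_%s_%s" % (distro, version, arch)

    # On a Debian 8 machine this returns something like "debian_8_x86_64";
    # on a Raspbian system it might give "raspbian_8_armv7l".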

On Jan 29, 2016, at 2:35 PM, Nate Coraor <nate@bx.psu.edu> wrote:
Is there a distro-specific wheel tagging PEP in development somewhere that I missed? If not, I will get the ball rolling on it.
I think this is a great idea, and I think it actually pairs nicely with the manylinux proposal. It should be pretty easy to cover the vast bulk of users with a handful of platform-specific wheels (1-3ish) and then a manylinux wheel to cover the rest. It would let a project use newer toolchains/libraries in the common case, but still fall back to the older ones on more unusual platforms. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Jan 29, 2016, at 8:44 PM, Donald Stufft <donald@stufft.io> wrote:
On Jan 29, 2016, at 2:35 PM, Nate Coraor <nate@bx.psu.edu <mailto:nate@bx.psu.edu>> wrote:
Is there a distro-specific wheel tagging PEP in development somewhere that I missed? If not, I will get the ball rolling on it.
I think this is a great idea, and I think it actually pairs nicely with the manylinux proposal. It should be pretty easy to cover the vast bulk of users with a handful of platform-specific wheels (1-3ish) and then a manylinux wheel to cover the rest. It would let a project use newer toolchains/libraries in the common case, but still fall back to the older ones on more unusual platforms.
Yes! This would be fantastic. There are some libraries you actually want to dynamically link against from the platform, especially if you're writing desktop apps. On OS X you can do this because /System/*/ is more or less fixed when you are >= some version; on linux less so but it would be very nice to build artifacts for specific versions when possible. -glyph

On Fri, Jan 29, 2016 at 11:44 PM, Donald Stufft <donald@stufft.io> wrote:
On Jan 29, 2016, at 2:35 PM, Nate Coraor <nate@bx.psu.edu> wrote:
Is there a distro-specific wheel tagging PEP in development somewhere that I missed? If not, I will get the ball rolling on it.
I think this is a great idea, and I think it actually pairs nicely with the manylinux proposal. It should be pretty easy to cover the vast bulk of users with a handful of platform-specific wheels (1-3ish) and then a manylinux wheel to cover the rest. It would let a project use newer toolchains/libraries in the common case, but still fall back to the older ones on more unusual platforms.
Fantastic, this is exactly the sort of usage I was hoping to see. I'll move forward with it, then. --nate

On 30 January 2016 at 05:35, Nate Coraor <nate@bx.psu.edu> wrote:
On Fri, Jan 22, 2016 at 6:29 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
For the time being, these users should either pass the "--no-binary" option to pip, ask their distro to provide an index of pre-built wheel files for that distro (once we have the distro-specific wheel tagging PEP sorted out), or else ask their distro to update system Python packages in a more timely fashion (or all of the above).
Is there a distro-specific wheel tagging PEP in development somewhere that I missed? If not, I will get the ball rolling on it.
Yeah, the "we" there was actually "Nate", since you recently mentioned having made progress on this for Galaxy :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 22 January 2016 at 19:33, M.-A. Lemburg <mal@egenix.com> wrote:
For example, if a package needs a specific version of libpng, the package author can document this and the user can then make sure to install that particular version.
The assumption that any given Python user will know how to do this is not a reasonable assumption in 2016. If a publisher wants to bundle a particular version of libpng, they can. If (as is also entirely reasonable) they don't want to assume the associated responsibilities for responding to CVEs, then they can stick with source distributions, or target more specific Linux versions (as previously discussed in the context of Nate Coraor's Starforge work) Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Jan 22, 2016 at 5:42 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 22 January 2016 at 19:33, M.-A. Lemburg <mal@egenix.com> wrote:
For example, if a package needs a specific version of libpng, the package author can document this and the user can then make sure to install that particular version.
The assumption that any given Python user will know how to do this is not a reasonable assumption in 2016.
If a publisher wants to bundle a particular version of libpng, they can. If (as is also entirely reasonable) they don't want to assume the associated responsibilities for responding to CVEs, then they can stick with source distributions, or target more specific Linux versions (as previously discussed in the context of Nate Coraor's Starforge work)
I wonder if, in relation to this, it may be best to have two separate tags: one to indicate that the wheel includes external libraries rolled in to it, and one to indicate that it doesn't. That way, a user can make a conscious decision as to whether they want to install any wheels that could include libraries that won't be maintained by the distribution package manager. That way if we end up in a future world where manylinux wheels and distro-specific wheels (that may depend on non-default distro packages) live in PyPI together, there'd be a way to indicate a preference. --nate

On 30 January 2016 at 05:30, Nate Coraor <nate@bx.psu.edu> wrote:
I wonder if, in relation to this, it may be best to have two separate tags: one to indicate that the wheel includes external libraries rolled in to it, and one to indicate that it doesn't. That way, a user can make a conscious decision as to whether they want to install any wheels that could include libraries that won't be maintained by the distribution package manager. That way if we end up in a future world where manylinux wheels and distro-specific wheels (that may depend on non-default distro packages) live in PyPI together, there'd be a way to indicate a preference.
I don't think we want to go into that level of detail in the platform tag, but metadata for bundled pre-built binaries in wheels and vendored dependencies in sdists is worth considering as an enhancement in its own right. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Jan 29, 2016 at 9:14 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 30 January 2016 at 05:30, Nate Coraor <nate@bx.psu.edu> wrote:
I wonder if, in relation to this, it may be best to have two separate tags: one to indicate that the wheel includes external libraries rolled in to it, and one to indicate that it doesn't. That way, a user can make a conscious decision as to whether they want to install any wheels that could include libraries that won't be maintained by the distribution package manager. That way if we end up in a future world where manylinux wheels and distro-specific wheels (that may depend on non-default distro packages) live in PyPI together, there'd be a way to indicate a preference.
I don't think we want to go into that level of detail in the platform tag, but metadata for bundled pre-built binaries in wheels and vendored dependencies in sdists is worth considering as an enhancement in its own right.
I thought the same thing - the only reason I proposed tags is that it was my understanding that such metadata is not available to installation tool(s) until the distribution is fetched and inspected. If my limited understanding is incorrect, then I agree that having this in the tags is too much. --nate

On Jan 21, 2016, at 4:31 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
If Donald can provide the list of "most downloaded wheel files" for other platforms, that could also be a useful guide as to how many source builds may potentially already be avoided through the draft "manylinux1" definition.
    SELECT
        COUNT(*) AS downloads,
        file.filename
    FROM
        TABLE_DATE_RANGE(
            [long-stack-762:pypi.downloads],
            TIMESTAMP("20160114"),
            CURRENT_TIMESTAMP()
        )
    WHERE
        file.type = 'bdist_wheel'
        AND NOT REGEXP_MATCH(file.filename, '^.*-none-any.whl$')
    GROUP BY
        file.filename
    ORDER BY
        downloads DESC
    LIMIT 1000
https://gist.github.com/dstufft/1dda9a9f87ee7121e0ee
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Jan 21, 2016, at 12:18 PM, Donald Stufft <donald@stufft.io> wrote:
On Jan 21, 2016, at 4:31 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
If Donald can provide the list of "most downloaded wheel files" for other platforms, that could also be a useful guide as to how many source builds may potentially already be avoided through the draft "manylinux1" definition.
Or https://gist.github.com/dstufft/ea8a95580b022b233635 if you prefer it grouped by project. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Thu, Jan 21, 2016 at 9:23 AM, Donald Stufft <donald@stufft.io> wrote:
On Jan 21, 2016, at 4:31 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: If Donald can provide the list of "most downloaded wheel files" for other platforms, that could also be a useful guide as to how many source builds may potentially already be avoided through the draft "manylinux1" definition.
Or https://gist.github.com/dstufft/ea8a95580b022b233635 if you prefer it grouped by project.
I went through this list and compiled manylinux1 wheels for each of the top 15 projects in the list (py35). The wheels are here, if you're interested: http://stanford.edu/~rmcgibbo/wheelhouse/. The amount of work was pretty small -- the complete Dockerfile for this, building off the quay.io/manylinux/manylinux image mentioned in the PEP draft, is here: https://github.com/rmcgibbo/manylinux/blob/popular/build-popular/Dockerfile -Robert

I went through this list and compiled manylinux1 wheels for each of the top 15 projects in the list (py35). The wheels are here, if you're interested http://stanford.edu/~rmcgibbo/wheelhouse
Cool! Are the non-manylinux dependencies all statically linked? -CHB

On Thu, Jan 21, 2016 at 1:31 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I think it's better to start with a small core that we *know* works, then expand later, rather than trying to make the first iteration too wide. The "manylinux1" tag itself is versioned (hence the "1" at the end), so "manylinux2" may simply have *more* libraries defined, rather than newer ones.
We wouldn't even necessarily need to bump the version number, since the main thing the "manylinux1" tag does is serve as a notice to pip about whether a wheel is likely to work on a given system. If after experience we realize that in fact there are more libraries that would work on those same systems, then we can release a "manylinux 1.1" document that updates the guidance to package authors but continues to use the "manylinux1" tag and leaves the same code in pip. (And ditto if we realize that there are libraries on the list that shouldn't be.) -n -- Nathaniel J. Smith -- https://vorpus.org

Hi Nathaniel and Robert,
This is a really nice proposal. I would only like to see added to this proposal (or another one) that new versions of installers should offer an option to opt in or out of binary packages from specific sources. Right now pip has the switches:
- --no-binary <format_control>
- --only-binary <format_control>
These switches (dis)allow downloading binaries for specific packages. I think it would be nice to have the following options as well:
- --no-binary-from <origin_control>
- --only-binary-from <origin_control>
This way I could block binaries from PyPI but allow from my private index where they were compiled with my favorite compiler or on the closest platform possible to my production, or with my favorite compilation flags present in the environment. As Robert quoted, the "speed and correctness implications to always reverting to the lowest common denominator of glibc" will result in some deployments preferring to set up their own indexes or "find-links" repositories for storing optimized binary packages while allowing PyPI to provide the rest.
Alternatively, the `<format_control>` specification above could be expanded to allow not only specifying package names, but also package names with their respective sources.
Regards, Leo

On 21 January 2016 at 23:30, Leonardo Rochael Almeida <leorochael@gmail.com> wrote:
I think it would be nice to have the following options as well:
--no-binary-from <origin_control> --only-binary-from <origin_control>
This way I could block binaries from PyPI but allow from my private index where they were compiled with my favorite compiler or on the closest platform possible to my production, or with my favorite compilation flags present in the environment.
I quite like that idea, but I don't think it needs to be linked to the PEP - it can just be an enhancement request for pip (it applies just as much to other platforms as it does to Linux, and even for Linux, private indices can already be used to host prebuilt Linux binaries). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Jan 20, 2016, at 10:55 PM, Nathaniel Smith <njs@pobox.com> wrote:
The permitted external shared libraries are: ::
libpanelw.so.5 libncursesw.so.5 libgcc_s.so.1 libstdc++.so.6 libm.so.6 libdl.so.2 librt.so.1 libcrypt.so.1 libc.so.6 libnsl.so.1 libutil.so.1 libpthread.so.0 libX11.so.6 libXext.so.6 libXrender.so.1 libICE.so.6 libSM.so.6 libGL.so.1 libgobject-2.0.so.0 libgthread-2.0.so.0 libglib-2.0.so.0
Forgive my, probably stupid, question… but why these libraries? Are these libraries unable to be reasonably statically linked like glibc is? ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Jan 21, 2016 9:32 AM, "Donald Stufft" <donald@stufft.io> wrote:
On Jan 20, 2016, at 10:55 PM, Nathaniel Smith <njs@pobox.com> wrote:
The permitted external shared libraries are: ::
libpanelw.so.5 libncursesw.so.5 libgcc_s.so.1 libstdc++.so.6 libm.so.6 libdl.so.2 librt.so.1 libcrypt.so.1 libc.so.6 libnsl.so.1 libutil.so.1 libpthread.so.0 libX11.so.6 libXext.so.6 libXrender.so.1 libICE.so.6 libSM.so.6 libGL.so.1 libgobject-2.0.so.0 libgthread-2.0.so.0 libglib-2.0.so.0
Forgive my, probably stupid, question… but why these libraries? Are these libraries unable to be reasonably statically linked like glibc is?
It's not a stupid question at all :-).
What we want is a list of libraries that are present, and compatible, and relevant, across the Linux distributions that most people use in practice.
Unfortunately, there isn't really any formal specification or published data on this -- the only way to make such a list is to make a guess, start distributing software to lots of people based on your guess, wait for bug reports to come in, edit your list to avoid the problematic libraries, add some more libraries to your list to cover the new packages you've added to your distribution and where you think the risk is a worthwhile trade off versus just static linking, then repeat until you stop getting bug reports.
This is expensive and time consuming and requires data we don't have, so we delegated: the list in the PEP is exactly the set of libraries that Continuum's Anaconda distribution links to and a subset of the libraries that Enthought's Canopy links to [1]. They've both been going through the loop above for years, with lots of packages, and lots and lots of downloads, so we're piggybacking off their effort.
Here's the list of packages in Anaconda: http://docs.continuum.io/anaconda/pkg-docs
-n
[1] why only a subset of Canopy's libraries? Because when we analyzed the latest release of Canopy we found several weird libraries that we knew must be mistakes, pulled in by code paths that were never used in practice, so we decided to be conservative about trusting their library list. We're in contact with their distribution folks and might add some more libraries to this list if they come back to us and assure us that those libraries have actually been tested.

On Jan 21, 2016, at 1:43 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Jan 21, 2016 9:32 AM, "Donald Stufft" <donald@stufft.io <mailto:donald@stufft.io>> wrote:
On Jan 20, 2016, at 10:55 PM, Nathaniel Smith <njs@pobox.com <mailto:njs@pobox.com>> wrote:
The permitted external shared libraries are: ::
libpanelw.so.5 libncursesw.so.5 libgcc_s.so.1 libstdc++.so.6 libm.so.6 libdl.so.2 librt.so.1 libcrypt.so.1 libc.so.6 libnsl.so.1 libutil.so.1 libpthread.so.0 libX11.so.6 libXext.so.6 libXrender.so.1 libICE.so.6 libSM.so.6 libGL.so.1 libgobject-2.0.so.0 libgthread-2.0.so.0 libglib-2.0.so.0
Forgive my, probably stupid, question… but why these libraries? Are these libraries unable to be reasonably statically linked like glibc is?
It's not a stupid question at all :-).
What we want is a list of libraries that are present, and compatible, and relevant, across the Linux distributions that most people use in practice.
Unfortunately, there isn't really any formal specification or published data on this -- the only way to make such a list is to make a guess, start distributing software to lots of people based on your guess, wait for bug reports to come in, edit your list to avoid the problematic libraries, add some more libraries to your list to cover the new packages you've added to your distribution and where you think the risk is a worthwhile trade off versus just static linking, then repeat until you stop getting bug reports.
This is expensive and time consuming and requires data we don't have, so we delegated: the list in the PEP is exactly the set of libraries that Continuum's Anaconda distribution links to and a subset of the libraries that Enthought's Canopy links to [1]. They've both been going through the loop above for years, with lots of packages, and lots and lots of downloads, so we're piggybacking off their effort.
Here's the list of packages in Anaconda: http://docs.continuum.io/anaconda/pkg-docs <http://docs.continuum.io/anaconda/pkg-docs>
I guess my underlying question is, if we’re considering static linking (or shipping the .so dll style) to be good enough for everything not on this list, why are these specific packages on the list? Why are we not selecting the absolute bare minimum packages that you *cannot* reasonably static link or ship the .so? ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Thu, Jan 21, 2016 at 12:08 PM, Donald Stufft <donald@stufft.io> wrote:
I guess my underlying question is, if we’re considering static linking (or shipping the .so dll style) to be good enough for everything not on this list, why are these specific packages on the list? Why are we not selecting the absolute bare minimum packages that you *cannot* reasonably static link or ship the .so?
This is a fair question. The principal, practical reason is that we followed the lead of what other projects have done here for distributing cross-distro binaries, especially Anaconda and Canopy. I also just looked at the external libraries required by the portable Firefox Linux binaries (https://www.mozilla.org/en-US/firefox/43.0.4/system-requirements/). The additional shared libraries that the Firefox pre-compiled binaries require that are not included in our list are:
libXcomposite.so.1 libXdamage.so.1 libXfixes.so.3 libXt.so.6 libasound.so.2 libatk-1.0.so.0 libcairo.so.2 libdbus-1.so.3 libdbus-glib-1.so.2 libfontconfig.so.1 libfreetype.so.6 libgdk-x11-2.0.so.0 libgdk_pixbuf-2.0.so.0 libgio-2.0.so.0 libgmodule-2.0.so.0 libgtk-x11-2.0.so.0 libpango-1.0.so.0 libpangocairo-1.0.so.0 libpangoft2-1.0.so.0
I would be open to including some of these libraries in the manylinux1 policy, or in a subsequent update (manylinux2, etc). -Robert

On Jan 21, 2016 12:08 PM, "Donald Stufft" <donald@stufft.io> wrote:
On Jan 21, 2016, at 1:43 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Jan 21, 2016 9:32 AM, "Donald Stufft" <donald@stufft.io> wrote:
On Jan 20, 2016, at 10:55 PM, Nathaniel Smith <njs@pobox.com> wrote:
The permitted external shared libraries are: ::
libpanelw.so.5 libncursesw.so.5 libgcc_s.so.1 libstdc++.so.6 libm.so.6 libdl.so.2 librt.so.1 libcrypt.so.1 libc.so.6 libnsl.so.1 libutil.so.1 libpthread.so.0 libX11.so.6 libXext.so.6 libXrender.so.1 libICE.so.6 libSM.so.6 libGL.so.1 libgobject-2.0.so.0 libgthread-2.0.so.0 libglib-2.0.so.0
Forgive my, probably stupid, question… but why these libraries? Are these libraries unable to be reasonably statically linked like glibc is?
It's not a stupid question at all :-).
What we want is a list of libraries that are present, and compatible, and relevant, across the Linux distributions that most people use in practice.
Unfortunately, there isn't really any formal specification or published data on this -- the only way to make such a list is to make a guess, start distributing software to lots of people based on your guess, wait for bug reports to come in, edit your list to avoid the problematic libraries, add some more libraries to your list to cover the new packages you've added to your distribution and where you think the risk is a worthwhile trade off versus just static linking, then repeat until you stop getting bug reports.
This is expensive and time consuming and requires data we don't have, so we delegated: the list in the PEP is exactly the set of libraries that Continuum's Anaconda distribution links to and a subset of the libraries that Enthought's Canopy links to [1]. They've both been going through the loop above for years, with lots of packages, and lots and lots of downloads, so we're piggybacking off their effort.
Here's the list of packages in Anaconda: http://docs.continuum.io/anaconda/pkg-docs
I guess my underlying question is, if we're considering static linking (or shipping the .so dll style) to be good enough for everything not on this list, why are these specific packages on the list? Why are we not selecting the absolute bare minimum packages that you *cannot* reasonably static link or ship the .so?
So, there are tradeoffs here of course. Some libraries are more or less difficult to statically link, and vendoring libraries does have costs in terms of maintenance, download sizes, memory usage, and so forth... it's complicated and varies from library to library. I suspect that even the people who have been dealing with these problems full time for years can't necessarily remember all the details of what happened that made them take particular decisions. (Both Anaconda and Canopy have made the decision to vendor libz, even though you'd think that libz would be a prime candidate for getting from the package manager, being both ubiquitous and historically notable for all the problems that vendoring it has caused in the past. Why do they do this? I don't know, and I haven't shipped millions of Linux binary packages so my guesses probably aren't worth that much :-).)
The list probably isn't perfect, but we do know that it will work, which is an advantage that other lists don't have. The other thing is the Mason-Dixon principle: we have to draw this <censored> line somewhere ;-). You're suggesting a more conservative list, Marc-Andre and Matthias want it to be more liberal, and of course any self-respecting Linux geek has a guess about exactly which libraries should be included (myself included). This bikeshed won't paint itself! So rather than open the door to endless petty debate and fiddling, just sticking with a known good list is pretty attractive.
-n

Nice idea, but....
libX11.so.6 libXext.so.6 libXrender.so.1 libGL.so.1
These are all X11, yes? pretty much any workstation will have these, but in general, servers won't. Someone on this thread suggested that that's OK -- don't expect a GUI package to work on a linux box without a GUI. But some of these libs might get used for stuff like back-end rendering, etc, so would be expected to work on a headless box. I think Anaconda and Canopy have gotten away with this because both of their user bases are primarily desktop data analysis type of stuff -- not web services, web servers, etc.
Then there is the "additional libs" problem. Again, Anaconda and Canopy can do this because they are providing those libs - lots of stuff you'd generally expect to be there in a typical *nix system: stuff like libpng, libjpeg, who knows what the heck else. So without a plan to provide all that stuff -- I'm not sure of the utility of this -- how are you going to get PIL/Pillow to work? Statically link up the ying-yang? Not sure the linux world will take to that.
Anyway, maybe we just need to try and see how it shakes out.... -Chris
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

Hi, On Thu, Jan 21, 2016 at 11:42 AM, Chris Barker <chris.barker@noaa.gov> wrote:
Nice idea, but....
libX11.so.6 libXext.so.6 libXrender.so.1 libGL.so.1
These are all X11, yes? pretty much any workstation will have these, but in general, servers won't.
Someone on this thread suggested that that's OK -- don't expect a GUI package to work on a linux box without a GUI. But some of these libs might get used for stuff like back-end rendering, etc, so would be expected to work on a headless box. I think Anaconda and Canopy have gotten away with this because both of their user bases are primarily desktop data analysis type of stuff -- not web services, web servers, etc.
Someone administering a headless server will surely be able to cope with that problem, and the trade-off of convenience for the large majority of users seems like a good one to me.
Then there is the "additional libs" problem. Again, Anaconda and Canopy can do this because they are providing those libs - lots of stuff you'd generally expect to be there in a typical *nix system: stuff like libpng, libjpeg, who knows what the heck else.
So without a plan to provide all that stuff -- I'm not sure of the utility of this -- how are you going to get PIL/Pillow to work? Statically link up the ying-yang? Not sure the linux world will take to that.
That's exactly how the current OSX Pillow wheels work, and they've been working fine for a while now. There just aren't that many libraries to worry about for the vast majority of packages. Cheers, Matthew

On Thu, Jan 21, 2016 at 11:48 AM, Matthew Brett <matthew.brett@gmail.com> wrote:
So without a plan to provide all that stuff -- I'm not sure of the utility of this -- how are you going to get PIL/Pillow to work? Statically link up the ying-yang? Not sure the linux world will take to that.
That's exactly how the current OSX Pillow wheels work, and they've been working fine for a while now. There just aren't that many libraries to worry about for the vast majority of packages.
well, that was really intended to be only an example. And OS-X provides more basic libs than manylinux anyway (zlib, freetype, either png or jpeg, can't remember which). The library list got long enough to drive me crazy -- I guess you've got more patience than I have. Tried to build any OSGEO stuff lately? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

Hi, On Thu, Jan 21, 2016 at 11:57 AM, Chris Barker <chris.barker@noaa.gov> wrote:
On Thu, Jan 21, 2016 at 11:48 AM, Matthew Brett <matthew.brett@gmail.com> wrote:
So without a plan to provide all that stuff -- I'm not sure of the utility of this -- how are you going to get PIL/Pillow to work? Statically link up the ying-yang? Not sure the linux world will take to that.
That's exactly how the current OSX Pillow wheels work, and they've been working fine for a while now. There just aren't that many libraries to worry about for the vast majority of packages.
well, that was really intended to be only an example. And OS-X provides more basic libs than manylinux anyway (zlib, freetype, either png or jpeg, can't remember which).
The library list got long enough to drive me crazy -- I guess you've got more patience than I have. Tried to build any OSGEO stuff lately?
I'm sure there are packages that would be hard to build, but for the vast majority of packages, getting a build recipe is a one-time only job which might (in bad cases) take a day or so, and then can be maintained from time to time by the package maintainer. The Pillow wheel builder is a good example - I built the prototype, but the Pillow guys own and maintain it now. I don't think it's sensible to veto all linux wheels because there are some packages that will be hard to build. Cheers, Matthew

On Jan 21, 2016 11:43 AM, "Chris Barker" <chris.barker@noaa.gov> wrote:
Nice idea, but....
libX11.so.6 libXext.so.6 libXrender.so.1 libGL.so.1
These are all X11, yes? pretty much any workstation will have these, but
in general, servers won't.
Someone on this thread suggested that that's OK -- don't expect a GUI
package to work on a linux box without a GUI. But some of these libs might get used for stuff like back-end rendering, etc., so would be expected to work on a headless box. I think Anaconda and Canopy have gotten away with this because both of their user bases are primarily doing desktop data analysis type of stuff -- not web services, web servers, etc.

I can only speak for myself and my team, but we use Anaconda on servers on a daily basis, including with libraries like matplotlib to generate images that are displayed over a web service. I believe this is a pretty common use case, especially with popular apps like Jupyter servers.

On Thu, 21 Jan 2016 11:42:57 -0800 Chris Barker <chris.barker@noaa.gov> wrote:
Nice idea, but....
libX11.so.6 libXext.so.6 libXrender.so.1 libGL.so.1
These are all X11, yes? pretty much any workstation will have these, but in general, servers won't.
For the record, not having or not running an X11 server doesn't mean you don't have a subset of the corresponding *libraries*. For example, I have a Debian jessie server and, while there's no X11-related executable on the machine, I still have the following libraries installed:

$ dpkg -l | grep X
[...]
ii libpangox-1.0-0:amd64 0.0.2-5 amd64 pango library X backend
ii libpixman-1-0:amd64 0.32.6-3 amd64 pixel-manipulation library for X and cairo
ii libpod-latex-perl 0.61-1 all module to convert Pod data to formatted LaTeX
ii libx11-6:amd64 2:1.6.2-3 amd64 X11 client-side library
ii libx11-data 2:1.6.2-3 all X11 client-side library
ii libxau6:amd64 1:1.0.8-1 amd64 X11 authorisation library
ii libxcb-render0:amd64 1.10-3+b1 amd64 X C Binding, render extension
ii libxcb-shm0:amd64 1.10-3+b1 amd64 X C Binding, shm extension
ii libxcb1:amd64 1.10-3+b1 amd64 X C Binding
ii libxdmcp6:amd64 1:1.1.1-1+b1 amd64 X11 Display Manager Control Protocol library
ii libxext6:amd64 2:1.3.3-1 amd64 X11 miscellaneous extension library
ii libxft2:amd64 2.3.2-1 amd64 FreeType-based font drawing library for X
ii libxml2:amd64 2.9.1+dfsg1-5+deb8u1 amd64 GNOME XML library
ii libxpm4:amd64 1:3.5.11-1+b1 amd64 X11 pixmap library
ii libxrender1:amd64 1:0.9.8-1+b1 amd64 X Rendering Extension client library
ii xkb-data 2.12-1 all X Keyboard Extension (XKB) configuration data
[...]

Regards

Antoine.

On Thu, Jan 21, 2016 at 7:42 PM, Chris Barker <chris.barker@noaa.gov> wrote:
Nice idea, but....
libX11.so.6 libXext.so.6 libXrender.so.1 libGL.so.1
These are all X11, yes? pretty much any workstation will have these, but in general, servers won't.
Those would be required by GUI packages. People who know how to install headless systems would know how to handle errors because of missing sonames. I can say from experience that this problem does not happen often. Note that this set of libraries was built from both what Anaconda and Canopy do.
Someone on this thread suggested that that's OK -- don't expect a GUI package to work on a linux box without a GUI. But some of these libs might get used for stuff like back-end rendering, etc., so would be expected to work on a headless box. I think Anaconda and Canopy have gotten away with this because both of their user bases are primarily doing desktop data analysis type of stuff -- not web services, web servers, etc.
This is not true for us at Enthought, and I would be surprised if it were for anaconda.
So without a plan to provide all that stuff -- I'm not sure of the utility of this -- how are you going to get PIL/Pillow to work? Statically link up the ying-yang? Not sure the linux world will take to that.
We can explain how things work in detail for some packages, but the main rationale for the PEP list is that this is a list that works in practice. It has worked well for us at Enthought for many years, and it has for (Ana)conda as well. Between both distributions, we are talking about millions of installs over the years, on many different systems. David

On Thu, Jan 21, 2016 at 10:04 PM, David Cournapeau <cournape@gmail.com> wrote:
On Thu, Jan 21, 2016 at 7:42 PM, Chris Barker <chris.barker@noaa.gov> wrote:
Nice idea, but....
libX11.so.6 libXext.so.6 libXrender.so.1 libGL.so.1
These are all X11, yes? pretty much any workstation will have these, but in general, servers won't.
Those would be required by GUI packages. People who know how to install headless systems would know how to handle errors because of missing sonames.
I would also mention that at Enthought, we have a fairly comprehensive testing policy for our packages (e.g. running the test suite of a package at every build as much as possible). We do the testing under headless environments on every platform as much as possible, and it works for maybe 95% of the packages (including things like matplotlib, etc.). E.g. on Linux, most packages work well with a framebuffer if you only care about offline rendering or testing. David
I can say from experience that this problem does not happen often.
Note that this set of libraries was built from both what Anaconda and Canopy do.
Someone on this thread suggested that that's OK -- don't expect a GUI package to work on a linux box without a GUI. But some of these libs might get used for stuff like back-end rendering, etc., so would be expected to work on a headless box. I think Anaconda and Canopy have gotten away with this because both of their user bases are primarily doing desktop data analysis type of stuff -- not web services, web servers, etc.
This is not true for us at Enthought, and I would be surprised if it were for anaconda.
So without a plan to provide all that stuff -- I'm not sure of the utility of this -- how are you going to get PIL/Pillow to work? Statically link up the ying-yang? Not sure the linux world will take to that.
We can explain how things work in detail for some packages, but the main rationale for the PEP list is that this is a list that works in practice. It has worked well for us at Enthought for many years, and it has for (Ana)conda as well. Between both distributions, we are talking about millions of installs over the years, on many different systems.
David
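(For illustration only, not from the thread: the "framebuffer" approach David mentions is usually done with Xvfb, a virtual X server. A minimal sketch, assuming the xvfb package is installed and run_tests.py is a placeholder for a package's test entry point:)

$ xvfb-run -a python run_tests.py    # start a throwaway X display and run the tests against it
# or keep a display running by hand:
$ Xvfb :99 &
$ DISPLAY=:99 python run_tests.py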

So without a plan to provide all that stuff -- I'm not sure of the utility of this -- how are you going to get PIL/Pillow to work? Statically link up the ying-yang? Not sure the linux world will take to that.
We can explain how things work in detail for some packages, but the main rationale for the PEP list is that this is a list that works in practice. It has worked well for us at Enthought for many years, and it has for (Ana)conda as well.

My point was that it works in those cases because Enthought and Continuum provide a bunch of (often hard to build) packages that provide all the extra stuff. Maybe the community will spring forth and do that -- I'm skeptical because I tried to do that for years for OS-X and it was just too much to do. And the infrastructure was there. Before pip and wheel there were mpkgs on OS-X, and repos of rpms for Linux years before that -- but always the result of a couple people's heroic efforts. Maybe the infrastructure has improved, and the community grown enough, that this will all work. We'll see. CHB

Between both distributions, we are talking about millions of installs over the years, on many different systems. David

On Thu, Jan 21, 2016 at 6:47 PM, Chris Barker - NOAA Federal < chris.barker@noaa.gov> wrote:
Maybe the infrastructure has improved, and the community grown enough, that this will all work. We'll see.
Yeah, that's my hope too. Currently, the community lacks the permissions to upload Linux wheels to PyPI. Given that, it's no surprise that the community hasn't formed yet. I'm hopeful that in the future, if this PEP is accepted and we can make the tooling and documentation excellent, uploading Linux wheels can start to become a standard part of the PyPI release cycle for package maintainers. -Robert

On Jan 21, 2016, at 10:27 PM, Robert T. McGibbon <rmcgibbo@gmail.com> wrote:
I'm hopeful that in the future, if this PEP is accepted and we can make the tooling and documentation excellent, uploading Linux wheels can start to become a standard part of the PyPI release cycle for package maintainers.
Longer term, I really want to (but have put zero effort into trying to plan it out beyond pie in the sky dreams) have it so that maintainers can publish a sdist to PyPI, and have PyPI automatically build wheels for you. This is a long way down the road though. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Thu, Jan 21, 2016 at 7:29 PM, Donald Stufft <donald@stufft.io> wrote:
On Jan 21, 2016, at 10:27 PM, Robert T. McGibbon <rmcgibbo@gmail.com> wrote:
I'm hopeful that in the future, if this PEP is accepted and we can make the tooling and documentation excellent, uploading Linux wheels can start to become a standard part of the PyPI release cycle for package maintainers.
Longer term, I really want to (but have put zero effort into trying to plan it out beyond pie in the sky dreams) have it so that maintainers can publish a sdist to PyPI, and have PyPI automatically build wheels for you. This is a long way down the road though.
Yes, absolutely! I think this will actually not be _too_ difficult for Linux (because docker). The challenges for Windows and OS X are more significant. For the open source communities that I'm involved in, Travis-CI has really made everyone much more aware and comfortable with Linux container-based web services that compile our packages, so a PyPI wheel farm seems very much within reach over the next year or so. -Robert

2016-01-22 3:47 GMT+01:00 Chris Barker - NOAA Federal <chris.barker@noaa.gov>:
Maybe the community will spring forth and do that -- I'm skeptical because I tried to do that for years for OS-X and it was just too much to do. And the infrastructure was there.
Before pip and wheel there were mpkgs on OS-X, and repos of rpms for Linux years before that -- but always the result of a couple people's heroic efforts.
Maybe the infrastructure has improved, and the community grown enough, that this will all work. We'll see.
I think the infrastructure has improved. For instance I invested some time and effort to provide a template configuration to build C/C++ compiled extensions for windows wheels on top of the free AppVeyor.com CI platform [1]. Since then this build configuration has been integrated in the python packaging documentation [2] and I had the opportunity to present that work at PyCon [3] (and PyCon FR) last year. Now the community of project maintainers has picked it up. I can count more than 300 projects using this build setup on github. I have very little work to do to help them maintain that infrastructure nowadays. Even the configuration upgrade to make it work with MSVC 2015 / Python 3.5 was contributed to my repo before I could find the time to investigate the issue myself. My point is that once we have clearly defined best-practices for packaging and convenient tools to build the packages automatically and test that they work as expected (e.g. free hosted CI that supports running an old centos-based docker container), I am rather confident that the community will do the work. It's mostly a matter of providing good tools and education resources (documentation and example configuration templates). [1] https://github.com/ogrisel/python-appveyor-demo/ [2] https://www.appveyor.com/docs/packaging-artifacts [3] https://www.youtube.com/watch?v=d-p6lJJObLU -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel

OK, I'll try to stop being emotional here :-) 2016-01-22 3:47 GMT+01:00 Chris Barker - NOAA Federal <chris.barker@noaa.gov>:
I'm skeptical because I tried to do that for years for OS-X and it was just too much to do. And the infrastructure was there.
My point is that once we have clearly defined best-practices for packaging and convenient tools to build the packages automatically and test that they work as expected (e.g. free hosted CI that support running an old centos-based docker container), I am rather confident that the community will do the work.
OK -- here is the emotional part -- I worked for years to try to get support for "clearly defined best-practices for packaging and convenient tools to build the packages automatically", primarily for OS-X. I got zero support -- nada -- nothing. Really. A handful of people did their own thing to support the community, but no cooperation or standards -- each package built was done with a lot of hand work and each in its own way. So when I found the conda community working on common tools and methods, it was very refreshing. But OK -- maybe times have changed.

By the way, I'm in the middle of some build hell with conda -- it's doing some really weird stuff with finding shared libs provided by other conda packages -- so maybe I'm open to looking at wheels again :-)

But my concern is not the base libs -- that will get Linux on par with Windows and OS-X. My concern is with third-party libs; what's the plan there?

1) each package that needs a third-party lib statically links it in.
2) each package that needs a third-party lib provides it, linked with a relative path (IIUC, that's how most Windows packages are done).
3) We establish some standard for providing binary libs as wheels, so that other packages can depend on them and link to them.

1) is a pain in the %^# with gcc and linux, which really likes to dynamically link.
2) seems to have been made pretty easy with auditwheel -- nice!
3) seems like the "proper" way to go. Somehow making everyone figure out how to build and ship those deps, and then bloating the wheels and installations and binaries, feels wrong to me.

We've been using the example of, say, libpng, in this discussion -- that's a good example, because it's pretty commonly used, but not part of all base distributions. But it's also pretty easy to build and pretty small. So let's look at a couple examples I've dealt with:

The netcdf / hdf5 stack -- there is a pynetcd4 package, which requires the C netcdf lib, which requires libhdf5, which requires libcurl, zlib (and a few others, I think) -- non-trivial to build and ship. Also, there are at least two other commonly used python packages I know of that use hdf5: pytables and h5py. So it would be really nice if all these python packages didn't need to solve the build issues and ship all these libs themselves, resulting in possibly incompatible versions all loaded into python all at once. Oh, and as it happens, I've got my obscure python package that uses a bunch of C code that also needs libnetcdf....

Then there is the OpenGIS stack: there is the geos lib and proj4 lib, that most other things are built on. There is the GDAL/OGR lib, that requires those, plus (optionally) a lot of other ugly libs (this is ugly to build, believe me). And GDAL comes with its own python bindings, but there is also shapely, that wraps geos, and pyproj4 that wraps proj4, and fiona that wraps OGR, and .... A big interconnected web. (Oh, fiona and shapely also require numpy at the binary level...)

I can only imagine that the image processing or video or audio processing communities have similar piles of interconnected packages. (I know I've tried, unsuccessfully, to get FFMPEG to work...)

So all this requires, I think, (3) to get anywhere -- is the community ready to support such an effort? And this is going to require some more tooling, too. Somewhere on this thread, someone suggested there may be a videolinuxapi, or some such.
Perhaps the better way is to have a core base (such as manylinux), and then a handful of binary lib collections, each as a single wheel: osgeowheel, hdf-wheel, audio-wheel, image-wheel. Hmm -- kind of liking that idea, actually. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
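(Sketch added for illustration, not from the thread: option (2) above is roughly what auditwheel automates -- it copies the external libs into the wheel and points the extension modules at them via a relative path. The package name below is a placeholder, and the exact flags may differ between auditwheel versions:)

$ pip wheel mypkg                                            # build against whatever libs are on the build box
$ auditwheel show mypkg-1.0-cp27-cp27mu-linux_x86_64.whl     # report which external shared libs / symbol versions it needs
$ auditwheel repair mypkg-1.0-cp27-cp27mu-linux_x86_64.whl   # copy those libs into the wheel and retag it manylinux1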

On Sat, Jan 23, 2016 at 6:19 PM, Chris Barker <chris.barker@noaa.gov> wrote:
1) each package that needs a third-party lib statically links it in. 2) each package that needs a third-party lib provides it, linked with a relative path (IIUC, that's how most Windows packages are done). 3) We establish some standard for providing binary libs as wheels, so that other packages can depend on them and link to them.
In my view, *all* of these are valid options. I think much of this will need to be worked out by the communities -- especially if individual packages and subcommunities decide to take the option (3) approach. I hope this PEP will enable the communities involved in OpenGIS, audio processing, image processing, etc to work out the solutions that work for them and their users. Perhaps one thing that is missing from the PEP is an explicit statement that option (3) is compatible with the manylinux1 tag -- bundling is a valid solution, but it's not the *only* solution. -Robert

On 24 January 2016 at 12:31, Robert T. McGibbon <rmcgibbo@gmail.com> wrote:
On Sat, Jan 23, 2016 at 6:19 PM, Chris Barker <chris.barker@noaa.gov> wrote:
1) each package that needs a third-party lib statically links it in. 2) each package that needs a third-party lib provides it, linked with a relative path (IIUC, that's how most Windows packages are done). 3) We establish some standard for providing binary libs as wheels, so that other packages can depend on them and link to them.
In my view, all of these are valid options. I think much of this will need to be worked out by the communities -- especially if individual packages and subcommunities decide to take the option (3) approach. I hope this PEP will enable the communities involved in OpenGIS, audio processing, image processing, etc to work out the solutions that work for them and their users.
Perhaps one thing that is missing from the PEP is an explicit statement that option (3) is compatible with the manylinux1 tag -- bundling is a valid solution, but it's not the *only* solution.
I've long resisted the notion of defining our own cross-distro platform ABI, but the Docker build environment that was put together for the manylinux project has made me realise that doing that may not be as hellish in a post-Docker world as it would have been in a pre-Docker world. (Since we can go with the specification + reference implementation approach that CPython has used so successfully for so long, rather than having to have the build environment and ABI specification be entirely exhaustive). On Windows and Mac OS X, our binary compatibility policies for wheel files are actually pretty loose - it's "be binary compatible with the python.org builds for that platform, including linking against the appropriate C standard library", and that's about it. Upgrades to those ABIs are then driven by CPython switching to newer base compatibility levels (dropping end-of-life versions on the Windows side [1], and updating to new deployment target macros on the Mac OS X side). Folks with external dependencies either bundle them, skip publishing wheel files, or just let them fail at import time if the external dependency is missing. (Neither platform has an anti-bundling culture, though, so I assume a lot of folks go with the first option over the last one) If the aim is to bring Linux wheel support in line with Windows and Mac OS X, then rather than defining a *new* compatibility tag (which would require new pip clients to process), perhaps we could instead adopt a similarly loose policy on what the existing generic "linux" tag means as we have for Windows and Mac OS X: it could just mean wheel files that are binary compatible with the Python binaries in the "manylinux" build environment. The difference would then just be that the target Linux ABI would be defined by PyPA and the manylinux developers, rather than by python-dev. In terms of the concerns regarding the age of gcc needed to target CentOS 5.11, it would be good to know just what nominating CentOS 6.x as the baseline ABI instead would buy us - CentOS 5 is going on 9 years old (released 2007) and stopped receiving full updates back in 2014 [2], while RHEL/CentOS 6 is just over 5 years old and has another year of full updates left. The CentOS 6 ABI should still be old enough to be compatible with the Debian 6 ABI (current stable is Debian 8), as well as the Ubuntu 12.04 LTS ABI (Ubuntu 16.04 LTS is due out in a few months). Cheers, Nick. [1] https://www.python.org/dev/peps/pep-0011/#microsoft-windows [2] https://wiki.centos.org/About/Product -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Jan 24, 2016, at 7:08 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 24 January 2016 at 12:31, Robert T. McGibbon <rmcgibbo@gmail.com> wrote:
On Sat, Jan 23, 2016 at 6:19 PM, Chris Barker <chris.barker@noaa.gov> wrote:
1) each package that needs a third-party lib statically links it in. 2) each package that needs a third-party lib provides it, linked with a relative path (IIUC, that's how most Windows packages are done). 3) We establish some standard for providing binary libs as wheels, so that other packages can depend on them and link to them.
In my view, all of these are valid options. I think much of this will need to be worked out by the communities -- especially if individual packages and subcommunities decide to take the option (3) approach. I hope this PEP will enable the communities involved in OpenGIS, audio processing, image processing, etc to work out the solutions that work for them and their users.
Perhaps one thing that is missing from the PEP is an explicit statement that option (3) is compatible with the manylinux1 tag -- bundling is a valid solution, but it's not the *only* solution.
I've long resisted the notion of defining our own cross-distro platform ABI, but the Docker build environment that was put together for the manylinux project has made me realise that doing that may not be as hellish in a post-Docker world as it would have been in a pre-Docker world. (Since we can go with the specification + reference implementation approach that CPython has used so successfully for so long, rather than having to have the build environment and ABI specification be entirely exhaustive).
On Windows and Mac OS X, our binary compatibility policies for wheel files are actually pretty loose - it's "be binary compatible with the python.org builds for that platform, including linking against the appropriate C standard library", and that's about it. Upgrades to those ABIs are then driven by CPython switching to newer base compatibility levels (dropping end-of-life versions on the Windows side [1], and updating to new deployment target macros on the Mac OS X side). Folks with external dependencies either bundle them, skip publishing wheel files, or just let them fail at import time if the external dependency is missing. (Neither platform has an anti-bundling culture, though, so I assume a lot of folks go with the first option over the last one)
If the aim is to bring Linux wheel support in line with Windows and Mac OS X, then rather than defining a *new* compatibility tag (which would require new pip clients to process), perhaps we could instead adopt a similarly loose policy on what the existing generic "linux" tag means as we have for Windows and Mac OS X: it could just mean wheel files that are binary compatible with the Python binaries in the "manylinux" build environment. The difference would then just be that the target Linux ABI would be defined by PyPA and the manylinux developers, rather than by python-dev.
In terms of the concerns regarding the age of gcc needed to target CentOS 5.11, it would be good to know just what nominating CentOS 6.x as the baseline ABI instead would buy us - CentOS 5 is going on 9 years old (released 2007) and stopped receiving full updates back in 2014 [2], while RHEL/CentOS 6 is just over 5 years old and has another year of full updates left. The CentOS 6 ABI should still be old enough to be compatible with the Debian 6 ABI (current stable is Debian 8), as well as the Ubuntu 12.04 LTS ABI (Ubuntu 16.04 LTS is due out in a few months).
It's probably not worth it to try and reuse the existing linux tags. Prior to the release of pip 8, pip had a hardcoded check to not trust linux wheels from PyPI, even if one was allowed to be uploaded, because we weren't sure if some change was going to be required in pip to support whatever solution we came up with for linux wheels and we didn't want a bunch of old pips to suddenly start getting wheels they didn't know how to install. After thinking about it more, I decided that installing wheels is pretty simple and it's unlikely pip is going to have to change anything to support that particular tag, so in pip 8 I removed that hack. So it's been less than a week that a version of pip has existed that would install the generic linux platform anyways.

There is a similar sort of project to try and make it easy to build cross-distro compatible C/C++ things, called the "Holy Build Box" [1], so that certainly lends some extra weight behind the whole idea.

Another thing to consider is applicability outside of Python. When I was talking about the manylinux thing with some friends, one of them mentioned that they were interested in the same thing, but for the Rust language. If we define the manylinux platform, it may make sense to promote it to something that isn't so Python specific and define two things, the manylinux platform (possibly we can buy some domain name for it or something), and then a PEP that just integrates that platform with Python. Might not be worth it though.

[1] https://phusion.github.io/holy-build-box/

----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Sun, Jan 24, 2016 at 4:08 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 24 January 2016 at 12:31, Robert T. McGibbon <rmcgibbo@gmail.com> wrote:
On Sat, Jan 23, 2016 at 6:19 PM, Chris Barker <chris.barker@noaa.gov> wrote:
1) each package that needs a third-party lib statically links it in. 2) each package that needs a third-party lib provides it, linked with a relative path (IIUC, that's how most Windows packages are done). 3) We establish some standard for providing binary libs as wheels, so that other packages can depend on them and link to them.
In my view, all of these are valid options. I think much of this will need to be worked out by the communities -- especially if individual packages and subcommunities decide to take the option (3) approach. I hope this PEP will enable the communities involved in OpenGIS, audio processing, image processing, etc to work out the solutions that work for them and their users.
Perhaps one thing that is missing from the PEP is an explicit statement that option (3) is compatible with the manylinux1 tag -- bundling is a valid solution, but it's not the *only* solution.
I've long resisted the notion of defining our own cross-distro platform ABI, but the Docker build environment that was put together for the manylinux project has made me realise that doing that may not be as hellish in a post-Docker world as it would have been in a pre-Docker world. (Since we can go with the specification + reference implementation approach that CPython has used so successfully for so long, rather than having to have the build environment and ABI specification be entirely exhaustive).
On Windows and Mac OS X, our binary compatibility policies for wheel files are actually pretty loose - it's "be binary compatible with the python.org builds for that platform, including linking against the appropriate C standard library", and that's about it. Upgrades to those ABIs are then driven by CPython switching to newer base compatibility levels (dropping end-of-life versions on the Windows side [1], and updating to new deployment target macros on the Mac OS X side). Folks with external dependencies either bundle them, skip publishing wheel files, or just let them fail at import time if the external dependency is missing. (Neither platform has an anti-bundling culture, though, so I assume a lot of folks go with the first option over the last one)
If the aim is to bring Linux wheel support in line with Windows and Mac OS X, then rather than defining a *new* compatibility tag (which would require new pip clients to process), perhaps we could instead adopt a similarly loose policy on what the existing generic "linux" tag means as we have for Windows and Mac OS X: it could just mean wheel files that are binary compatible with the Python binaries in the "manylinux" build environment. The difference would then just be that the target Linux ABI would be defined by PyPA and the manylinux developers, rather than by python-dev.
It's an option I guess, though Donald's message below makes it rather less attractive :-). The other thing is that as compared to Windows or OS X, it requires much more attention to actually meet the target Linux ABI -- on Windows or OS X an out-of-the-box build for a simple project will more-often-than-not legitimately meet the ABI, and if you can make a package that also works on your office-mate's computer then it will probably work everywhere. On Linux, the way glibc versioning works means that just doing the obvious 'pip wheel' call will basically never give you a wheel that meets the ABI, and testing on your office-mate's computer proves nothing (except that you're both running Ubuntu 15.10 or whatever). Also, there's a huge quantity of existing linux-tagged wheels out there that definitely don't meet the ABI.
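(An illustrative aside, not from the thread: one way to see why a default build usually misses the target ABI is to dump the versioned glibc symbols an extension module actually requires. The module path is a placeholder:)

$ objdump -T build/lib.linux-x86_64-2.7/_myext.so | grep -o 'GLIBC_[0-9.]*' | sort -u
# Any version newer than the manylinux1 baseline (glibc 2.5, i.e. CentOS 5)
# means the wheel will fail to load on older distros.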
In terms of the concerns regarding the age of gcc needed to target CentOS 5.11, it would be good to know just what nominating CentOS 6.x as the baseline ABI instead would buy us - CentOS 5 is going on 9 years old (released 2007) and stopped receiving full updates back in 2014 [2], while RHEL/CentOS 6 is just over 5 years old and has another year of full updates left. The CentOS 6 ABI should still be old enough to be compatible with the Debian 6 ABI (current stable is Debian 8), as well as the Ubuntu 12.04 LTS ABI (Ubuntu 16.04 LTS is due out in a few months).
AFAICT everyone I've found publishing info on distributing generic Linux binaries is currently using CentOS 5 as their target -- not just manylinux1, but also the Holy Build Box / "travelling ruby" folks, Firefox (not sure exactly what they're using but it seems to be <= CentOS 5), etc. I guess bumping up to CentOS 6 would be trivial enough -- just keep the same library list and bump up the minimum version requirements for glibc / libgcc / libstdc++ -- but I think we'd be pioneers here, and that's something we might not want to be at the same time that we're first dipping our toes into the water :-).

GCC 4.8 was released in 2013; it's not actually terribly old. It has decent C++11 support, and it's sufficient to compile things like LLVM and Qt and Firefox. (Compare to Windows, where anyone building py27 wheels gets to use MSVC 2008, which doesn't even know C99.) So I'd be inclined to stick with CentOS 5 for now, and gather some experience while waiting to see how far it can go before it breaks.

The one thing that does give me pause is that whenever we *do* decide to switch to manylinux2, then it's going to be a big drag to wait for a whole pip release/upgrade cycle -- Debian unstable is still shipping pip 1.5.6 (released May 2014) :-(. And when it comes to wheel compatibility tags and pip upgrades, the UX is really terrible: if pip is too old to recognize the provided wheels, then it doesn't tell the user "hey, you should upgrade me" or otherwise provide some hint that there might be a trivial solution to this problem; instead it just silently downloads the source and attempts to build it (and quite often blows up after pegging the CPU for 30 minutes or something).

I guess one way to square this circle would be for pip to have some logic that checks for manylinux[0-9]+ platform tags, and if it sees a wheel like this with a platform tag that post-dates its own release, AND the only other option is to build from source, then it tells the user "hey, there's an *excellent* chance that there's a new pip that could give you a wheel right now -- what do you want me to do?". Or we could even make it fail-open rather than fail-closed, like: If pip knows about manylinux 1..n, then given wheels for manylinux (n-1), n, and (n+1), it should have the preference ordering: n > (n - 1) > (n + 1), i.e., for known platform tags we prefer newer platform tags to older ones; for unknown platform tags from the future, we optimistically assume that they'll probably work (since the whole idea of the manylinux tags is that they will work almost everywhere), but we prefer known tags to unknown tags, so that we only install the manylinux(n+1) wheel if nothing else is available. (And print some message saying what we're doing.)

...well, or maybe just erroring out when it sees the future and asking the user to help would be good enough :-). This would impose the requirement going forward that we'd have to wait for a pip release with support for manylinuxN before allowing manylinuxN onto PyPI, but that doesn't seem too onerous.

-n

-- Nathaniel J. Smith -- https://vorpus.org

On Jan 24, 2016, at 5:32 PM, Nathaniel Smith <njs@pobox.com> wrote:
The one thing that does give me pause is that whenever we *do* decide to switch to manylinux2, then it's going to be a big drag to wait for a whole pip release/upgrade cycle -- Debian unstable is still shipping pip 1.5.6 (released May 2014) :-(. And when it comes to wheel compatibility tags and pip upgrades, the UX is really terrible: if pip is too old to recognize the provided wheels, then it doesn't tell the user "hey, you should upgrade me" or otherwise provide some hint that there might be a trivial solution to this problem; instead it just silently downloads the source and attempts to build it (and quite often blows up after pegging the CPU for 30 minutes or something).
Ever since 6.0 pip now implicitly warns users on every invocation if there is a newer version of pip available on PyPI (assuming they are not in some constrained environment that cannot access pypi.python.org). I fully expect Debian to upgrade to pip 8.0 in the near future in sid; it obviously won't make it back to Jessie though except via backports. This is mostly an outlier I think; pip 6+ really ramped up the number of things that pip bundles, and with them getting bundled with pip it created a bit of a contentious situation with Debian. I believe that the excellent Barry Warsaw is close to having that ironed out and pip 8.0 ready to land, and future upgrades should be pretty painless. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On 25 January 2016 at 08:32, Nathaniel Smith <njs@pobox.com> wrote:
On Sun, Jan 24, 2016 at 4:08 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
If the aim is to bring Linux wheel support in line with Windows and Mac OS X, then rather than defining a *new* compatibility tag (which would require new pip clients to process), perhaps we could instead adopt a similarly loose policy on what the existing generic "linux" tag means as we have for Windows and Mac OS X: it could just mean wheel files that are binary compatible with the Python binaries in the "manylinux" build environment. The difference would then just be that the target Linux ABI would be defined by PyPA and the manylinux developers, rather than by python-dev.
It's an option I guess, though Donald's message below makes it rather less attractive :-).
Yeah, I didn't know about the client side block in older versions of pip, so we may as well stick with the custom tag rather than trying to use the existing one.
The other thing is that as compared to Windows or OS X, it requires much more attention to actually meet the target Linux ABI -- on Windows or OS X an out-of-the-box build for a simple project will more-often-than-not legitimately meet the ABI, and if you can make a package that also works on your office-mate's computer then it will probably work everywhere. On Linux, the way glibc versioning works means that just doing the obvious 'pip wheel' call will basically never give you a wheel that meets the ABI, and testing on your office-mate's computer proves nothing (except that you're both running Ubuntu 15.10 or whatever).
That does raise a question for the PEP: should it be proposing changes to the default behaviour of "pip wheel" and "bdist_wheel" on Linux? Even if the answer is "No", then the PEP should probably explain why not.
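(For illustration only, not from the thread: as things stand, a builder has to opt in to the tag explicitly rather than getting it from a default build. Roughly, with "mypkg" as a placeholder:)

$ python setup.py bdist_wheel                                  # today: produces a ...-linux_x86_64.whl by default
$ python setup.py bdist_wheel --plat-name manylinux1_x86_64    # opting in to the manylinux1 tag by hand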
In terms of the concerns regarding the age of gcc needed to target CentOS 5.11, it would be good to know just what nominating CentOS 6.x as the baseline ABI instead would buy us - CentOS 5 is going on 9 years old (released 2007) and stopped receiving full updates back in 2014 [2], while RHEL/CentOS 6 is just over 5 years old and has another year of full updates left. The CentOS 6 ABI should still be old enough to be compatible with the Debian 6 ABI (current stable is Debian 8), as well as the Ubuntu 12.04 LTS ABI (Ubuntu 16.04 LTS is due out in a few months).
AFAICT everyone I've found publishing info on distributing generic Linux binaries is currently using CentOS 5 as their target -- not just manylinux1, but also the Holy Build Box / "travelling ruby" folks, Firefox (not sure exactly what they're using but it seems to be <= CentOS 5), etc. I guess bumping up to CentOS 6 would be trivial enough -- just keep the same library list and bump up the minimum version requirements for glibc / libgcc / libstdc++ -- but I think we'd be pioneers here, and that's something we might not want to be at the same time that we're first dipping our toes into the water :-).
GCC 4.8 was released in 2013; it's not actually terribly old. It has decent C++11 support, and it's sufficient to compile things like LLVM and Qt and Firefox. (Compare to Windows, where anyone building py27 wheels gets to use MSVC 2008, which doesn't even know C99.) So I'd be inclined to stick with CentOS 5 for now, and gather some experience while waiting to see how far it can go before it breaks.
That argument makes sense to me, so this is possibly worth another note as a "deferred to a later manylinux update" topic, but otherwise doesn't affect the PEP.
The one thing that does give me pause is that whenever we *do* decide to switch to manylinux2, then it's going to be a big drag to wait for a whole pip release/upgrade cycle -- Debian unstable is still shipping pip 1.5.6 (released May 2014) :-(. And when it comes to wheel compatibility tags and pip upgrades, the UX is really terrible: if pip is too old to recognize the provided wheels, then it doesn't tell the user "hey, you should upgrade me" or otherwise provide some hint that there might be a trivial solution to this problem; instead it just silently downloads the source and attempts to build it (and quite often blows up after pegging the CPU for 30 minutes or something).
I guess one way to square this circle would be for pip to have some logic that checks for manylinux[0-9]+ platform tags, and if it sees a wheel like this with a platform tag that post-dates its own release, AND the only other option is to build from source, then it tells the user "hey, there's an *excellent* chance that there's a new pip that could give you a wheel right now -- what do you want me to do?".
As Donald noted, pip already emits a "Your pip is old" warning at startup when new versions are available from PyPI.
Or we could even make it fail-open rather than fail-closed, like:
If pip knows about manylinux 1..n, then given wheels for manylinux (n-1), n, and (n+1), it should have the preference ordering: n > (n - 1) > (n + 1), i.e., for known platform tags we prefer newer platform tags to older ones; for unknown platform tags from the future, we optimistically assume that they'll probably work (since the whole idea of the manylinux tags is that they will work almost everywhere), but we prefer known tags to unknown tags, so that we only install the manylinux(n+1) wheel if nothing else is available. (And print some message saying what we're doing.)
...well, or maybe just erroring out when it sees the future and asking the user to help would be good enough :-). This would impose the requirement going forward that we'd have to wait for a pip release with support for manylinuxN before allowing manylinuxN onto PyPI, but that doesn't seem too onerous.
For Windows and Mac OS X, dropping support for old platforms is effectively coupled to the CPython release cycle - a Python 3.5 wheel is less likely to work on Windows XP than a 3.3 one, for example, since 3.5 doesn't support XP, while 3.3 does. We could potentially consider something similar for manylinux: scoping the evolution of the definition by corresponding CPython release, rather than giving it its own version number. So the first release of manylinux would define the ABI for Linux wheels for 3.5 and earlier (including the 2.x series), and then 3.6 would be the next chance to consider revising it (e.g. by bumping up the base ABI from CentOS 5 to CentOS 6). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Jan 24, 2016 at 8:37 PM, Nick Coghlan <ncoghlan@gmail.com> wrote: [...]
...well, or maybe just erroring out when it sees the future and asking the user to help would be good enough :-). This would impose the requirement going forward that we'd have to wait for a pip release with support for manylinuxN before allowing manylinuxN onto PyPI, but that doesn't seem too onerous.
For Windows and Mac OS X, dropping support for old platforms is effectively coupled to the CPython release cycle - a Python 3.5 wheel is less likely to work on Windows XP than a 3.3 one, for example, since 3.5 doesn't support XP, while 3.3 does.
We could potentially consider something similar for manylinux: scoping the evolution of the definition by corresponding CPython release, rather than giving it its own version number.
So the first release of manylinux would define the ABI for Linux wheels for 3.5 and earlier (including the 2.x series), and then 3.6 would be the next chance to consider revising it (e.g. by bumping up the base ABI from CentOS 5 to CentOS 6).
The problem with this is that python 2.7 is going to be supported and widely used until well past the EOL of CentOS 5, and maybe even past the EOL of CentOS 6 (in 2020). On Linux, unlike Windows, the whole system ABI underneath Python evolves in a backwards-compatible way, so there's no anchor like the MSVC CRT that's tying us to old ABIs. (And on OS X, the OS X version number is encoded as part of the platform tag, similar to the proposal for manylinux1/2/...) -n -- Nathaniel J. Smith -- https://vorpus.org

(e.g. by bumping up the base ABI from CentOS 5 to CentOS 6).
The problem with this is that python 2.7 is going to be supported and widely used until well past the EOL of CentOS 5, and maybe even past the EOL of CentOS 6
Given that we're starting now ( not a year or two ago) and it'll take a while for it to really catch on, we should go CentOS 6 ( or equivalent ) now? CentOS5 was released in 2007! That is a pretty long time in computing. Just a thought, we'll be locked into it for a while, yes? -CHB
(in 2020). On Linux, unlike Windows, the whole system ABI underneath Python evolves in a backwards-compatible way, so there's no anchor like the MSVC CRT that's tying us to old ABIs.
(And on OS X, the OS X version number is encoded as part of the platform tag, similar to the proposal for manylinux1/2/...)
-n
-- Nathaniel J. Smith -- https://vorpus.org

On Mon, Jan 25, 2016 at 10:29 PM, Chris Barker - NOAA Federal < chris.barker@noaa.gov> wrote:
Given that we're starting now ( not a year or two ago) and it'll take a while for it to really catch on, we should go CentOS 6 ( or equivalent ) now?
CentOS5 was released in 2007! That is a pretty long time in computing.
I understand the concern, but I think we should follow the lead of the other projects that have been doing portable linux binaries (holy build box, traveling ruby, portable-pypy, firefox, enthought, continuum) for some time, all based on CentOS 5. At some point things like C++17 support will be important and I agree that we'll need to update the base spec, but in the meantime, I don't see this as a problem where we should be the first mover. -Robert

On 26 January 2016 at 16:49, Robert T. McGibbon <rmcgibbo@gmail.com> wrote:
On Mon, Jan 25, 2016 at 10:29 PM, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
Given that we're starting now ( not a year or two ago) and it'll take a while for it to really catch on, we should go CentOS 6 ( or equivalent ) now?
CentOS5 was released in 2007! That is a pretty long time in computing.
I understand the concern, but I think we should follow the lead of the other projects that have been doing portable linux binaries (holy build box, traveling ruby, portable-pypy, firefox, enthought, continuum) for some time, all based on CentOS 5. At some point things like C++17 support will be important and I agree that we'll need to update the base spec, but in the meantime, I don't see this as a problem where we should be the first mover.
I was discussing this with some of the folks that are responsible for defining the RHEL (and hence CentOS) ABI, and they pointed out that the main near term advantage of targeting CentOS 6 over CentOS 5 is that it means that it would be possible to analyse binaries built that way with the libabigail tools, including abicompat: https://sourceware.org/libabigail/manual/abicompat.html If I understand the problem correctly, the CentOS 5 gcc toolchain is old enough that it simply doesn't emit the info libabigail needs in order to work. So if we went down the CentOS 6+ path, it should make it possible to take binaries built on the reference environment and use abicompat to check them against libraries from another distro, and vice-versa. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
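(Illustrative sketch, not from the thread: per the linked manual, abicompat checks whether an application built against one version of a library will still work against another. For a Python extension module the "application" would be the extension itself; the paths below are placeholders and the exact invocation may differ between libabigail versions:)

$ abicompat mypkg/_ext.so /build-env/usr/lib64/libfoo.so /usr/lib64/libfoo.so
# i.e. "_ext.so was built against the first libfoo; is it still ABI-compatible
# with the libfoo actually installed on this distro?"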

On Tue, 26 Jan 2016 20:36:26 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
If I understand the problem correctly, the CentOS 5 gcc toolchain is old enough that it simply doesn't emit the info libabigail needs in order to work.
If you build on CentOS 5, you certainly want to use the RH developer toolset 2 which gives you a modern-ish toolchain (gcc 4.8.2, IIRC). Regards Antoine.

Isn't the entire point of using CentOS 5 to use an ancient toolchain? On 1/26/2016 06:44, Antoine Pitrou wrote:
On Tue, 26 Jan 2016 20:36:26 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
If I understand the problem correctly, the CentOS 5 gcc toolchain is old enough that it simply doesn't emit the info libabigail needs in order to work. If you build on CentOS 5, you certainly want to use the RH developer toolset 2 which gives you a modern-ish toolchain (gcc 4.8.2, IIRC).
Regards
Antoine.

On Tue, 26 Jan 2016 06:50:15 -0500 Alexander Walters <tritium-list@sdamon.com> wrote:
Isnt the entire point of using centos 5 to use an ancient toolchain?
No, the point is to link against an ancient glibc. The toolchain can be modern (and actually has to if you want to compile e.g. C++11 code). Regards Antoine.

On Tue, Jan 26, 2016 at 11:44 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Tue, 26 Jan 2016 20:36:26 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
If I understand the problem correctly, the CentOS 5 gcc toolchain is old enough that it simply doesn't emit the info libabigail needs in order to work.
If you build on CentOS 5, you certainly want to use the RH developer toolset 2 which gives you a modern-ish toolchain (gcc 4.8.2, IIRC).
Indeed, C++11 is the main reason why I added devtoolset-2 to the `Dockerfile` that was part of what originated the manylinux effort. David
Regards
Antoine.
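(A minimal sketch, added for illustration: this is roughly how the devtoolset-2 toolchain mentioned above is typically enabled on a CentOS 5 build box, assuming the devtoolset-2 packages are already installed:)

$ scl enable devtoolset-2 bash     # or: source /opt/rh/devtoolset-2/enable
$ gcc --version                    # should now report gcc 4.8.x instead of the system 4.1.x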

On 26 January 2016 at 21:44, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Tue, 26 Jan 2016 20:36:26 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
If I understand the problem correctly, the CentOS 5 gcc toolchain is old enough that it simply doesn't emit the info libabigail needs in order to work.
If you build on CentOS 5, you certainly want to use the RH developer toolset 2 which gives you a modern-ish toolchain (gcc 4.8.2, IIRC).
Yeah, that's the part I haven't clarified yet - whether it was just the base CentOS toolchain that was too old, or devtoolset-2. If abicompat et al all work with devtoolset-2, then it definitely makes sense to stick with the CentOS 5 baseline for now. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 26 January 2016 at 23:17, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 26 January 2016 at 21:44, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Tue, 26 Jan 2016 20:36:26 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
If I understand the problem correctly, the CentOS 5 gcc toolchain is old enough that it simply doesn't emit the info libabigail needs in order to work.
If you build on CentOS 5, you certainly want to use the RH developer toolset 2 which gives you a modern-ish toolchain (gcc 4.8.2, IIRC).
Yeah, that's the part I haven't clarified yet - whether it was just the base CentOS toolchain that was too old, or devtoolset-2.
If abicompat et al all work with devtoolset-2, then it definitely makes sense to stick with the CentOS 5 baseline for now.
I followed this up with the ABI folks, and the problem is that the elfutils in even DTS 2 is too old to support building libabigail, and later versions of the developer toolset (3 & 4) don't support being run on CentOS 5. However, even if the build system is based on CentOS 5, *compatibility scanners* like auditwheel can potentially be run on something newer, so I've asked if it would work to use the older toolchain to build the binaries, but then run the relevant ABI compatibility checks on the almost-certainly-newer target distro. If that's the case, then folks would be able to run a *static* abicompat check over a virtualenv including pre-built extensions from PyPI and be alerted to ABI compatibility problems, rather than getting hard-to-debug segfaults at runtime. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 27 January 2016 at 22:54, Nick Coghlan <ncoghlan@gmail.com> wrote:
I followed this up with the ABI folks, and the problem is that the elfutils in even DTS 2 is too old to support building libabigail, and later versions of the developer toolset (3 & 4) don't support being run on CentOS 5.
However, even if the build system is based on CentOS 5, *compatibility scanners* like auditwheel can potentially be run on something newer, so I've asked if it would work to use the older toolchain to build the binaries, but then run the relevant ABI compatibility checks on the almost-certainly-newer target distro.
If that's the case, then folks would be able to run a *static* abicompat check over a virtualenv including pre-built extensions from PyPI and be alerted to ABI compatibility problems, rather than getting hard-to-debug segfaults at runtime.
Good news! The toolchain that matters for libabigail based compatibility scans is the one used to run the scan, *not* the one used to build the binaries. This means that even though libabigail itself can't be used on CentOS 5, it *can* be used to check if specific binaries built on CentOS 5 are ABI compatible with your current distro, which is what we actually want people to be able to do. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mon, Jan 25, 2016 at 10:49 PM, Robert T. McGibbon <rmcgibbo@gmail.com> wrote:
On Mon, Jan 25, 2016 at 10:29 PM, Chris Barker - NOAA Federal < chris.barker@noaa.gov> wrote:
Given that we're starting now ( not a year or two ago) and it'll take a while for it to really catch on, we should go CentOS 6 ( or equivalent ) now?
CentOS5 was released in 2007! That is a pretty long time in computing.
I understand the concern, but I think we should follow the lead of the other projects that have been doing portable linux binaries (holy build box, traveling ruby, portable-pypy, firefox, enthought, continuum) for some time, all based on CentOS 5.
That's the point -- they have been doing it for some time -- some time ago, you really would want a version that old. The question is -- how many systems are there in the wild now that are older than CentOS 6? I have no idea how to even find out. But if we're starting something new, why start with what was appropriate 2+ years ago? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Tue, Jan 26, 2016 at 11:47 AM, Chris Barker <chris.barker@noaa.gov> wrote:
On Mon, Jan 25, 2016 at 10:49 PM, Robert T. McGibbon <rmcgibbo@gmail.com> wrote:
On Mon, Jan 25, 2016 at 10:29 PM, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
Given that we're starting now ( not a year or two ago) and it'll take a while for it to really catch on, we should go CentOS 6 ( or equivalent ) now?
CentOS5 was released in 2007! That is a pretty long time in computing.
I understand the concern, but I think we should follow the lead of the other projects that have been doing portable linux binaries (holy build box, traveling ruby, portable-pypy, firefox, enthought, continuum) for some time, all based on CentOS 5.
That's the point -- they have been doing it for some time -- some time ago, you really would want a version that old.
The question is -- how many systems are there in the wild now that are older than CentOS 6? I have no idea how to even find out. But if we're starting something new, why start with what was appropriate 2+ years ago?
Well, the people who know what they're doing are still recommending CentOS 5 today, and we don't know what we're doing :-). Transitioning to a CentOS6-based manylinux2 shouldn't be a huge problem -- basically it just requires a pip release + a tweak to pypi to allow them, and then projects will probably want to provide manylinux1 and manylinux2 wheels alongside each other for 6 months or a year to give people a chance to upgrade their pip. -n -- Nathaniel J. Smith -- https://vorpus.org

On Tue, Jan 26, 2016 at 3:56 PM, Nathaniel Smith <njs@pobox.com> wrote:
Well, the people who know what they're doing are still recommending CentOS 5 today, and we don't know what we're doing :-).
well, yes, there is that. I sure don't. But the threshold for changing is higher than for starting fresh.
Transitioning to a CentOS6-based manylinux2 shouldn't be a huge problem --
would CentOS5-based wheels run just fine on a CentOS6-based system? Alongside CentOS6-based wheels? If so, then I guess it's no biggie -- the safe bet is better. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Tue, Jan 26, 2016 at 4:02 PM, Chris Barker <chris.barker@noaa.gov> wrote:
On Tue, Jan 26, 2016 at 3:56 PM, Nathaniel Smith <njs@pobox.com> wrote:
Well, the people who know what they're doing are still recommending CentOS 5 today, and we don't know what we're doing :-).
well, yes, there is that. I sure don't. But the threshold for changing is higher than for starting fresh.
Transitioning to a CentOS6-based manylinux2 shouldn't be a huge problem --
would CentOS5-based wheels run just fine on a CentOS6-based system? Alongside CentOS6-based wheels?
Yes, the whole idea is that CentOS5-based wheels will run basically everywhere :-)
If so, then I guess it's no biggie -- the safe bet is better.
-n -- Nathaniel J. Smith -- https://vorpus.org

On Wed, Jan 27, 2016 at 12:02 AM, Chris Barker <chris.barker@noaa.gov> wrote:
On Tue, Jan 26, 2016 at 3:56 PM, Nathaniel Smith <njs@pobox.com> wrote:
Well, the people who know what they're doing are still recommending CentOS 5 today, and we don't know what we're doing :-).
well, yes, there is that. I sure don't. But the threshold for changing is higher than for starting fresh.
Transitioning to a CentOS6-based manylinux2 shouldn't be a huge problem --
would CentOS5-based wheels run just fine on a CentOS6-based system? Alongside CentOS6-based wheels?
If so, then I guess it's no biggie -- the safe bet is better.
I will make sure to let the manylinux effort know when we decide to move to Centos6 as the base system. David
-CHB
--
Christopher Barker, Ph.D. Oceanographer
Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov

On Wed, Jan 27, 2016 at 1:37 AM, David Cournapeau <cournape@gmail.com> wrote:
I will make sure to let the manylinux effort know when we decide to move to Centos6 as the base system.
Thanks -- do you have any idea how many of your customers are running systems that old? i.e. have you stuck with CentOS5 because of actual customer demand as opposed to uncertainty or inertia? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Wed, Jan 27, 2016 at 4:37 PM, Chris Barker <chris.barker@noaa.gov> wrote:
On Wed, Jan 27, 2016 at 1:37 AM, David Cournapeau <cournape@gmail.com> wrote:
I will make sure to let the manylinux effort know when we decide to move to Centos6 as the base system.
Thanks -- do you have any idea how many of your customers are running systems that old?
i.e. have you stuck with CentOS5 because of actual customer demand as opposed to uncertainty or inertia?
As mentioned by others before, Centos5 is a good way to ensure we link against an old glibc (and a few other key libraries, mostly X11-related). That's really the main thing, as in general you want to depend on the system as little as possible when deploying binaries on Linux.
Centos 6 uses glibc 2.12, which is newer than the Debian 6 and Ubuntu 10.04 versions. Even if Debian 6 is old, we still see it on systems, and Ubuntu 10.04 LTS is definitely still out there in companies, even if officially unsupported. And unsupported old versions of OSes are used much more often than you may think in enterprise (I can't give names, but companies anyone has heard of still rely a lot on Windows XP).
So now, one could argue that it is not the community's job to tackle old OS, and they would be right, but:
1. Updating to e.g. Centos 6 does not help that much, as the basic components (compiler toolchain) are still old.
2. Updating the toolchain even on centos 5 is quite easy thanks to the devtoolset effort.
The main argument against using centos 5 is GUI-related components, as the old fontconfig/glib (the GTK one, not Gnu libc) are a problem. But those are a tiny minority of what people do with python nowadays, and they require a lot of work to get right.
David
-CHB
--
Christopher Barker, Ph.D. Oceanographer
Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov
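(For anyone curious where a given box falls on that glibc spectrum, here is a minimal sketch along the same lines as the detection snippet in the PEP draft; it assumes a glibc-based system and simply asks the already-loaded C library for its version:)

import ctypes

# Ask the C library loaded into this process for its version string.
# gnu_get_libc_version() only exists in glibc, so a musl/bionic system will
# land in the AttributeError branch instead.
libc = ctypes.CDLL(None)
try:
    gnu_get_libc_version = libc.gnu_get_libc_version
except AttributeError:
    print("not a glibc-based system")
else:
    gnu_get_libc_version.restype = ctypes.c_char_p
    version = gnu_get_libc_version()
    if not isinstance(version, str):  # bytes on Python 3
        version = version.decode("ascii")
    print(version)  # e.g. "2.5" on CentOS 5, "2.12" on CentOS 6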

On Jan 27, 2016, at 12:00 PM, David Cournapeau <cournape@gmail.com> wrote:
So now, one could argue that it is not the community's job to tackle old OS, and they would be right, but:
We can make a data driven decision here. Here is the top 100 *nix OSs that are downloading files from PyPI using a version of pip >= 6.0. Any version of pip older than that and we don’t have insight into what they are (we can get kernel version if they’re using pip >= 1.4) and any installer other than that we don’t get any insight into either. One problem of course is deciding how representative only people who are using pip >= 6.0 is, though since we can’t get manylinux support into already released versions of pip it may be pretty representative of people who will use this feature (unless this feature causes people to upgrade their pip when they wouldn’t otherwise). Anyways, here’s the data: https://gist.github.com/dstufft/e1b1fbebb3482362198f It doesn’t matter to me if we use CentOS5 or CentOS6 as our base, but having some information to inform our choices is never a bad thing! ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Wed, Jan 27, 2016 at 5:18 PM, Donald Stufft <donald@stufft.io> wrote:
On Jan 27, 2016, at 12:00 PM, David Cournapeau <cournape@gmail.com> wrote:
So now, one could argue that it is not the community's job to tackle old OS, and they would be right, but:
We can make a data driven decision here.
I like data-driven decisions as well :) Can you give an approximate total download count, to convert this into a percentage? David
Here is the top 100 *nix OSs that are downloading files from PyPI using a version of pip >= 6.0. Any version of pip older than that and we don’t have insight into what they are (we can get kernel version if they’re using pip >= 1.4) and any installer other than that we don’t get any insight into either.
One problem of course is deciding how representative only people who are using pip >= 6.0 is, though since we can’t get manylinux support into already released versions of pip it may be pretty representative of people who will use this feature (unless this feature causes people to upgrade their pip when they wouldn’t otherwise).
Anyways, here’s the data: https://gist.github.com/dstufft/e1b1fbebb3482362198f
It doesn’t matter to me if we use CentOS5 or CentOS6 as our base, but having some information to inform our choices is never a bad thing!
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Jan 27, 2016, at 12:29 PM, David Cournapeau <cournape@gmail.com> wrote:
On Wed, Jan 27, 2016 at 5:18 PM, Donald Stufft <donald@stufft.io> wrote:
On Jan 27, 2016, at 12:00 PM, David Cournapeau <cournape@gmail.com> wrote:
So now, one could argue that it is not the community's job to tackle old OS, and they would be right, but:
We can make a data driven decision here.
I like data-driven decisions as well :) Can you give an approximate total download count, to convert this into a percentage?
David
Here is the top 100 *nix OSs that are downloading files from PyPI using a version of pip >= 6.0. Any version of pip older than that and we don’t have insight into what they are (we can get kernel version if they’re using pip >= 1.4) and any installer other than that we don’t get any insight into either.
One problem of course is deciding how representative only people who are using pip >= 6.0 is, though since we can’t get manylinux support into already released versions of pip it may be pretty representative of people who will use this feature (unless this feature causes people to upgrade their pip when they wouldn’t otherwise).
Anyways, here’s the data: https://gist.github.com/dstufft/e1b1fbebb3482362198f
It doesn’t matter to me if we use CentOS5 or CentOS6 as our base, but having some information to inform our choices is never a bad thing!
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Across all of PyPI? Total Downloads are 119890375 for roughly the same time period (data is continuously streaming in), for only Linux on pip >= 6.0 it’s 40210457. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
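(A rough back-of-the-envelope conversion of the figures quoted above, with the caveat that the two counts only approximately cover the same period:)

# Figures quoted above: total PyPI downloads vs. Linux downloads via pip >= 6.0
linux_pip6 = 40210457
total = 119890375
print("{0:.1%}".format(linux_pip6 / float(total)))  # roughly 33.5%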

On Jan 27, 2016 09:18, "Donald Stufft" <donald@stufft.io> wrote:
On Jan 27, 2016, at 12:00 PM, David Cournapeau <cournape@gmail.com>
wrote:
So now, one could argue that it is not the community's job to tackle old
OS, and they would be right, but:
We can make a data driven decision here.
Here is the top 100 *nix OSs that are downloading files from PyPI using a version of pip >= 6.0. Any version of pip older than that and we don’t have insight into what they are (we can get kernel version if they’re using pip >= 1.4) and any installer other than that we don’t get any insight into either.
One problem of course is deciding how representative only people who are using pip >= 6.0 is, though since we can’t get manylinux support into already released versions of pip it may be pretty representative of people who will use this feature (unless this feature causes people to upgrade their pip when they wouldn’t otherwise).
Is the kernel version table you mentioned trivial to get? I bet it's very closely correlated with glibc version.
Anyways, here’s the data: https://gist.github.com/dstufft/e1b1fbebb3482362198f
One short summary: "wow, there are a ton of Debian 6 users out there". Esp. considering that even Debian unstable isn't shipping pip 6.0 yet. (Or who knows, I guess that could just be one very heavy user.)
It doesn’t matter to me if we use CentOS5 or CentOS6 as our base, but having some information to inform our choices is never a bad thing!
Indeed! -n

On Jan 27, 2016, at 12:37 PM, Nathaniel Smith <njs@vorpus.org> wrote:
Is the kernel version table you mentioned trivial to get? I bet it's very closely correlated with glibc version.
https://gist.github.com/dstufft/1eaa826361ac5b755f17 ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

Tl;dr: Looks like the oldest kernel that makes the top 100 list is 2.6.32, which is used in both RHEL6 and Debian 6.
On Jan 27, 2016 9:50 AM, "Donald Stufft" <donald@stufft.io> wrote:
On Jan 27, 2016, at 12:37 PM, Nathaniel Smith <njs@vorpus.org> wrote:
Is the kernel version table you mentioned trivial to get? I bet it's very closely correlated with glibc version.
https://gist.github.com/dstufft/1eaa826361ac5b755f17
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Jan 27, 2016 09:00, "David Cournapeau" <cournape@gmail.com> wrote:
The main argument against using centos 5 is GUI-related components, as [...] the old fontconfig/glib (the GTK one, not Gnu libc) are a problem. But those are a tiny minority of what people do with python nowadays, and they require a lot of work to get right.
This is the part that intrigued me :-). Can you elaborate at all on what kind of problems you've encountered with fontconfig and glib?
(Well, I can guess at one piece of excitement maybe: that glib is not really vendorable because a process can have only one main loop, and that lives in glib these days for both gtk and Qt, so any packages doing any GUI stuff are required to agree to use the same glib version.) -n

On Wed, Jan 27, 2016 at 5:43 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Jan 27, 2016 09:00, "David Cournapeau" <cournape@gmail.com> wrote:
The main argument against using centos 5 is GUI-related components, as
[...] the old fontconfig/glib (the GTK one, not Gnu libc) are a problem. But those are a tiny minority of what people do with python nowadays, and they require a lot of work to get right.
This is the part that intrigued me :-). Can you elaborate at all on what kind of problems you've encountered with fontconfig and glib?
(Well, I can guess at one piece of excitement maybe: that glib is not really vendorable because a process can have only one main loop, and that lives in glib these days for both gtk and Qt, so any packages doing any GUI stuff are required to agree to use the same glib version.)
So vendoring glib is possible (we actually do it ATM, though we may revert that). The problem is that when you load, say, PyQt with Qt linked against your vendored glib, you get into issues if that glib is higher than the glib used in the system (that happens through pango IIRC). So if you want to stay compatible, you need to build an old glib, which is what you were trying to avoid in the first place. There is no good solution, really. David
-n

On Sat, Jan 23, 2016 at 6:19 PM, Chris Barker <chris.barker@noaa.gov> wrote:
OK,
I'll try to stop being emotional here :-)
2016-01-22 3:47 GMT+01:00 Chris Barker - NOAA Federal <chris.barker@noaa.gov>:
I'm skeptical because I tried to do that for years for OS-X and it was just too much to do. And the infrastructure was there.
My point is that once we have clearly defined best-practices for packaging and convenient tools to build the packages automatically and test that they work as expected (e.g. free hosted CI that support running an old centos-based docker container), I am rather confident that the community will do the work.
OK -- here is the emotional part -- I worked for years to try to get support to:
"clearly defined best-practices for packaging and convenient tools to build the packages automatically"
Primarily for OS-X. I got zero support -- nada -- nothing. Really. A handful of people did their own thing to support the community, but no cooperation or standards -- each package was built with a lot of hand work, each in its own way. So when I found the conda community working on common tools and methods, it was very refreshing.
I hear that :-/. But it feels like there is some momentum building these days -- before the only people who cared about the scientific stack + wheels were people doing Mac development, so there are nice Mac docs like: https://github.com/MacPython/wiki/wiki/Wheel-building and nothing more general. But now we've got Linux and Windows both ramping up, to the point that I'm having trouble juggling stuff across the different mailing lists and trying to keep people in the loop. So... obviously the solution is to create yet another mailing list :-). Maybe we need wheel-builders-sig? Their mandate would be to hash out things like how to build binary-libraries-wrapped-up-in-wheels, share knowledge about the minutiae of linker behavior on different platforms (oh god there's so much minutiae), maintain tools like delocate and auditwheel (and whatever the equivalent will be for windows... and do we really need 3 different tools?), collect knowledge from where it's scattered now and put it into the guide at packaging.python.org [1], etc.? It seems a bit outside distutils-sig's focus in practice, since this would all be about third-party tools and individual package authors as opposed to distutils-sig's focus on writing interoperability PEPs and maintaining the core python.org-affiliated infrastructure like PyPI / setuptools / pip. -n [1] currently the "official guide" to building binary wheels is this :-): https://packaging.python.org/en/latest/extensions/#publishing-binary-extensi... -- Nathaniel J. Smith -- https://vorpus.org

On Jan 24, 2016, at 7:21 PM, Nathaniel Smith <njs@pobox.com> wrote:
Maybe we need wheel-builders-sig? Their mandate would be to hash out things like how to build binary-libraries-wrapped-up-in-wheels, share knowledge about the minutiae of linker behavior on different platforms (oh god there's so much minutiae), maintain tools like delocate and auditwheel (and whatever the equivalent will be for windows... and do we really need 3 different tools?), collect knowledge from where it's scattered now and put it into the guide at packaging.python.org [1], etc.? It seems a bit outside distutils-sig's focus in practice, since this would all be about third-party tools and individual package authors as opposed to distutils-sig's focus on writing interoperability PEPs and maintaining the core python.org-affiliated infrastructure like PyPI / setuptools / pip.
I’m not about to tell someone they *have* to use distutils-sig, but I think at this point it’s perfectly reasonable to discuss things of this nature here as well. distutils-sig would really be more accurate if it was named packaging-sig, but historical reasons ;) ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On 25 January 2016 at 10:23, Donald Stufft <donald@stufft.io> wrote:
On Jan 24, 2016, at 7:21 PM, Nathaniel Smith <njs@pobox.com> wrote:
Maybe we need wheel-builders-sig? Their mandate would be to hash out things like how to build binary-libraries-wrapped-up-in-wheels, share knowledge about the minutiae of linker behavior on different platforms (oh god there's so much minutiae), maintain tools like delocate and auditwheel (and whatever the equivalent will be for windows... and do we really need 3 different tools?), collect knowledge from where it's scattered now and put it into the guide at packaging.python.org [1], etc.? It seems a bit outside distutils-sig's focus in practice, since this would all be about third-party tools and individual package authors as opposed to distutils-sig's focus on writing interoperability PEPs and maintaining the core python.org-affiliated infrastructure like PyPI / setuptools / pip.
I’m not about to tell someone they *have* to use distutils-sig, but I think at this point it’s perfectly reasonable to discuss things of this nature here as well.
I agree with this - while I wouldn't call any aspect of software distribution easy, binary compatibility is one of the hardest, and we want *more* knowledge sharing in that regard, rather than continue to maintain the divide between "folks who care about binary compatibility" and "folks who wish everyone would just stop creating extension modules already because pure Python modules are so much easier to deal with" :) Keeping the discussions centralised also creates more opportunities for serendipitous collaboration, where folks notice that they can potentially help out with what someone else is working on.
distutils-sig would really be more accurate if it was named packaging-sig, but historical reasons ;)
I kind of look forward to a distant future where we *do* decide to rename it, as that will mean all of the more pressing concerns have already been knocked off the todo list :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Jan 24, 2016 at 8:17 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 25 January 2016 at 10:23, Donald Stufft <donald@stufft.io> wrote:
On Jan 24, 2016, at 7:21 PM, Nathaniel Smith <njs@pobox.com> wrote:
Maybe we need wheel-builders-sig? Their mandate would be to hash out things like how to build binary-libraries-wrapped-up-in-wheels, share knowledge about the minutiae of linker behavior on different platforms (oh god there's so much minutiae), maintain tools like delocate and auditwheel (and whatever the equivalent will be for windows... and do we really need 3 different tools?), collect knowledge from where it's scattered now and put it into the guide at packaging.python.org [1], etc.? It seems a bit outside distutils-sig's focus in practice, since this would all be about third-party tools and individual package authors as opposed to distutils-sig's focus on writing interoperability PEPs and maintaining the core python.org-affiliated infrastructure like PyPI / setuptools / pip.
I’m not about to tell someone they *have* to use distutils-sig, but I think at this point it’s perfectly reasonable to discuss things of this nature here as well.
I agree with this - while I wouldn't call any aspect of software distribution easy, binary compatibility is one of the hardest, and we want *more* knowledge sharing in that regard, rather than continue to maintain the divide between "folks who care about binary compatibility" and "folks who wish everyone would just stop creating extension modules already because pure Python modules are so much easier to deal with" :)
Keeping the discussions centralised also creates more opportunities for serendipitous collaboration, where folks notice that they can potentially help out with what someone else is working on.
I'm definitely in favor of centralising this kind of knowledge and initiative upstream with the core Python community. The whole manylinux concept falls into this category of "hey, let's take this experience with scientific packages and port it upstream" (and btw if/when PEP 513 is accepted we should perhaps consider moving the related tools into the pypa/ namespace?).
I guess my concern, though, is that distutils-sig has historically been a rather contentious place. It's definitely not as bad these days as its old reputation would suggest (much thanks to everyone who's helped make that happen!). But I guess some amount of this is intrinsic to its nature: the core infrastructure like distutils, pip, PyPI, interoperability PEPs, is stuff that people with wildly varying backgrounds and goals all *have* to use, and so we're kinda all trapped here together. In a way, this kind of contention is a good thing, because it forces this core infrastructure to do a better job of serving all its many stakeholders. But OTOH I'm not sure it's very conducive to making quick collaborative progress on focused topics that don't need broad community consensus.
(I'm not sure where day-to-day discussion of routine changes to pip and warehouse and setuptools is happening, but I do notice that it doesn't seem to be here. And similarly there's a reason that the initial hashing out of the manylinux stuff happened off-list -- if only because you didn't need to see the blow-by-blow of our futzing about making docker work ;-).) -n -- Nathaniel J. Smith -- https://vorpus.org

On 21.01.2016 04:55, Nathaniel Smith wrote:
the choice of compiler is questionable. Just a pick into a release series. Not even the last released version on this release series. Is this a good choice? Maybe for x86_64 and i386, but not for anything else.
The permitted external shared libraries are: ::
libpanelw.so.5 libncursesw.so.5
this doesn't include libtinfo, dependency of libncursesw.
libgcc_s.so.1 libstdc++.so.6
if you insist on a version built from before GCC 5, you are in trouble with the libstdc++ ABI. So maybe link that one statically.
libgobject-2.0.so.0 libgthread-2.0.so.0 libglib-2.0.so.0
while glib2.0 is somehow frozen, what will you do if these change the soname? libgfortran is missing from this list while later down you mention gfortran. This will include libquadmath too. Any reason to not list libz?
Compilation and Tooling =======================
so how are people supposed to build these wheels? will you provide a development distribution, or do you "trust" people building such wheels?
Platform Detection for Installers =================================
Because the ``manylinux1`` profile is already known to work for the many thousands of users of popular commercial Python distributions, we suggest that installation tools like ``pip`` should error on the side of assuming that a system *is* compatible, unless there is specific reason to think otherwise.
We know of three main sources of potential incompatibility that are likely to arise in practice:
* A linux distribution that is too old (e.g. RHEL 4)
* A linux distribution that does not use glibc (e.g. Alpine Linux, which is based on musl libc, or Android)
* Eventually, in the future, there may exist distributions that break compatibility with this profile
add: "A Linux distribution that is built with clang", e.g. Mageia (libc++ instead of libstdc++).
Security Implications =====================
One of the advantages of dependencies on centralized libraries in Linux is that bugfixes and security updates can be deployed system-wide, and applications which depend on these libraries will automatically feel the effects of these patches when the underlying libraries are updated. This can be particularly important for security updates in packages that communicate across the network or implement cryptography.
``manylinux1`` wheels distributed through PyPI that bundle security-critical libraries like OpenSSL will thus assume responsibility for prompt updates in response to disclosed vulnerabilities and patches. This closely parallels the security implications of the distribution of binary wheels on Windows that, because the platform lacks a system package manager, generally bundle their dependencies. In particular, because it lacks a stable ABI, OpenSSL cannot be included in the ``manylinux1`` profile.
so you rely on the python build to provide this and access OpenSSL through the standard library? From my point of view this draft is too much influenced by Anaconda and their needs. It doesn't talk at all about interaction with other system libraries, or interaction with extensions provided by distributions. Matthias

On Thu, Jan 21, 2016 at 12:18 PM, Matthias Klose <doko@ubuntu.com> wrote:
On 21.01.2016 04:55, Nathaniel Smith wrote:
the choice of compiler is questionable. Just a pick into a release series. Not even the last released version on this release series. Is this a good choice? Maybe for x86_64 and i386, but not for anything else.
There's no mandatory compiler. The rule is basically: "if your wheel works when given access to CentOS 5's versions of the following packages: ..., then your wheel is manylinux1 compatible". Any method for achieving that is fair game :-). The RH devtoolset fork of gcc 4.8 happens to be the only C++11-supporting compiler that we know of that can produce binaries that accomplish that, but if you have another compiler that works better for you then go for it.
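(To make that rule a bit more concrete, here is an illustrative sketch -- not how auditwheel is actually implemented -- that shells out to GNU objdump to list the versioned glibc symbols a compiled extension requires and flags anything newer than the CentOS 5 baseline:)

import re
import subprocess

GLIBC_MAX = (2, 5)  # CentOS 5 ships glibc 2.5

def too_new_glibc_symbols(path):
    # Dump the dynamic symbol table and collect every GLIBC_x.y version tag,
    # then keep only the ones newer than the CentOS 5 baseline.
    out = subprocess.check_output(["objdump", "-T", path]).decode("utf-8", "replace")
    versions = {tuple(int(x) for x in m) for m in re.findall(r"GLIBC_(\d+)\.(\d+)", out)}
    return sorted(v for v in versions if v > GLIBC_MAX)

# An empty list means no versioned glibc symbol newer than 2.5 is required,
# e.g. too_new_glibc_symbols("foo.cpython-35m-x86_64-linux-gnu.so") -> []
# (the filename here is only a hypothetical example).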
The permitted external shared libraries are: ::
libpanelw.so.5 libncursesw.so.5
this doesn't include libtinfo, dependency of libncursesw.
libgcc_s.so.1 libstdc++.so.6
if you insist on a version built from before GCC 5, you are in trouble with the libstdc++ ABI. So maybe link that one statically.
The libstdc++ ABI is forward compatible across the GCC 4/GCC 5 transition; the exception is if you want to pass C++ stdlib objects directly between code compiled with GCC 4 and code compiled with GCC 5. This is extremely rare, since Python packages almost never provide a C++ ABI to other Python packages. (I don't know of any examples of this.) If it does come up then those package authors will have to figure out how they want to handle it, yes, possibly via picking a more modern libstdc++ and agreeing to all use it together.
libgobject-2.0.so.0 libgthread-2.0.so.0 libglib-2.0.so.0
while glib2.0 is somehow frozen, what will you do if these change the soname?
libgfortran is missing from this list while later down you mention gfortran. This will include libquadmath too.
libgfortran is missing intentionally, because it turns out from experience that lots of end-user systems don't have it installed out of the box. You'll notice that the numpy wheels I posted earlier include libgfortran inside them. (libquadmath is optional though -- you can build gfortran with --disable-libquadmath-support and get a toolchain that generates binaries that don't depend on libquadmath.)
Any reason to not list libz?
Compilation and Tooling =======================
so how are people supposed to build these wheels? will you provide a development distribution, or do you "trust" people building such wheels?
We currently provide a docker image (docker pull quay.io/manylinux/manylinux) and a tool that checks wheels for compliance and can in many cases automatically vendor any needed shared libraries (pip3 install auditwheel). In the end of course we can't force people to use these tools -- there's nothing that will actually stop someone from renaming a Windows wheel to say "manylinux1" in the platform tag and uploading it somewhere :-) -- but our experience with OS X wheels is that if you provide good docs and tools then people will generally get it right. Package authors generally don't set out to build broken packages that frustrate users ;-)
Platform Detection for Installers =================================
Because the ``manylinux1`` profile is already known to work for the many thousands of users of popular commercial Python distributions, we suggest that installation tools like ``pip`` should error on the side of assuming that a system *is* compatible, unless there is specific reason to think otherwise.
We know of three main sources of potential incompatibility that are likely to arise in practice:
* A linux distribution that is too old (e.g. RHEL 4)
* A linux distribution that does not use glibc (e.g. Alpine Linux, which is based on musl libc, or Android)
* Eventually, in the future, there may exist distributions that break compatibility with this profile
add: "A Linux distribution that is built with clang", e.g. Mageia (libc++ instead of libstdc++).
Interesting point. It looks like from some quick googling that even Mageia does provide a libstdc++6 package, though, presumably because of these kinds of compatibility issues?
Security Implications =====================
One of the advantages of dependencies on centralized libraries in Linux is that bugfixes and security updates can be deployed system-wide, and applications which depend on these libraries will automatically feel the effects of these patches when the underlying libraries are updated. This can be particularly important for security updates in packages that communicate across the network or implement cryptography.
``manylinux1`` wheels distributed through PyPI that bundle security-critical libraries like OpenSSL will thus assume responsibility for prompt updates in response to disclosed vulnerabilities and patches. This closely parallels the security implications of the distribution of binary wheels on Windows that, because the platform lacks a system package manager, generally bundle their dependencies. In particular, because it lacks a stable ABI, OpenSSL cannot be included in the ``manylinux1`` profile.
so you rely on the python build to provide this and access OpenSSL through the standard library?
That is one strategy that's available to packages who need access to OpenSSL, yes.
From my point of view this draft is too much influenced by Anaconda and their needs. It doesn't talk at all about interaction with other system libraries, or interaction with extensions provided by distributions.
It's true that it doesn't do those things, just like there's no PEP talking about how pip on Windows should interact with nuget or how pip on OS X should interact with homebrew. This is a first step, the perfect is the enemy of the good, etc. -n -- Nathaniel J. Smith -- https://vorpus.org

On Thu, Jan 21, 2016 at 8:18 PM, Matthias Klose <doko@ubuntu.com> wrote:
[...]
From my point of view this draft is too much influenced by Anaconda and their needs. It doesn't talk at all about interaction with other system libraries, or interaction with extensions provided by distributions.
FWIW, the list of libraries and the dockerfile was originally built from communications I had w/ Nathaniel and Matthew Brett, and I work for a competitor of Continuum's, so you can be fairly confident that there is no hidden agenda. David
Matthias

Nathaniel, Robert, I'm really excited to see how quickly you're making progress. A few comments below as I haven't had a chance to catch up on the day's discussion: On Wed, Jan 20, 2016 at 10:55 PM, Nathaniel Smith <njs@pobox.com> wrote:
Building on the compability lessons learned from these companies, we thus define a baseline ``manylinux1`` platform tag for use by binary Python wheels, and introduce the implementation of preliminary tools to aid in the construction of these ``manylinux1`` wheels.
Just a standards question: does this still require an update to PEP 425, or would the definition of the manylinux1 platform here supersede that section of 425?
* Eventually, in the future, there may exist distributions that break compatibility with this profile
To handle the third case, we propose the creation of a file ``/etc/python/compatibility.cfg`` in ConfigParser format, with sample contents: ::
[manylinux1] compatible = true
where the supported values for the ``manylinux1.compatible`` entry are the same as those supported by the ConfigParser ``getboolean`` method.
Could this instead use the more powerful json-based syntax proposed by Nick here: https://mail.python.org/pipermail/distutils-sig/2015-July/026617.html I have already implemented support for this in pip and wheel.
Security Implications =====================
One of the advantages of dependencies on centralized libraries in Linux is that bugfixes and security updates can be deployed system-wide, and applications which depend on these libraries will automatically feel the effects of these patches when the underlying libraries are updated. This can be particularly important for security updates in packages that communicate across the network or implement cryptography.
``manylinux1`` wheels distributed through PyPI that bundle security-critical libraries like OpenSSL will thus assume responsibility for prompt updates in response to disclosed vulnerabilities and patches. This closely parallels the security implications of the distribution of binary wheels on Windows that, because the platform lacks a system package manager, generally bundle their dependencies. In particular, because it lacks a stable ABI, OpenSSL cannot be included in the ``manylinux1`` profile.
I appreciate that this was addressed. I don't want to be responsible for keeping the versions of these things up to date. So instead, I made a docker-based build system that builds a ton of wheels on different distros/versions using a common definition. But I appreciate that it's a bit heavy and not everyone will prefer this. One piece that is not yet complete in the work I've done so far is actually ensuring that the external dependencies are installed, and providing some feedback on what's missing. But that can be done.
Rejected Alternatives =====================
One alternative would be to provide separate platform tags for each Linux distribution (and each version thereof), e.g. ``RHEL6``, ``ubuntu14_10``, ``debian_jessie``, etc. Nothing in this proposal rules out the possibility of adding such platform tags in the future, or of further extensions to wheel metadata that would allow wheels to declare dependencies on external system-installed packages. However, such extensions would require substantially more work than this proposal, and still might not be appreciated by package developers who would prefer not to have to maintain multiple build environments and build multiple wheels in order to cover all the common Linux distributions. Therefore we consider such proposals to be out-of-scope for this PEP.
;) For anyone who's interested, the next release of Galaxy (a popular bioinformatics framework for running tools, workflows, etc.), due next week, will ship with our modified pip that includes support for distro/version-specific platform tags in wheels. All but one of our dependent package's wheels are built with the generic `linux_x86_64` tag on Debian Squeeze and will work with most distros, though, so we're basically doing a "loose" version of manylinux1. Only our psycopg2 wheels are built per-distro/version. I'm happy to see a more rigid definition for what we're doing with the "generic" ones, this is certainly necessary should support for generic Linux wheels ever be allowed into PyPI. manylinux1 and this PEP seem to me like the right idea to do this.
We build these distro/version wheels using a modified wheel and the aforementioned docker-based build system, called Starforge. Here's where all of it lives:
Galaxy: https://github.com/galaxyproject/galaxy
Starforge: https://github.com/galaxyproject/starforge
"Galaxy pip": https://github.com/natefoo/pip/tree/linux-wheels
"Galaxy wheel": https://bitbucket.org/natefoo/wheel
I say all of this (again) because I think there's still a place for the work we've done even if manylinux1 is accepted. Both manylinux1 *and* more specific distro/version platform tags can happily coexist. So I want to publicize the fact that this work is already done and in use (although I know there are some things that would need to be done before it could be upstreamed).
Basically I'm trying to determine whether there's any interest in this from the pip and PyPI developers. If so, I'd be happy to write a PEP that complements the one written by Robert and Nathaniel so that both types of platform tags and wheels using them can be supported. --nate
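(Purely as an illustration of the idea -- this is not the actual Starforge / Galaxy-pip implementation, and the tag naming here is invented -- a distro/version-specific platform tag could be derived with something like platform.linux_distribution(), which was still available at the time:)

import platform
import re
from distutils.util import get_platform

def distro_platform_tag():
    # Normalise e.g. ('Ubuntu', '14.04', 'trusty') plus 'linux-x86_64' into
    # something like 'linux_x86_64_ubuntu_14_04' (naming purely illustrative).
    name, version, _codename = platform.linux_distribution()
    parts = [get_platform()] + [p for p in (name, version) if p]
    return re.sub(r"[^A-Za-z0-9_]+", "_", "_".join(parts)).lower().strip("_")

print(distro_platform_tag())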

On Thu, Jan 21, 2016 at 2:22 PM, Nate Coraor <nate@bx.psu.edu> wrote:
Nathaniel, Robert, I'm really excited to see how quickly you're making progress. A few comments below as I haven't had a chance to catch up on the day's discussion:
On Wed, Jan 20, 2016 at 10:55 PM, Nathaniel Smith <njs@pobox.com> wrote:
Building on the compability lessons learned from these companies, we thus define a baseline ``manylinux1`` platform tag for use by binary Python wheels, and introduce the implementation of preliminary tools to aid in the construction of these ``manylinux1`` wheels.
Just a standards question: does this still require an update to PEP 425, or would the definition of the manylinux1 platform here supersede that section of 425?
I guess this is a pure process question, so I'll defer to Nick...
* Eventually, in the future, there may exist distributions that break compatibility with this profile
To handle the third case, we propose the creation of a file ``/etc/python/compatibility.cfg`` in ConfigParser format, with sample contents: ::
[manylinux1] compatible = true
where the supported values for the ``manylinux1.compatible`` entry are the same as those supported by the ConfigParser ``getboolean`` method.
Could this instead use the more powerful json-based syntax proposed by Nick here:
https://mail.python.org/pipermail/distutils-sig/2015-July/026617.html
I have already implemented support for this in pip and wheel.
Totally happy to change the compatibility.cfg stuff -- the version in the PEP was written in about 5 minutes in hopes of sparking discussion :-). Some other questions:
1) is this going to work for multi-arch (binaries for multiple cpu architectures sharing a single /etc)? Multiple interpreters? I guess the advantage of Nick's design is that it's scoped by the value of distutils.util.get_platform(), so multi-arch installs could have different values -- a distro could declare that their x86-64 python builds are manylinux1 compatible but their i386 python builds aren't. Maybe it would be even better if the files were /etc/python/binary-compatibility/linux_x86_64.cfg etc., so that the different .cfg files could be shipped in each architecture's package without colliding. OTOH I don't know if any of this is very useful in practice.
2) in theory one could imagine overriding this on a per-venv, per-user, or per-system level; which of these are useful to support? (Per-system seems like the most obvious, since an important use case will be distros setting this flag on behalf of their users.)
There is one feature that I do think is important in the PEP 513 draft, and that Nick's suggestion from July doesn't have: in the PEP 513 design, the manylinux1 flag can be true, false, or unspecified, and this is independent of other compatibility settings; in Nick's proposal there's an exhaustive list of all compatible tags, and everything not on that list is assumed to be incompatible. Where this matters is when we want to release manylinux2. At this point we'll want pip to use some autodetection logic on old distros that were released before manylinux2, while respecting the compatibility flag on newer distros that do know about manylinux2. This requires a tri-state setting with "not specified" as a valid value. [...]
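(A rough sketch of what that tri-state lookup could look like on the installer side, using the file path and key names from the current draft; everything else here is illustrative rather than pip's actual code:)

try:
    import configparser                      # Python 3
except ImportError:
    import ConfigParser as configparser      # Python 2

def manylinux1_override(path="/etc/python/compatibility.cfg"):
    # True/False if the file takes an explicit position, None if unspecified.
    parser = configparser.ConfigParser()
    if not parser.read(path):                # missing or unreadable file
        return None
    try:
        return parser.getboolean("manylinux1", "compatible")
    except (configparser.NoSectionError, configparser.NoOptionError):
        return None

# An installer could then respect an explicit True/False and fall back to
# glibc-version autodetection only when the override is None.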
Rejected Alternatives =====================
One alternative would be to provide separate platform tags for each Linux distribution (and each version thereof), e.g. ``RHEL6``, ``ubuntu14_10``, ``debian_jessie``, etc. Nothing in this proposal rules out the possibility of adding such platform tags in the future, or of further extensions to wheel metadata that would allow wheels to declare dependencies on external system-installed packages. However, such extensions would require substantially more work than this proposal, and still might not be appreciated by package developers who would prefer not to have to maintain multiple build environments and build multiple wheels in order to cover all the common Linux distributions. Therefore we consider such proposals to be out-of-scope for this PEP.
;)
For anyone who's interested, the next release of Galaxy (a popular bioinformatics framework for running tools, workflows, etc.), due next week, will ship with our modified pip that includes support for distro/version-specific platform tags in wheels. All but one of our dependent package's wheels are built with the generic `linux_x86_64` tag on Debian Squeeze and will work with most distros, though, so we're basically doing a "loose" version of manylinux1. Only our psycopg2 wheels are built per-distro/version. I'm happy to see a more rigid definition for what we're doing with the "generic" ones, this is certainly necessary should support for generic Linux wheels ever be allowed into PyPI. manylinux1 and this PEP seem to me like the right idea to do this.
We build these distro/version wheels using a modified wheel and the aforementioned docker-based build system, called Starforge. Here's where all of it lives:
Galaxy: https://github.com/galaxyproject/galaxy Starforge: https://github.com/galaxyproject/starforge "Galaxy pip": https://github.com/natefoo/pip/tree/linux-wheels "Galaxy wheel": https://bitbucket.org/natefoo/wheel
I say all of this (again) because I think there's still a place for the work we've done even if manylinux1 is accepted. Both manylinux1 *and* more specific distro/version platform tags can happily coexist. So I want to publicize the fact that this work is already done and in use (although I know there are some things that would need to be done before it could be upstreamed).
Basically I'm trying to determine whether there's any interest in this from the pip and PyPI developers. If so, I'd be happy to write a PEP that complements the one written by Robert and Nathaniel so that both types of platform tags and wheels using them can be supported.
I don't have anything in particular to say about this part, except that it's really cool :-) -n -- Nathaniel J. Smith -- https://vorpus.org

On 22 January 2016 at 09:31, Nathaniel Smith <njs@pobox.com> wrote:
On Thu, Jan 21, 2016 at 2:22 PM, Nate Coraor <nate@bx.psu.edu> wrote:
Nathaniel, Robert, I'm really excited to see how quickly you're making progress. A few comments below as I haven't had a chance to catch up on the day's discussion:
On Wed, Jan 20, 2016 at 10:55 PM, Nathaniel Smith <njs@pobox.com> wrote:
Building on the compability lessons learned from these companies, we thus define a baseline ``manylinux1`` platform tag for use by binary Python wheels, and introduce the implementation of preliminary tools to aid in the construction of these ``manylinux1`` wheels.
Just a standards question: does this still require an update to PEP 425, or would the definition of the manylinux1 platform here supersede that section of 425?
I guess this is a pure process question, so I'll defer to Nick...
I've finally started working on changing the change proposal process, so I'm going to say "No" :) I'll start a separate thread about that, since there are some areas where I'd be interested in people's feedback before I get too far into it. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, Jan 21, 2016 at 6:31 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Thu, Jan 21, 2016 at 2:22 PM, Nate Coraor <nate@bx.psu.edu> wrote:
Could this instead use the more powerful json-based syntax proposed by Nick here:
https://mail.python.org/pipermail/distutils-sig/2015-July/026617.html
I have already implemented support for this in pip and wheel.
Totally happy to change the compatibility.cfg stuff -- the version in the PEP was written in about 5 minutes in hopes of sparking discussion :-).
Some other questions: 1) is this going to work for multi-arch (binaries for multiple cpu architectures sharing a single /etc)? Multiple interpreters? I guess the advantage of Nick's design is that it's scoped by the value of distutils.util.get_platform(), so multi-arch installs could have different values -- a distro could declare that their x86-64 python builds are manylinux1 compatible but their i386 python builds aren't. Maybe it would be even better if the files were /etc/python/binary-compatibility/linux_x86_64.cfg etc., so that the different .cfg files could be shipped in each architecture's package without colliding. OTOH I don't know if any of this is very useful in practice.
I don't think the proposed syntax would have any trouble with multiarch other than that it's contained in one file and so would need to live in a single package, or be dynamically generated based on which arches you installed. If that was a problem we could support a /etc/python/binary-compatibility.d type of thing.
2) in theory one could imagine overriding this on a per-venv, per-user, or per-system level; which of these are useful to support? (Per-system seems like the most obvious, since an important use case will be distros setting this flag on behalf of their users.)
Per-venv overriding was part of the original proposal and my implementation, per-user could be useful too.
There is one feature that I do think is important in the PEP 513 draft, and that Nick's suggestion from July doesn't have: in the PEP 513 design, the manylinux1 flag can be true, false, or unspecified, and this is independent of other compatibility settings; in Nick's proposal there's an exhaustive list of all compatible tags, and everything not on that list is assumed to be incompatible. Where this matters is when we want to release manylinux2. At this point we'll want pip to use some autodetection logic on old distros that were released before manylinux2, while respecting the compatibility flag on newer distros that do know about manylinux2. This requires a tri-state setting with "not specified" as a valid value.
One solution would be `compatible` and `incompatible` keys rather than `install`? --nate

Hello, I noticed that libpython is missing from the lists of dependent libraries. Also the “manylinux” Docker image has its Python versions compiled with libpython static. Does this mean that we must do static linking against libpython? If so, can’t this cause problems with mixing Python objects from different patch versions at runtime? I checked Anaconda, and they are using a shared Python library. Thanks, Brad ________________________________________ From: Nathaniel Smith [njs@pobox.com] Sent: Wednesday, January 20, 2016 10:55 PM To: distutils-sig Subject: [Distutils] draft PEP: manylinux1 [...]
Because of this ambiguity, there is no expectation that ``linux``-tagged built distributions compiled on one machine will work properly on another, and for this reason, PyPI has not permitted the uploading of wheels for Linux. It would be ideal if wheel packages could be compiled that would work on *any* linux system. But, because of the incredible diversity of Linux systems -- from PCs to Android to embedded systems with custom libcs -- this cannot be guaranteed in general. Instead, we define a standard subset of the kernel+core userspace ABI that, in practice, is compatible enough that packages conforming to this standard will work on *many* linux systems, including essentially all of the desktop and server distributions in common use. We know this because there are companies who have been distributing such widely-portable pre-compiled Python extension modules for Linux -- e.g. Enthought with Canopy [2]_ and Continuum Analytics with Anaconda [3]_. Building on the compability lessons learned from these companies, we thus define a baseline ``manylinux1`` platform tag for use by binary Python wheels, and introduce the implementation of preliminary tools to aid in the construction of these ``manylinux1`` wheels. Key Causes of Inter-Linux Binary Incompatibility ================================================ To properly define a standard that will guarantee that wheel packages meeting this specification will operate on *many* linux platforms, it is necessary to understand the root causes which often prevent portability of pre-compiled binaries on Linux. The two key causes are dependencies on shared libraries which are not present on users' systems, and dependencies on particular versions of certain core libraries like ``glibc``. External Shared Libraries ------------------------- Most desktop and server linux distributions come with a system package manager (examples include ``APT`` on Debian-based systems, ``yum`` on ``RPM``-based systems, and ``pacman`` on Arch linux) that manages, among other responsibilities, the installation of shared libraries installed to system directories such as ``/usr/lib``. Most non-trivial Python extensions will depend on one or more of these shared libraries, and thus function properly only on systems where the user has the proper libraries (and the proper versions thereof), either installed using their package manager, or installed manually by setting certain environment variables such as ``LD_LIBRARY_PATH`` to notify the runtime linker of the location of the depended-upon shared libraries. Versioning of Core Shared Libraries ----------------------------------- Even if author or maintainers of a Python extension module with to use no external shared libraries, the modules will generally have a dynamic runtime dependency on the GNU C library, ``glibc``. While it is possible, statically linking ``glibc`` is usually a bad idea because of bloat, and because certain important C functions like ``dlopen()`` cannot be called from code that statically links ``glibc``. A runtime shared library dependency on a system-provided ``glibc`` is unavoidable in practice. The maintainers of the GNU C library follow a strict symbol versioning scheme for backward compatibility. This ensures that binaries compiled against an older version of ``glibc`` can run on systems that have a newer ``glibc``. The opposite is generally not true -- binaries compiled on newer Linux distributions tend to rely upon versioned functions in glibc that are not available on older systems. 
Compilation and Tooling
=======================

To support the compilation of wheels meeting the ``manylinux1`` standard, we provide initial drafts of two tools.

The first is a Docker image based on CentOS 5.11, which is recommended as an easy-to-use, self-contained build box for compiling ``manylinux1`` wheels [4]_. Compiling on a more recently released Linux distribution will generally introduce dependencies on too-new versioned symbols. The image comes with a full compiler suite installed (``gcc``, ``g++``, and ``gfortran`` 4.8.2) as well as the latest releases of Python and pip.

The second tool is a command line executable called ``auditwheel`` [5]_. First, it inspects all of the ELF files inside a wheel to check for dependencies on versioned symbols or external shared libraries, and verifies conformance with the ``manylinux1`` policy. This includes the ability to add the new platform tag to conforming wheels. In addition, ``auditwheel`` can automatically modify wheels that depend on external shared libraries by copying those shared libraries from the system into the wheel itself, and modifying the appropriate RPATH entries such that these libraries will be picked up at runtime. This accomplishes a similar result as if the libraries had been statically linked, without requiring changes to the build system.

Neither of these tools is necessary to build wheels which conform to the ``manylinux1`` policy. Similar results can usually be achieved by statically linking external dependencies and/or using certain inline assembly constructs to instruct the linker to prefer older symbol versions; however, these tricks can be quite esoteric.
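[Editor's note, not part of the PEP and not how ``auditwheel`` is implemented: since a wheel is just a zip archive, a simplified version of the inspection step can unpack the wheel and apply a per-file ELF check, such as the sketch shown earlier, to every shared object it contains. The wheel filename is only a placeholder.] ::

    import os
    import tempfile
    import zipfile

    def audit_wheel(wheel_path, check_elf):
        """Apply check_elf(path) to every shared object found inside the wheel."""
        results = {}
        with tempfile.TemporaryDirectory() as tmpdir:
            # A wheel is a zip archive; extract it so the ELF tools can read the files.
            with zipfile.ZipFile(wheel_path) as wheel:
                wheel.extractall(tmpdir)
            for dirpath, _, filenames in os.walk(tmpdir):
                for name in filenames:
                    if name.endswith(".so") or ".so." in name:
                        path = os.path.join(dirpath, name)
                        results[os.path.relpath(path, tmpdir)] = check_elf(path)
        return results

    # Placeholder filename; check_elf could be the per-file sketch shown earlier.
    # report = audit_wheel("numpy-1.10.4-cp35-cp35m-linux_x86_64.whl", check_elf)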
Platform Detection for Installers
=================================

Because the ``manylinux1`` profile is already known to work for the many thousands of users of popular commercial Python distributions, we suggest that installation tools like ``pip`` should err on the side of assuming that a system *is* compatible, unless there is a specific reason to think otherwise.

We know of three main sources of potential incompatibility that are likely to arise in practice:

* A Linux distribution that is too old (e.g. RHEL 4)
* A Linux distribution that does not use glibc (e.g. Alpine Linux, which is based on musl libc, or Android)
* Eventually, in the future, there may exist distributions that break compatibility with this profile

To handle the first two cases, we propose the following simple and reliable check::

    def have_glibc_version(major, minimum_minor):
        import ctypes

        process_namespace = ctypes.CDLL(None)
        try:
            gnu_get_libc_version = process_namespace.gnu_get_libc_version
        except AttributeError:
            # We are not linked to glibc.
            return False

        gnu_get_libc_version.restype = ctypes.c_char_p
        version_str = gnu_get_libc_version()
        # py2 / py3 compatibility:
        if not isinstance(version_str, str):
            version_str = version_str.decode("ascii")

        version = [int(piece) for piece in version_str.split(".")]
        assert len(version) == 2
        if major != version[0]:
            return False
        if minimum_minor > version[1]:
            return False
        return True

    # CentOS 5 uses glibc 2.5.
    is_manylinux1_compatible = have_glibc_version(2, 5)

To handle the third case, we propose the creation of a file ``/etc/python/compatibility.cfg`` in ConfigParser format, with sample contents::

    [manylinux1]
    compatible = true

where the supported values for the ``manylinux1.compatible`` entry are the same as those supported by the ConfigParser ``getboolean`` method.

The proposed logic for ``pip`` or related tools, then, is:

0) If ``distutils.util.get_platform()`` does not start with the string ``"linux"``, then assume the current system is not ``manylinux1`` compatible.
1) If ``/etc/python/compatibility.cfg`` exists and contains a ``manylinux1`` key, then trust that.
2) Otherwise, if ``have_glibc_version(2, 5)`` returns true, then assume the current system can handle ``manylinux1`` wheels.
3) Otherwise, assume that the current system cannot handle ``manylinux1`` wheels.
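[Editor's note, not part of the PEP text: as a minimal sketch of how an installer might wire the steps above together on Python 3, reusing the ``have_glibc_version()`` function defined in the snippet above. The config path is the one proposed in the PEP.] ::

    import configparser
    import os
    from distutils.util import get_platform

    CONFIG_PATH = "/etc/python/compatibility.cfg"

    def is_manylinux1_compatible():
        # 0) Non-Linux platforms are never manylinux1 compatible.
        if not get_platform().startswith("linux"):
            return False

        # 1) An explicit setting in the config file, if present, is trusted.
        if os.path.exists(CONFIG_PATH):
            config = configparser.ConfigParser()
            config.read(CONFIG_PATH)
            try:
                return config.getboolean("manylinux1", "compatible")
            except (configparser.NoSectionError, configparser.NoOptionError, ValueError):
                pass  # no usable entry; fall through to the glibc heuristic

        # 2) / 3) Otherwise, fall back to the glibc check (CentOS 5 ships glibc 2.5).
        # Reuses have_glibc_version() from the PEP's snippet above.
        return have_glibc_version(2, 5)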
Security Implications
=====================

One of the advantages of dependencies on centralized libraries in Linux is that bugfixes and security updates can be deployed system-wide, and applications which depend on these libraries will automatically feel the effects of these patches when the underlying libraries are updated. This can be particularly important for security updates in packages that communicate across the network or implement cryptography.

Maintainers of ``manylinux1`` wheels distributed through PyPI that bundle security-critical libraries like OpenSSL will thus assume responsibility for prompt updates in response to disclosed vulnerabilities and patches. This closely parallels the security implications of the distribution of binary wheels on Windows, which, because the platform lacks a system package manager, generally bundle their dependencies. In particular, because it lacks a stable ABI, OpenSSL cannot be included in the ``manylinux1`` profile.

Rejected Alternatives
=====================

One alternative would be to provide separate platform tags for each Linux distribution (and each version thereof), e.g. ``RHEL6``, ``ubuntu14_10``, ``debian_jessie``, etc. Nothing in this proposal rules out the possibility of adding such platform tags in the future, or of further extensions to wheel metadata that would allow wheels to declare dependencies on external system-installed packages. However, such extensions would require substantially more work than this proposal, and still might not be appreciated by package developers who would prefer not to have to maintain multiple build environments and build multiple wheels in order to cover all the common Linux distributions. Therefore we consider such proposals to be out of scope for this PEP.

References
==========

.. [1] PEP 425 -- Compatibility Tags for Built Distributions
       (https://www.python.org/dev/peps/pep-0425/)
.. [2] Enthought Canopy Python Distribution
       (https://store.enthought.com/downloads/)
.. [3] Continuum Analytics Anaconda Python Distribution
       (https://www.continuum.io/downloads)
.. [4] manylinux1 docker image
       (https://quay.io/repository/manylinux/manylinux)
.. [5] auditwheel
       (https://pypi.python.org/pypi/auditwheel)

Copyright
=========

This document has been placed into the public domain.

--
Nathaniel J. Smith -- https://vorpus.org

On Jan 22, 2016 9:04 AM, "Lowekamp, Bradley (NIH/NLM/LHC) [C]" <blowekamp@mail.nih.gov> wrote:
Hello,
I noticed that libpython is missing from the lists of dependent
libraries. Also, the “manylinux” Docker image has its Python versions compiled with libpython linked statically.
Does this mean that we must do static linking against libpython?
This is a bug/imprecision in the PEP. Manylinux1 wheels *can* link against libpython (the appropriate version for whatever Python they're targeting), and the latest version of the Docker image now uses a shared libpython.

-Robert
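[Editor's note, not from the original messages: one quick, illustrative way to see how a given interpreter was built is to ask ``sysconfig`` whether CPython was configured with a shared libpython.] ::

    import sysconfig

    # 1 if CPython was configured with --enable-shared (shared libpython), else 0.
    print(sysconfig.get_config_var("Py_ENABLE_SHARED"))

    # The library the interpreter links, e.g. "libpython3.5m.so.1.0" (shared)
    # or "libpython3.5m.a" (static).
    print(sysconfig.get_config_var("LDLIBRARY"))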
participants (21)

- Alexander Walters
- Antoine Pitrou
- Chris Barker
- Chris Barker - NOAA Federal
- David Cournapeau
- Donald Stufft
- Glyph Lefkowitz
- Leonardo Rochael Almeida
- Lowekamp, Bradley (NIH/NLM/LHC) [C]
- M.-A. Lemburg
- Matthew Brett
- Matthias Klose
- Nate Coraor
- Nathaniel Smith
- Nathaniel Smith
- Nick Coghlan
- Olivier Grisel
- Paul Moore
- Randy Syring
- Robert McGibbon
- Robert T. McGibbon