[Python-checkins] peps: PEP 513: Portable Built Linux Distributions

nick.coghlan python-checkins at python.org
Thu Jan 21 01:23:11 EST 2016


https://hg.python.org/peps/rev/8fd2464f7b5c
changeset:   6203:8fd2464f7b5c
user:        Nick Coghlan <ncoghlan at gmail.com>
date:        Thu Jan 21 16:23:03 2016 +1000
summary:
  PEP 513: Portable Built Linux Distributions

files:
  pep-0513.txt |  346 +++++++++++++++++++++++++++++++++++++++
  1 files changed, 346 insertions(+), 0 deletions(-)


diff --git a/pep-0513.txt b/pep-0513.txt
new file mode 100644
--- /dev/null
+++ b/pep-0513.txt
@@ -0,0 +1,346 @@
+PEP: 513
+Title: A Platform Tag for Portable Linux Built Distributions
+Version: $Revision$
+Last-Modified: $Date$
+Author: Robert T. McGibbon <rmcgibbo at gmail.com>, Nathaniel J. Smith <njs at pobox.com>
+BDFL-Delegate: Nick Coghlan <ncoghlan at gmail.com>
+Status: Draft
+Type: Informational
+Content-Type: text/x-rst
+Created: 19-Jan-2016
+Post-History: 19-Jan-2016
+
+
+Abstract
+========
+
+This PEP proposes the creation of a new platform tag for Python package built
+distributions, such as wheels, called ``manylinux1_{x86_64,i386}`` with
+external dependencies restricted to a standardized, limited subset of
+the Linux kernel and core userspace ABI. It proposes that PyPI support
+uploading and distributing wheels with this platform tag, and that ``pip``
+support downloading and installing these packages on compatible platforms.
+
+
+Rationale
+=========
+
+Currently, distribution of binary Python extensions for Windows and OS X is
+straightforward. Developers and packagers build wheels, which are assigned
+platform tags such as ``win32`` or ``macosx_10_6_intel``, and upload these
+wheels to PyPI. Users can download and install these wheels using tools such
+as ``pip``.
+
+For Linux, the situation is much more delicate. In general, compiled Python
+extension modules built on one Linux distribution will not work on other Linux
+distributions, or even on the same Linux distribution with different system
+libraries installed.
+
+Build tools using PEP 425 platform tags [1]_ do not track information about the
+particular Linux distribution or installed system libraries, and instead assign
+all wheels the too-vague ``linux_i386`` or ``linux_x86_64`` tags. Because of
+this ambiguity, there is no expectation that ``linux``-tagged built
+distributions compiled on one machine will work properly on another, and for
+this reason, PyPI has not permitted the uploading of wheels for Linux.
+
+It would be ideal if wheel packages could be compiled that would work on *any*
+Linux system. But, because of the incredible diversity of Linux systems -- from
+PCs to Android to embedded systems with custom libcs -- this cannot
+be guaranteed in general.
+
+Instead, we define a standard subset of the kernel+core userspace ABI that,
+in practice, is compatible enough that packages conforming to this standard
+will work on *many* Linux systems, including essentially all of the desktop
+and server distributions in common use. We know this because there are
+companies who have been distributing such widely-portable pre-compiled Python
+extension modules for Linux -- e.g. Enthought with Canopy [2]_ and Continuum
+Analytics with Anaconda [3]_.
+
+Building on the compatibility lessons learned from these companies, we thus
+define a baseline ``manylinux1`` platform tag for use by binary Python
+wheels, and introduce the implementation of preliminary tools to aid in the
+construction of these ``manylinux1`` wheels.
+
+
+Key Causes of Inter-Linux Binary Incompatibility
+================================================
+
+To properly define a standard that will guarantee that wheel packages meeting
+this specification will operate on *many* Linux platforms, it is necessary to
+understand the root causes which often prevent portability of pre-compiled
+binaries on Linux. The two key causes are dependencies on shared libraries
+which are not present on users' systems, and dependencies on particular
+versions of certain core libraries like ``glibc``.
+
+
+External Shared Libraries
+-------------------------
+
+Most desktop and server Linux distributions come with a system package manager
+(examples include ``APT`` on Debian-based systems, ``yum`` on
+``RPM``-based systems, and ``pacman`` on Arch Linux) that manages, among other
+responsibilities, the installation of shared libraries into system
+directories such as ``/usr/lib``. Most non-trivial Python extensions will depend
+on one or more of these shared libraries, and thus function properly only on
+systems where the user has the proper libraries (and the proper
+versions thereof), either installed using their package manager, or installed
+manually by setting certain environment variables such as ``LD_LIBRARY_PATH``
+to notify the runtime linker of the location of the depended-upon shared
+libraries.
+
+
+Versioning of Core Shared Libraries
+-----------------------------------
+
+Even if the author or maintainers of a Python extension module wish to use no
+external shared libraries, the modules will generally have a dynamic runtime
+dependency on the GNU C library, ``glibc``. While it is possible, statically
+linking ``glibc`` is usually a bad idea because of bloat, and because certain
+important C functions like ``dlopen()`` cannot be called from code that
+statically links ``glibc``. A runtime shared library dependency on a
+system-provided ``glibc`` is unavoidable in practice.
+
+The maintainers of the GNU C library follow a strict symbol versioning scheme
+for backward compatibility. This ensures that binaries compiled against an older
+version of ``glibc`` can run on systems that have a newer ``glibc``. The
+opposite is generally not true -- binaries compiled on newer Linux
+distributions tend to rely upon versioned functions in glibc that are not
+available on older systems.
+
+This generally prevents built distributions compiled on the latest Linux
+distributions from being portable.
+
+
+The ``manylinux1`` policy
+=========================
+
+For these reasons, to achieve broad portability, Python wheels
+
+ * should depend only on an extremely limited set of external shared
+   libraries; and
+ * should depend only on "old" symbol versions in those external shared
+   libraries.
+
+The ``manylinux1`` policy thus encompasses a standard for which
+external shared libraries a wheel may depend on, and the maximum
+depended-upon symbol versions therein.
+
+The permitted external shared libraries are: ::
+
+    libpanelw.so.5
+    libncursesw.so.5
+    libgcc_s.so.1
+    libstdc++.so.6
+    libm.so.6
+    libdl.so.2
+    librt.so.1
+    libcrypt.so.1
+    libc.so.6
+    libnsl.so.1
+    libutil.so.1
+    libpthread.so.0
+    libX11.so.6
+    libXext.so.6
+    libXrender.so.1
+    libICE.so.6
+    libSM.so.6
+    libGL.so.1
+    libgobject-2.0.so.0
+    libgthread-2.0.so.0
+    libglib-2.0.so.0
+
+On Debian-based systems, these libraries are provided by the packages ::
+
+    libncurses5 libgcc1 libstdc++6 libc6 libx11-6 libxext6
+    libxrender1 libice6 libsm6 libgl1-mesa-glx libglib2.0-0
+
+On RPM-based systems, these libraries are provided by the packages ::
+
+    ncurses libgcc libstdc++ glibc libXext libXrender
+    libICE libSM mesa-libGL glib2
+
+This list was compiled by checking the external shared library dependencies of
+the Canopy [2]_ and Anaconda [3]_ distributions, which both include a wide array
+of the most popular Python modules and have been confirmed in practice to work
+across a wide swath of Linux systems in the wild.
+
+For dependencies on externally-provided versioned symbols in the above shared
+libraries, the following symbol versions are permitted: ::
+
+    GLIBC <= 2.5
+    CXXABI <= 3.4.8
+    GLIBCXX <= 3.4.9
+    GCC <= 4.2.0
+
+These symbol versions were determined by inspecting the latest symbol version
+provided in the libraries distributed with CentOS 5, a Linux distribution
+released in April 2007. In practice, this means that Python wheels which conform
+to this policy should function on almost any Linux distribution released after
+this date.
+
+
+Compilation and Tooling
+=======================
+
+To support the compilation of wheels meeting the ``manylinux1`` standard, we
+provide initial drafts of two tools.
+
+The first is a Docker image based on CentOS 5.11, which is recommended as an
+easy-to-use, self-contained build box for compiling ``manylinux1`` wheels [4]_.
+Compiling on a more recently-released Linux distribution will generally
+introduce dependencies on too-new versioned symbols. The image comes with a
+full compiler suite installed (``gcc``, ``g++``, and ``gfortran`` 4.8.2) as
+well as the latest releases of Python and pip.
+
+The second tool is a command line executable called ``auditwheel`` [5]_. First,
+it inspects all of the ELF files inside a wheel to check for dependencies on
+versioned symbols or external shared libraries, and verifies conformance with
+the ``manylinux1`` policy. This includes the ability to add the new platform
+tag to conforming wheels.
+
+In addition, ``auditwheel`` has the ability to automatically modify wheels that
+depend on external shared libraries by copying those shared libraries from
+the system into the wheel itself, and modifying the appropriate RPATH entries
+such that these libraries will be picked up at runtime. This accomplishes a
+similar result as if the libraries had been statically linked without requiring
+changes to the build system.
+
+Neither of these tools is necessary to build wheels which conform with the
+``manylinux1`` policy. Similar results can usually be achieved by statically
+linking external dependencies and/or using certain inline assembly constructs
+to instruct the linker to prefer older symbol versions; however, these tricks
+can be quite esoteric.
+
+
+Platform Detection for Installers
+=================================
+
+Because the ``manylinux1`` profile is already known to work for the many
+thousands of users of popular commercial Python distributions, we suggest that
+installation tools like ``pip`` should err on the side of assuming that a
+system *is* compatible, unless there is specific reason to think otherwise.
+
+We know of three main sources of potential incompatibility that are likely to
+arise in practice:
+
+* A Linux distribution that is too old (e.g. RHEL 4)
+* A Linux distribution that does not use glibc (e.g. Alpine Linux, which is
+  based on musl libc, or Android)
+* Eventually, in the future, there may exist distributions that break
+  compatibility with this profile
+
+To handle the first two cases, we propose the following simple and reliable
+check: ::
+
+    def have_glibc_version(major, minimum_minor):
+        import ctypes
+
+        process_namespace = ctypes.CDLL(None)
+        try:
+            gnu_get_libc_version = process_namespace.gnu_get_libc_version
+        except AttributeError:
+            # We are not linked to glibc.
+            return False
+
+        gnu_get_libc_version.restype = ctypes.c_char_p
+        version_str = gnu_get_libc_version()
+        # py2 / py3 compatibility:
+        if not isinstance(version_str, str):
+            version_str = version_str.decode("ascii")
+
+        version = [int(piece) for piece in version_str.split(".")]
+        assert len(version) == 2
+        if major != version[0]:
+            return False
+        if minimum_minor > version[1]:
+            return False
+        return True
+
+    # CentOS 5 uses glibc 2.5.
+    is_manylinux1_compatible = have_glibc_version(2, 5)
+
+To handle the third case, we propose the creation of a file
+``/etc/python/compatibility.cfg`` in ConfigParser format, with sample
+contents: ::
+
+   [manylinux1]
+   compatible = true
+
+where the supported values for the ``manylinux1.compatible`` entry are the
+same as those supported by the ConfigParser ``getboolean`` method.
+
+The proposed logic for ``pip`` or related tools, then, is:
+
+0) If ``distutils.util.get_platform()`` does not start with the string
+   ``"linux"``, then assume the current system is not ``manylinux1``
+   compatible.
+1) If ``/etc/python/compatibility.cfg`` exists and contains a
+   ``manylinux1.compatible`` entry, then trust that.
+2) Otherwise, if ``have_glibc_version(2, 5)`` returns true, then assume the
+   current system can handle ``manylinux1`` wheels.
+3) Otherwise, assume that the current system cannot handle ``manylinux1``
+   wheels.
+
+
+Security Implications
+=====================
+
+One of the advantages of dependencies on centralized libraries in Linux is
+that bugfixes and security updates can be deployed system-wide, and
+applications which depend on these libraries will automatically feel the
+effects of these patches when the underlying libraries are updated. This can
+be particularly important for security updates in packages that handle
+communication across the network or cryptography.
+
+``manylinux1`` wheels distributed through PyPI that bundle security-critical
+libraries like OpenSSL will thus assume responsibility for prompt updates in
+response to disclosed vulnerabilities and patches. This closely parallels the
+security implications of the distribution of binary wheels on Windows that,
+because the platform lacks a system package manager, generally bundle their
+dependencies. In particular, because it lacks a stable ABI, OpenSSL cannot be
+included in the ``manylinux1`` profile.
+
+
+Rejected Alternatives
+=====================
+
+One alternative would be to provide separate platform tags for each Linux
+distribution (and each version thereof), e.g. ``RHEL6``, ``ubuntu14_10``,
+``debian_jessie``, etc. Nothing in this proposal rules out the possibility of
+adding such platform tags in the future, or of further extensions to wheel
+metadata that would allow wheels to declare dependencies on external
+system-installed packages. However, such extensions would require substantially
+more work than this proposal, and still might not be appreciated by package
+developers who would prefer not to have to maintain multiple build environments
+and build multiple wheels in order to cover all the common Linux distributions.
+Therefore we consider such proposals to be out-of-scope for this PEP.
+
+
+References
+==========
+
+.. [1] PEP 425 -- Compatibility Tags for Built Distributions
+   (https://www.python.org/dev/peps/pep-0425/)
+.. [2] Enthought Canopy Python Distribution
+   (https://store.enthought.com/downloads/)
+.. [3] Continuum Analytics Anaconda Python Distribution
+   (https://www.continuum.io/downloads)
+.. [4] manylinux1 docker image
+   (https://quay.io/repository/manylinux/manylinux)
+.. [5] auditwheel
+   (https://pypi.python.org/pypi/auditwheel)
+
+Copyright
+=========
+
+This document has been placed into the public domain.
+
+..
+
+   Local Variables:
+   mode: indented-text
+   indent-tabs-mode: nil
+   sentence-end-double-space: t
+   fill-column: 70
+   coding: utf-8
+   End:
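
The installer decision procedure listed under "Platform Detection for
Installers" (steps 0 to 3) can be sketched in Python roughly as follows. This
is a minimal illustration only, not code from the committed file or from
``pip``. The helper names ``read_override`` and ``is_manylinux1_compatible``
are hypothetical, the config path is the one given in the PEP's sample, and the
platform string and glibc check result are passed in as plain arguments so the
logic can be exercised without touching the host system.

```python
import configparser

# Path taken from the PEP's sample; treat it as an assumption.
CONFIG_PATH = "/etc/python/compatibility.cfg"


def read_override(path=CONFIG_PATH):
    """Return True/False if the file sets manylinux1.compatible, else None."""
    parser = configparser.ConfigParser()
    try:
        if not parser.read(path):  # read() returns the files it parsed
            return None
        return parser.getboolean("manylinux1", "compatible")
    except (configparser.Error, ValueError):
        # Missing section/option, malformed file, or a non-boolean value:
        # treat all of these as "no override present".
        return None


def is_manylinux1_compatible(platform, override, glibc_ok):
    """Apply steps 0-3 of the proposed logic.

    platform -- a platform string such as "linux-x86_64"
    override -- result of read_override(): True, False, or None
    glibc_ok -- result of the PEP's have_glibc_version(2, 5) check
    """
    if not platform.startswith("linux"):   # step 0
        return False
    if override is not None:               # step 1: trust the config file
        return override
    return bool(glibc_ok)                  # steps 2 and 3


# Possible wiring, assuming have_glibc_version() from the PEP is in scope:
#   import sysconfig
#   is_manylinux1_compatible(sysconfig.get_platform(), read_override(),
#                            have_glibc_version(2, 5))
```

Keeping the three inputs separate from the decision function means a
distributor's override in the config file wins over the glibc probe in either
direction, which is the ordering the proposed steps describe.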

-- 
Repository URL: https://hg.python.org/peps

