Hi again --
[cc'd to Paul Dubois: you said you weren't following the distutils sig
anymore, but this directly concerns NumPy and I'd like to get your
input!]
here's that sample setup.py for NumPy. See below for discussion (and
questions!).
------------------------------------------------------------------------
#!/usr/bin/env python
# Setup script example for building the Numeric extension to Python.
# This does successfully compile all the .dlls. Nothing happens
# with the .py files currently.
# Move this file to the Numerical directory of the LLNL numpy
# distribution and run as:
# python numpysetup.py --verbose build_ext
#
# created 1999/08 Perry Stoll
__rcsid__ = "$Id: numpysetup.py,v 1.1 1999/09/12 20:42:48 gward Exp $"
from distutils.core import setup
setup (name = "numerical",
       version = "0.01",
       description = "Numerical Extension to Python",
       url = "http://www.python.org/sigs/matrix-sig/",
       ext_modules = [ ( '_numpy', { 'sources' : [ 'Src/_numpymodule.c',
                                                   'Src/arrayobject.c',
                                                   'Src/ufuncobject.c' ],
                                     'include_dirs' : ['./Include'],
                                     'def_file' : 'Src/numpy.def' }
                       ),
                       ( 'multiarray', { 'sources' : ['Src/multiarraymodule.c'],
                                         'include_dirs' : ['./Include'],
                                         'def_file' : 'Src/multiarray.def' }
                       ),
                       ( 'umath', { 'sources' : ['Src/umathmodule.c'],
                                    'include_dirs' : ['./Include'],
                                    'def_file' : 'Src/umath.def' }
                       ),
                       ( 'fftpack', { 'sources' : ['Src/fftpackmodule.c',
                                                   'Src/fftpack.c'],
                                      'include_dirs' : ['./Include'],
                                      'def_file' : 'Src/fftpack.def' }
                       ),
                       ( 'lapack_lite', { 'sources' : [ 'Src/lapack_litemodule.c',
                                                        'Src/dlapack_lite.c',
                                                        'Src/zlapack_lite.c',
                                                        'Src/blas_lite.c',
                                                        'Src/f2c_lite.c' ],
                                          'include_dirs' : ['./Include'],
                                          'def_file' : 'Src/lapack_lite.def' }
                       ),
                       ( 'ranlib', { 'sources' : ['Src/ranlibmodule.c',
                                                  'Src/ranlib.c',
                                                  'Src/com.c',
                                                  'Src/linpack.c'],
                                     'include_dirs' : ['./Include'],
                                     'def_file' : 'Src/ranlib.def' }
                       ),
                     ]
      )
------------------------------------------------------------------------
First, what d'you think? Too clunky and verbose? Too much information
for each extension? I kind of think so, but I'm not sure how to reduce
it elegantly. Right now, the internal data structures needed to compile
a module are pretty obviously exposed: is this a good thing? Or should
there be some more compact form for setup.py that will be expanded later
into the full glory we see above?
I've already made one small step towards reducing the amount of cruft by
factoring 'include_dirs' out and supplying it directly as a parameter to
'setup()'. (But that needs code not in the CVS archive yet, so I've
left the sample setup.py the same for now.)
The next thing I'd like to do is get that damn "def_file" out of there.
To support it in MSVCCompiler, there's already an ugly hack that
unnecessarily affects both the UnixCCompiler and CCompiler classes, and
I want to get rid of that. (I refer to passing the 'build_info'
dictionary into the compiler classes, if you're familiar with the code
-- that dictionary is part of the Distutils extension-building system,
and should not propagate into the more general compiler classes.)
But I don't want to give these weird "def file" things standing on the
order of source files, object files, libraries, etc., because they seem
to me to be a bizarre artifact of one particular compiler, rather than
something present in a wide range of C/C++ compilers.
Based on the NumPy model, it seems like there's a not-too-kludgy way to
handle this problem. Namely:
if building extension "foo":
    if file "foo.def" found in same directory as "foo.c"
        add "/def:foo.def" to MSVC command line
this will of course require some platform-specific code in the build_ext
command class, but I figured that was coming eventually, so why put it
off? ;-)
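In code, the heuristic might look roughly like this (just a sketch -- the
function name and the way build_ext would call it are made up, not what's
in CVS):

    import os

    def msvc_def_file_option(sources):
        """If a .def file sits next to the first source file, return the
        corresponding /def: option for the MSVC command line, else None."""
        base, ext = os.path.splitext(sources[0])
        def_file = base + ".def"
        if os.path.isfile(def_file):
            return "/def:" + def_file
        return None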
To make this hack work with NumPy, one change would be necessary: rename
Src/numpy.def to Src/_numpy.def to match Src/_numpy.c, which implements
the _numpy module. Would this be too much to ask of NumPy? (Paul?)
What about other module distributions that support MSVC++ and thus ship
with "def" files? Could they be made to accomodate this scheme?
Thanks for your feedback --
Greg
--
Greg Ward - software developer gward(a)cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive voice: +1-703-620-8990
Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913
Hi all --
at long last, I found the time to hack in the ability to compile
extension modules to the Distutils. Mainly, this meant adding a
'build_ext' command which uses a CCompiler instance for all its dirty
work. I also had to add a few methods to CCompiler (and, of course,
UnixCCompiler) to make this work.
And I added a new module, 'spawn', which takes care of running
sub-programs more efficiently and robustly (no shell involved) than
os.system. That's needed, obviously, so we can run the compiler!
If you're in the mood for grubbing over raw source code, then get the
latest from CVS or download a current snapshot. See
http://www.python.org/sigs/distutils-sig/implementation.html
for a link to the code snapshot.
I'm still waiting for more subclasses of CCompiler to appear. At the
very least, we're going to need MSVCCompiler to build extensions on
Windows. Any takers? Also, someone who knows the Mac, and how to run
compilers programmatically there, will have to figure out how to write a
Mac-specific concrete CCompiler subclass.
The spawn module also needs a bit of work to be portable. I suspect
that _win32_spawn() (the intended analog to my _posix_spawn()) will be
easy to implement, if it even needs to go in a separate function at all.
From the Python Library documentation for 1.5.2, it looks like the
os.spawnv() function is all we need, but it's a bit hard to figure out
just what's needed. Windows wizards, please take a look at the
'spawn()' function and see if you can make it work on Windows.
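For the curious, here's roughly the shape I have in mind for _win32_spawn(),
based only on the os.spawnv() documentation (completely untested -- consider
it a sketch, not working code):

    import os

    def _win32_spawn(cmd, args):
        # 'cmd' is the program to run; 'args' is the full argument list,
        # with args[0] conventionally repeating the program name.
        rc = os.spawnv(os.P_WAIT, cmd, args)
        if rc != 0:
            raise RuntimeError("command '%s' failed with exit status %d"
                               % (cmd, rc))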
As for actually compiling extensions: well, if you can figure out the
build_ext command, go ahead and give it a whirl. It's a bit cryptic
right now, since there's no documentation and no example setup.py. (I
have a working example at home, but it's not available online.) If you
feel up to it, though, see if you can read the code and figure out
what's going on. I'm just hoping *I'll* be able to figure out what's
going on when I get back from the O'Reilly conference next week... ;-)
Enjoy --
Greg
--
Greg Ward - software developer gward(a)cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive voice: +1-703-620-8990
Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913
Hi all --
at long last, I have fixed two problems that a couple people noticed a
while ago:
* I folded in Amos Latteier's NT patches almost verbatim -- just
changed an `os.path.sep == "/"' to `os.name == "posix"' and added
some comments bitching about the inadequacy of the current library
installation model (I think this is Python's fault, but for now
Distutils is slavishly aping the situation in Python 1.5.x)
* I fixed the problem whereby running "setup.py install" without
doing anything else caused a crash (because 'build' hadn't yet
been run). Now, the 'install' command automatically runs 'build'
before doing anything; to make this bearable, I added a 'have_run'
dictionary to the Distribution class to keep track of which commands
have been run. So now not only are command classes singletons,
but their 'run' method can only be invoked once -- both restrictions
enforced by Distribution.
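In case the 'have_run' business is unclear, the idea boils down to something
like this (heavily simplified -- the real code in CVS does more):

    class Distribution:
        def __init__(self, commands):
            self.commands = commands     # maps command name -> command object
            self.have_run = {}           # maps command name -> 1 once run

        def run_command(self, name):
            if self.have_run.get(name):
                return                   # each command runs at most once
            self.commands[name].run()
            self.have_run[name] = 1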
The code is checked into CVS, or you can download a snapshot at
http://www.python.org/sigs/distutils-sig/distutils-19990607.tar.gz
Hope someone (Amos?) can try the new version under NT. Any takers for
Mac OS?
BTW, all parties involved in the Great "Where Do We Install Stuff?"
Debate should take a good, hard look at the 'set_final_options()' method
of the Install class in distutils/install.py; this is where all the
policy decisions about where to install files are made. Currently it
apes the Python 1.5 situation as closely as I could figure it out.
Obviously, this is subject to change -- I just don't know to *what* it
will change!
Greg
--
Greg Ward - software developer gward(a)cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive voice: +1-703-620-8990
Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913
Hi all,
I've been aware that the distutils sig has been simmering away, but
until recently it has not been directly relevant to what I do.
I like the look of the proposed api, but have one question. Will this
support an installed system that has multiple versions of the same
package installed simultaneously? If not, then this would seem to be a
significant limitation, especially when dependencies between packages
are considered.
Assuming it does, then how will this be achieved? I am presently
managing this with a messy arrangement of symlinks. A package is
installed with its version number in its name, and a separate
directory is created for an application with links from the
unversioned package name to the versioned one. Then I just set the
pythonpath to this directory.
A sample of what the directory looks like is shown below.
I'm sure there is a better solution than this, and I'm not sure that
this would work under windows anyway (does windows have symlinks?).
So, has this SIG considered such versioning issues yet?
Cheers,
Tim
--------------------------------------------------------------
Tim Docker timd(a)macquarie.com.au
Quantitative Applications Division
Macquarie Bank
--------------------------------------------------------------
qad16:qad $ ls -l lib/python/
total 110
drwxr-xr-x 2 mts mts 512 Nov 11 11:23 1.1
-r--r----- 1 root mts 45172 Sep 1 1998 cdrmodule_0_7_1.so
drwxr-xr-x 2 mts mts 512 Sep 1 1998 chart_1_1
drwxr-xr-x 3 mts mts 512 Sep 1 1998 Fnorb_0_7_1
dr-xr-x--- 3 mts mts 512 Nov 11 11:21 Fnorb_0_8
drwxr-xr-x 3 mts mts 1536 Mar 3 12:45 mts_1_1
dr-xr-x--- 7 mts mts 512 Nov 11 11:22 OpenGL_1_5_1
dr-xr-x--- 2 mts mts 1024 Nov 11 11:23 PIL_0_3
drwxr-xr-x 3 mts mts 512 Sep 1 1998 Pmw_0_7
dr-xr-x--- 2 mts mts 512 Nov 11 11:21 v3d_1_1
qad16:qad $ ls -l lib/python/1.1
total 30
lrwxrwxrwx 1 root other 29 Apr 10 10:43 _glumodule.so -> ../OpenGL_1_5_1/_glumodule.so
lrwxrwxrwx 1 root other 30 Apr 10 10:43 _glutmodule.so -> ../OpenGL_1_5_1/_glutmodule.so
lrwxrwxrwx 1 root other 22 Apr 10 10:43 _imaging.so -> ../PIL_0_3/_imaging.so
lrwxrwxrwx 1 root other 36 Apr 10 10:43 _opengl_nummodule.so -> ../OpenGL_1_5_1/_opengl_nummodule.so
lrwxrwxrwx 1 root other 27 Apr 10 10:43 _tkinter.so -> ../OpenGL_1_5_1/_tkinter.so
lrwxrwxrwx 1 mts mts 21 Apr 10 10:43 cdrmodule.so -> ../cdrmodule_0_7_1.so
lrwxrwxrwx 1 mts mts 12 Apr 10 10:43 chart -> ../chart_1_1
lrwxrwxrwx 1 root other 12 Apr 10 10:43 Fnorb -> ../Fnorb_0_8
lrwxrwxrwx 1 mts mts 12 Apr 10 10:43 mts -> ../mts_1_1
lrwxrwxrwx 1 root other 15 Apr 10 10:43 OpenGL -> ../OpenGL_1_5_1
lrwxrwxrwx 1 root other 33 Apr 10 10:43 opengltrmodule.so -> ../OpenGL_1_5_1/opengltrmodule.so
lrwxrwxrwx 1 root other 33 Apr 10 10:43 openglutil_num.so -> ../OpenGL_1_5_1/openglutil_num.so
lrwxrwxrwx 1 root other 10 Apr 10 10:43 PIL -> ../PIL_0_3
lrwxrwxrwx 1 mts mts 10 Apr 10 10:43 Pmw -> ../Pmw_0_7
lrwxrwxrwx 1 root other 10 Apr 10 10:43 v3d -> ../v3d_1_1
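(Roughly, that arrangement gets built by something along these lines --
simplified, with only a few of the packages shown:)

    import os

    lib_dir = "lib/python"
    app_dir = os.path.join(lib_dir, "1.1")   # per-application link directory
    if not os.path.isdir(app_dir):
        os.makedirs(app_dir)

    # unversioned name -> versioned install this application should see
    wanted = {
        "Fnorb": "Fnorb_0_8",
        "PIL": "PIL_0_3",
        "OpenGL": "OpenGL_1_5_1",
    }
    for name, versioned in wanted.items():
        os.symlink(os.path.join("..", versioned),
                   os.path.join(app_dir, name))

    # PYTHONPATH is then pointed at lib/python/1.1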
Following up on some IRC discussion with other folks:
There is precedent (Plone) for PyPI trove classifiers corresponding to
particular versions of a framework. So I'd like to get feedback on the idea
of expanding that, particularly in the case of Django.
The rationale here is that the ecosystem of Django-related packages is
quite large, but -- as I know all too well from a project I'm working on
literally at this moment -- it can be difficult to ensure that all of one's
dependencies are compatible with the version of Django one happens to be
using.
Adding trove classifier support at the level of individual versions of
Django would, I think, greatly simplify this: tools could easily analyze
which packages are compatible with an end user's chosen version, there'd be
far less manual guesswork, etc., and the rate of creation of new
classifiers would be relatively low (we tend to have one X.Y release/year
or thereabouts, and that's the level of granularity needed).
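To make it concrete, a package's setup.py could then declare something like
the following (the versioned "Framework :: Django :: X.Y" entries are
hypothetical -- they're exactly the classifiers being proposed here, and the
package name is made up):

    from setuptools import setup

    setup(
        name="django-example-app",
        version="1.0",
        classifiers=[
            "Framework :: Django",          # exists today
            "Framework :: Django :: 1.8",   # proposed per-version classifier
            "Framework :: Django :: 1.9",   # proposed per-version classifier
        ],
    )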
Assuming there's consensus around the idea of doing this, what would be the
correct procedure for getting such classifiers set up and maintained?
I'm not sure where the issue is, but when I specify a namespace_package in
the setup.py file, I can indeed have multiple packages with the same base
(foo.bar, foo.blah, etc...). The files all install into the same
directory. It drops the foo/__init__.py that would be doing the
extend_path, and instead adds a ".pth" file that is a bit over my head.
The problem is that it does not seem to traverse the entire sys.path to
find multiple foo packages.
If I do not specify namespace_packages and instead just use the
pkgutil.extend_path, then this seems to allow the packages to be in
multiple places in the sys.path.
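For reference, here's roughly what the two variants look like (package names
made up):

    # Variant 1: setuptools namespace packages -- this is the one that
    # installs a .pth file and drops foo/__init__.py
    from setuptools import setup, find_packages

    setup(
        name="foo.bar",
        version="0.1",
        packages=find_packages(),
        namespace_packages=["foo"],
    )

    # Variant 2: plain pkgutil style -- each distribution ships a
    # foo/__init__.py containing just:
    from pkgutil import extend_path
    __path__ = extend_path(__path__, __name__)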
Is there something additional for namespace_packages that I need to
specify in order for all of sys.path to be checked?
I'm using setuptools 18.5, but I am not sure if this somehow ties in to
wheel/pip, since I'm using those for the actual install.
Hello everyone,
The talk about the sqlalchemy feature extra got me thinking:
what if I could specify extras that were installed by default, but
users could opt out?
A concrete example I have in mind is the default set of pytest plugins.
I would like to be able to externalize the legacy-support details of pytest
without needing people to suddenly depend on pytest[recommended] instead
of pytest just to have what they know keep functioning as is.
Instead I would like to be able to do something like a dependency on
pytest[-nose,-unittest,-cache] to subtract those items.
The extras declaration in setup.py would just include a + sign at the
beginning.
An elaborate concrete example could be:

extras = {
    '+cache': ['pytest-cache'],
    '+lastfailed': ['pytest[+cache]', 'pytest-lastfailed'],
    '+looponfail': ['pytest[+lastfailed]', 'pytest-looponfail'],
    '+unittest': ['pytest-unittest'],
    '+nose': ['pytest[+unittest]', 'pytest-nose'],
}
Also, a dependency declaration using the + sign in the extras should not
imply the default extras of the package, while usage of the - sign should.
So depending on pytest[+unittest] would imply only the unittest support,
but depending on pytest[-nose] would include all positive extras except
for nose.
Please note in particular the dependencies on other positive extras;
those are put in so that a negative for unittest can imply that nose
can't sensibly be used either.
If +nose instead depended directly on pytest-unittest, then excluding
it would require an unreasonably tricky resolving algorithm with
potential for lots of mistakes.
Instead, spelling out the direct dependency on positives/negatives can
be resolved inside of a package and still leave room for more outside of it.
This is particularly relevant if a package with extras is depended on
twice by different packages, because in that case each dependency's
requirements should add up to the combined set.
There is also room for fleshing out algorithms for combining the
positive/negative dependency sets, but I'll leave that for later.
As an addition to that, later on there could be support for partial
wheels, so features could be materialized as wheel packages with a
special name, and build tools could fall back to making them from an sdist.
As an example, there would be a sqlalchemy package as a source wheel and
a sqlalchemy*cext package as Windows wheels, and pip would have to find
a source distribution to compile the wheel package.
-- Ronny
Hi all,
I think this is ready for pronouncement now -- thanks to everyone for
all their feedback over the last few weeks!
The only change relative to the last posting is that we rewrote the
section on "Platform detection for installers", to switch to letting
distributors explicitly control manylinux1 compatibility by means of a
_manylinux module.
-n
---
PEP: 513
Title: A Platform Tag for Portable Linux Built Distributions
Version: $Revision$
Last-Modified: $Date$
Author: Robert T. McGibbon <rmcgibbo(a)gmail.com>, Nathaniel J. Smith
<njs(a)pobox.com>
BDFL-Delegate: Nick Coghlan <ncoghlan(a)gmail.com>
Discussions-To: Distutils SIG <distutils-sig(a)python.org>
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 19-Jan-2016
Post-History: 19-Jan-2016, 25-Jan-2016, 29-Jan-2016
Abstract
========
This PEP proposes the creation of a new platform tag for Python package built
distributions, such as wheels, called ``manylinux1_{x86_64,i386}`` with
external dependencies limited to a standardized, restricted subset of
the Linux kernel and core userspace ABI. It proposes that PyPI support
uploading and distributing wheels with this platform tag, and that ``pip``
support downloading and installing these packages on compatible platforms.
Rationale
=========
Currently, distribution of binary Python extensions for Windows and OS X is
straightforward. Developers and packagers build wheels [1]_ [2]_, which are
assigned platform tags such as ``win32`` or ``macosx_10_6_intel``, and upload
these wheels to PyPI. Users can download and install these wheels using tools
such as ``pip``.
For Linux, the situation is much more delicate. In general, compiled Python
extension modules built on one Linux distribution will not work on other Linux
distributions, or even on different machines running the same Linux
distribution with different system libraries installed.
Build tools using PEP 425 platform tags [3]_ do not track information about the
particular Linux distribution or installed system libraries, and instead assign
all wheels the too-vague ``linux_i386`` or ``linux_x86_64`` tags. Because of
this ambiguity, there is no expectation that ``linux``-tagged built
distributions compiled on one machine will work properly on another, and for
this reason, PyPI has not permitted the uploading of wheels for Linux.
It would be ideal if wheel packages could be compiled that would work on *any*
linux system. But, because of the incredible diversity of Linux systems -- from
PCs to Android to embedded systems with custom libcs -- this cannot
be guaranteed in general.
Instead, we define a standard subset of the kernel+core userspace ABI that,
in practice, is compatible enough that packages conforming to this standard
will work on *many* linux systems, including essentially all of the desktop
and server distributions in common use. We know this because there are
companies who have been distributing such widely-portable pre-compiled Python
extension modules for Linux -- e.g. Enthought with Canopy [4]_ and Continuum
Analytics with Anaconda [5]_.
Building on the compatibility lessons learned from these companies, we thus
define a baseline ``manylinux1`` platform tag for use by binary Python
wheels, and introduce the implementation of preliminary tools to aid in the
construction of these ``manylinux1`` wheels.
Key Causes of Inter-Linux Binary Incompatibility
================================================
To properly define a standard that will guarantee that wheel packages meeting
this specification will operate on *many* linux platforms, it is necessary to
understand the root causes which often prevent portability of pre-compiled
binaries on Linux. The two key causes are dependencies on shared libraries
which are not present on users' systems, and dependencies on particular
versions of certain core libraries like ``glibc``.
External Shared Libraries
-------------------------
Most desktop and server linux distributions come with a system package manager
(examples include ``APT`` on Debian-based systems, ``yum`` on
``RPM``-based systems, and ``pacman`` on Arch linux) that manages, among other
responsibilities, the installation of shared libraries installed to system
directories such as ``/usr/lib``. Most non-trivial Python extensions will depend
on one or more of these shared libraries, and thus function properly only on
systems where the user has the proper libraries (and the proper
versions thereof), either installed using their package manager, or installed
manually by setting certain environment variables such as ``LD_LIBRARY_PATH``
to notify the runtime linker of the location of the depended-upon shared
libraries.
Versioning of Core Shared Libraries
-----------------------------------
Even if the developers of a Python extension module wish to use no
external shared libraries, the modules will generally have a dynamic runtime
dependency on the GNU C library, ``glibc``. While statically linking
``glibc`` is possible, it is usually a bad idea because certain important C functions
like ``dlopen()`` cannot be called from code that statically links ``glibc``. A
runtime shared library dependency on a system-provided ``glibc`` is unavoidable
in practice.
The maintainers of the GNU C library follow a strict symbol versioning scheme
for backward compatibility. This ensures that binaries compiled against an older
version of ``glibc`` can run on systems that have a newer ``glibc``. The
opposite is generally not true -- binaries compiled on newer Linux
distributions tend to rely upon versioned functions in ``glibc`` that are not
available on older systems.
This generally prevents wheels compiled on the latest Linux distributions
from being portable.
The ``manylinux1`` policy
=========================
For these reasons, to achieve broad portability, Python wheels
* should depend only on an extremely limited set of external shared
libraries; and
* should depend only on "old" symbol versions in those external shared
libraries; and
* should depend only on a widely-compatible kernel ABI.
To be eligible for the ``manylinux1`` platform tag, a Python wheel must
therefore both (a) contain binary executables and compiled code that links
*only* to libraries (other than the appropriate ``libpython`` library, which is
always a permitted dependency consistent with the PEP 425 ABI tag) with SONAMEs
included in the following list: ::
libpanelw.so.5
libncursesw.so.5
libgcc_s.so.1
libstdc++.so.6
libm.so.6
libdl.so.2
librt.so.1
libcrypt.so.1
libc.so.6
libnsl.so.1
libutil.so.1
libpthread.so.0
libX11.so.6
libXext.so.6
libXrender.so.1
libICE.so.6
libSM.so.6
libGL.so.1
libgobject-2.0.so.0
libgthread-2.0.so.0
libglib-2.0.so.0
and (b), work on a stock CentOS 5.11 [6]_ system that contains the system
package manager's provided versions of these libraries.
Because CentOS 5 is only available for x86_64 and i386 architectures,
these are the only architectures currently supported by the ``manylinux1``
policy.
On Debian-based systems, these libraries are provided by the packages ::
libncurses5 libgcc1 libstdc++6 libc6 libx11-6 libxext6
libxrender1 libice6 libsm6 libgl1-mesa-glx libglib2.0-0
On RPM-based systems, these libraries are provided by the packages ::
ncurses libgcc libstdc++ glibc libXext libXrender
libICE libSM mesa-libGL glib2
This list was compiled by checking the external shared library dependencies of
the Canopy [4]_ and Anaconda [5]_ distributions, which both include a wide array
of the most popular Python modules and have been confirmed in practice to work
across a wide swath of Linux systems in the wild.
Many of the permitted system libraries listed above use symbol versioning
schemes for backward compatibility. The latest symbol versions provided with
the CentOS 5.11 versions of these libraries are: ::
GLIBC_2.5
CXXABI_3.4.8
GLIBCXX_3.4.9
GCC_4.2.0
Therefore, as a consequence of requirement (b), any wheel that depends on
versioned symbols from the above shared libraries may depend only on symbols
with the following versions: ::
GLIBC <= 2.5
CXXABI <= 3.4.8
GLIBCXX <= 3.4.9
GCC <= 4.2.0
These recommendations are the outcome of the relevant discussions in January
2016 [7]_, [8]_.
Note that in our recommendations below, we do not suggest that ``pip``
or PyPI should attempt to check for and enforce the details of this
policy (just as they don't check for and enforce the details of
existing platform tags like ``win32``). The text above is provided (a)
as advice to package builders, and (b) as a method for allocating
blame if a given wheel doesn't work on some system: if it satisfies
the policy above, then this is a bug in the spec or the installation
tool; if it does not satisfy the policy above, then it's a bug in the
wheel. One useful consequence of this approach is that it leaves open
the possibility of further updates and tweaks as we gain more
experience, e.g., we could have a "manylinux 1.1" policy which targets
the same systems and uses the same ``manylinux1`` platform tag (and
thus requires no further changes to ``pip`` or PyPI), but that adjusts
the list above to remove libraries that have turned out to be
problematic or add libraries that have turned out to be safe.
Compilation of Compliant Wheels
===============================
The way glibc, libgcc, and libstdc++ manage their symbol versioning
means that in practice, the compiler toolchains that most developers
use to do their daily work are incapable of building
``manylinux1``-compliant wheels. Therefore we do not attempt to change
the default behavior of ``pip wheel`` / ``bdist_wheel``: they will
continue to generate regular ``linux_*`` platform tags, and developers
who wish to use them to generate ``manylinux1``-tagged wheels will
have to change the tag as a second post-processing step.
To support the compilation of wheels meeting the ``manylinux1`` standard, we
provide initial drafts of two tools.
Docker Image
------------
The first tool is a Docker image based on CentOS 5.11, which is recommended as
an easy to use self-contained build box for compiling ``manylinux1`` wheels
[9]_. Compiling on a more recently-released linux distribution will generally
introduce dependencies on too-new versioned symbols. The image comes with a
full compiler suite installed (``gcc``, ``g++``, and ``gfortran`` 4.8.2) as
well as the latest releases of Python and ``pip``.
Auditwheel
----------
The second tool is a command line executable called ``auditwheel`` [10]_ that
may aid package maintainers in dealing with third-party external
dependencies.
There are at least three methods for building wheels that use third-party
external libraries in a way that meets the above policy.
1. The third-party libraries can be statically linked.
2. The third-party shared libraries can be distributed in
separate packages on PyPI which are depended upon by the wheel.
3. The third-party shared libraries can be bundled inside the wheel
itself, linked with a relative path.
All of these are valid options which may be used effectively by different
packages and communities. Statically linking generally requires
package-specific modifications to the build system, and distributing
third-party dependencies on PyPI may require some coordination of the
community of users of the package.
As an often-automatic alternative to these options, we introduce ``auditwheel``.
The tool inspects all of the ELF files inside a wheel to check for
dependencies on versioned symbols or external shared libraries, and verifies
conformance with the ``manylinux1`` policy. This includes the ability to add
the new platform tag to conforming wheels. More importantly, ``auditwheel`` has
the ability to automatically modify wheels that depend on external shared
libraries by copying those shared libraries from the system into the wheel
itself, and modifying the appropriate ``RPATH`` entries such that these
libraries will be picked up at runtime. This accomplishes a similar result as
if the libraries had been statically linked without requiring changes to the
build system. Packagers are advised that bundling, like static linking, may
implicate copyright concerns.
Bundled Wheels on Linux
=======================
While we acknowledge many approaches for dealing with third-party library
dependencies within ``manylinux1`` wheels, we recognize that the ``manylinux1``
policy encourages bundling external dependencies, a practice
which runs counter to the package management policies of many linux
distributions' system package managers [11]_, [12]_. The primary purpose of
this is cross-distro compatibility. Furthermore, ``manylinux1`` wheels on PyPI
occupy a different niche than the Python packages available through the
system package manager.
The decision in this PEP to encourage departure from general Linux distribution
unbundling policies is informed by the following concerns:
1. In these days of automated continuous integration and deployment
pipelines, publishing new versions and updating dependencies is easier
than it was when those policies were defined.
2. ``pip`` users remain free to use the ``"--no-binary"`` option if they want
to force local builds rather than using pre-built wheel files.
3. The popularity of modern container-based deployment and "immutable
infrastructure" models means that substantial bundling happens at the
application layer anyway.
4. Distribution of bundled wheels through PyPI is currently the norm for
Windows and OS X.
5. This PEP doesn't rule out the idea of offering more targeted binaries for
particular Linux distributions in the future.
The model described in this PEP is most ideally suited for cross-platform
Python packages, because it means they can reuse much of the
work that they're already doing to make static Windows and OS X wheels. We
recognize that it is less optimal for Linux-specific packages that might
prefer to interact more closely with Linux's unique package management
functionality and only care about targeting a small set of particular distros.
Security Implications
---------------------
One of the advantages of dependencies on centralized libraries in Linux is
that bugfixes and security updates can be deployed system-wide, and
applications which depend on these libraries will automatically feel the
effects of these patches when the underlying libraries are updated. This can
be particularly important for security updates in packages engaged in
communication across the network or cryptography.
``manylinux1`` wheels distributed through PyPI that bundle security-critical
libraries like OpenSSL will thus assume responsibility for prompt updates in
response to disclosed vulnerabilities and patches. This closely parallels the
security implications of the distribution of binary wheels on Windows that,
because the platform lacks a system package manager, generally bundle their
dependencies. In particular, because it lacks a stable ABI, OpenSSL cannot be
included in the ``manylinux1`` profile.
Platform Detection for Installers
=================================
Above, we defined what it means for a *wheel* to be
``manylinux1``-compatible. Here we discuss what it means for a *Python
installation* to be ``manylinux1``-compatible. In particular, this is
important for tools like ``pip`` to know when deciding whether or not
they should consider ``manylinux1``-tagged wheels for installation.
Because the ``manylinux1`` profile is already known to work for the
many thousands of users of popular commercial Python distributions, we
suggest that installation tools should error on the side of assuming
that a system *is* compatible, unless there is specific reason to
think otherwise.
We know of three main sources of potential incompatibility that are likely to
arise in practice:
* Eventually, in the future, there may exist distributions that break
compatibility with this profile (e.g., if one of the libraries in
the profile changes its ABI in a backwards-incompatible way)
* A linux distribution that is too old (e.g. RHEL 4)
* A linux distribution that does not use ``glibc`` (e.g. Alpine Linux, which is
based on musl ``libc``, or Android)
Therefore, we propose a two-pronged approach. To catch the first
case, we standardize a mechanism for a Python distributor to signal
that a particular Python install definitely is or is not compatible
with ``manylinux1``: this is done by installing a module named
``_manylinux``, and setting its ``manylinux1_compatible``
attribute. We do not propose adding any such module to the standard
library -- this is merely a well-known name by which distributors and
installation tools can rendezvous. However, if a distributor does add
this module, *they should add it to the standard library* rather than
to a ``site-packages/`` directory, because the standard library is
inherited by virtualenvs (which we want), and ``site-packages/`` in
general is not.
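For example, a distributor whose Python build is known to satisfy this
profile could ship a trivial ``_manylinux`` module along these lines (an
illustration of the mechanism, not normative text)::

    # _manylinux.py, installed alongside the standard library
    manylinux1_compatible = True

and a distributor that knows its build is *not* compatible would instead set
``manylinux1_compatible = False``.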
Then, to handle the latter two cases for existing Python
distributions, we suggest a simple and reliable method to check for
the presence and version of ``glibc`` (basically using it as a "clock"
for the overall age of the distribution).
Specifically, the algorithm we propose is::
    def is_manylinux1_compatible():
        # Only Linux, and only x86-64 / i386
        from distutils.util import get_platform
        if get_platform() not in ["linux_x86_64", "linux_i386"]:
            return False

        # Check for presence of _manylinux module
        try:
            import _manylinux
            return bool(_manylinux.manylinux1_compatible)
        except (ImportError, AttributeError):
            # Fall through to heuristic check below
            pass

        # Check glibc version. CentOS 5 uses glibc 2.5.
        return have_compatible_glibc(2, 5)

    def have_compatible_glibc(major, minimum_minor):
        import ctypes

        process_namespace = ctypes.CDLL(None)
        try:
            gnu_get_libc_version = process_namespace.gnu_get_libc_version
        except AttributeError:
            # Symbol doesn't exist -> therefore, we are not linked to
            # glibc.
            return False

        # Call gnu_get_libc_version, which returns a string like "2.5".
        gnu_get_libc_version.restype = ctypes.c_char_p
        version_str = gnu_get_libc_version()
        # py2 / py3 compatibility:
        if not isinstance(version_str, str):
            version_str = version_str.decode("ascii")

        # Parse string and check against requested version.
        version = [int(piece) for piece in version_str.split(".")]
        assert len(version) == 2
        if major != version[0]:
            return False
        if minimum_minor > version[1]:
            return False
        return True
**Rejected alternatives:** We also considered using a configuration
file, e.g. ``/etc/python/compatibility.cfg``. The problem with this is
that a single filesystem might contain many different interpreter
environments, each with their own ABI profile -- the ``manylinux1``
compatibility of a system-installed x86_64 CPython might not tell us
much about the ``manylinux1`` compatibility of a user-installed i386
PyPy. Locating this configuration information within the Python
environment itself ensures that it remains attached to the correct
binary, and dramatically simplifies lookup code.
We also considered using a more elaborate structure, like a list of
all platform tags that should be considered compatible, together with
their preference ordering, for example: ``_binary_compat.compatible =
["manylinux1_x86_64", "centos5_x86_64", "linux_x86_64"]``. However,
this introduces several complications. For example, we want to be able
to distinguish between the state of "doesn't support ``manylinux1``"
(or eventually ``manylinux2``, etc.) versus "doesn't specify either
way whether it supports ``manylinux1``", which is not entirely obvious
in the above representation; and, it's not at all clear what features
are really needed vis a vis preference ordering given that right now
the only possible platform tags are ``manylinux1`` and ``linux``. So
we're deferring a more complete solution here for a separate PEP, when
/ if Linux gets more platform tags.
For the library compatibility check, we also considered much more
elaborate checks (e.g. checking the kernel version, searching for and
checking the versions of all the individual libraries listed in the
``manylinux1`` profile, etc.), but ultimately decided that this would
be more likely to introduce confusing bugs than actually help the
user. (For example: different distributions vary in where they
actually put these libraries, and if our checking code failed to use
the correct path search then it could easily return incorrect
answers.)
PyPI Support
============
PyPI should permit wheels containing the ``manylinux1`` platform tag to be
uploaded. PyPI should not attempt to formally verify that wheels containing
the ``manylinux1`` platform tag adhere to the ``manylinux1`` policy described
in this document. These verification tasks should be left to other tools, like
``auditwheel``, that are developed separately.
Rejected Alternatives
=====================
One alternative would be to provide separate platform tags for each Linux
distribution (and each version thereof), e.g. ``RHEL6``, ``ubuntu14_10``,
``debian_jessie``, etc. Nothing in this proposal rules out the possibility of
adding such platform tags in the future, or of further extensions to wheel
metadata that would allow wheels to declare dependencies on external
system-installed packages. However, such extensions would require substantially
more work than this proposal, and still might not be appreciated by package
developers who would prefer not to have to maintain multiple build environments
and build multiple wheels in order to cover all the common Linux distributions.
Therefore we consider such proposals to be out-of-scope for this PEP.
Future updates
==============
We anticipate that at some point in the future there will be a
``manylinux2`` specifying a more modern baseline environment (perhaps
based on CentOS 6), and someday a ``manylinux3`` and so forth, but we
defer specifying these until we have more experience with the initial
``manylinux1`` proposal.
References
==========
.. [1] PEP 0427 -- The Wheel Binary Package Format 1.0
(https://www.python.org/dev/peps/pep-0427/)
.. [2] PEP 0491 -- The Wheel Binary Package Format 1.9
(https://www.python.org/dev/peps/pep-0491/)
.. [3] PEP 425 -- Compatibility Tags for Built Distributions
(https://www.python.org/dev/peps/pep-0425/)
.. [4] Enthought Canopy Python Distribution
(https://store.enthought.com/downloads/)
.. [5] Continuum Analytics Anaconda Python Distribution
(https://www.continuum.io/downloads)
.. [6] CentOS 5.11 Release Notes
(https://wiki.centos.org/Manuals/ReleaseNotes/CentOS5.11)
.. [7] manylinux-discuss mailing list discussion
(https://groups.google.com/forum/#!topic/manylinux-discuss/-4l3rrjfr9U)
.. [8] distutils-sig discussion
(https://mail.python.org/pipermail/distutils-sig/2016-January/027997.html)
.. [9] manylinux1 docker image
(https://quay.io/repository/manylinux/manylinux)
.. [10] auditwheel tool
(https://pypi.python.org/pypi/auditwheel)
.. [11] Fedora Bundled Software Policy
(https://fedoraproject.org/wiki/Bundled_Software_policy)
.. [12] Debian Policy Manual -- 4.13: Convenience copies of code
(https://www.debian.org/doc/debian-policy/ch-source.html#s-embeddedfiles)
Copyright
=========
This document has been placed into the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
--
Nathaniel J. Smith -- https://vorpus.org