Handling the binary dependency management problem
I've had a couple of conversations recently that suggest it's high time I posted this particular idea publicly, so here's my suggestion for dealing with (or, more accurately, avoiding) the binary dependency management problem for upstream distribution in the near term.

= Defining the problem =

The core packaging standards and tools need to deal with a couple of different use cases. We want them to be usable for beginners and folks who aren't professional developers running Python locally, but we also want them to be suitable for professionally administered server environments, and to integrate reasonably well with downstream redistributors (most notably Linux distributions, but also other redistributors like ActiveState, Enthought and Continuum Analytics).

For pure Python modules, these two objectives are easy to reconcile. For simple, self-contained C extension modules, it gets harder, which is why PyPI only allows wheel files for Mac OS X and Windows, and expects them to be compatible with the binary installers provided on python.org. Resolving this to allow platform specific wheel files on PyPI for Linux distributions, Cygwin, MacPorts, homebrew, etc. is going to require some work on better defining the "platform" tag in PEP 425.

For arbitrary binary dependencies, however, I contend that reconciling the two different use cases is simply infeasible, as pip and venv have to abide by the following two restrictions:

1. Allow low impact security updates to components low in the stack (e.g. deploying a CPython security update shouldn't require rebuilding any other components)
2. Allow interoperability with a variety of packaging systems, including supporting access to Python modules provided solely through the system package manager (e.g. yum/rpm related modules on Fedora and derived distros)

At this level, it's often impractical to offer prebuilt binary components with external dependencies, as you *also* need a mechanism to deliver the external binary dependencies themselves, one that platform integrators can override if they want.

= Proposed approach =

Since supporting both cross-platform external binaries *and* allowing the substitution of platform specific binaries in the same tool is difficult, I propose that we don't even try (at least not until we're much further down the packaging improvement road, and have solved all the *other* issues facing the packaging infrastructure).

Instead, I suggest that the Python Packaging User Guide *explicitly* recommend a two level solution:

1. For simple cases without complex binary dependencies, and for cases where platform integration and/or low impact security updates are needed, use the core pip/virtualenv toolchain. This may sometimes mean needing to build components with external dependencies from source, and that's OK (it's the price we pay at this level for supporting platform integration).
2. For cross-platform handling of external binary dependencies, we recommend bootstrapping the open source conda toolchain, and using that to install pre-built binaries (currently administered by the Continuum Analytics folks). Specifically, commands like the following should work on POSIX systems without needing any local build machinery, and without needing all the projects in the chain to publish wheels: "pip install conda && conda init && conda install ipython"

For many end users just running things locally (especially beginners and non-developers), using conda will be the quickest and easiest way to get up and running. For professional developers and administrators, option 1 will provide the finer control and platform interoperability that they often need.
conda already has good coverage of the scientific stack, but may need additional contributions to cover other notoriously hard to build components (such as crypto libraries and game development libraries).

= What would this mean for the wheel format? =

conda has its own binary distribution format, using hash based dependencies. It's this mechanism which allows it to provide reliable cross platform binary dependency management, but it's also the same mechanism that prevents low impact security updates and interoperability with platform provided packages.

So wheels would remain the core binary format, and we'd continue to recommend publishing them whenever feasible. However, we'd postpone indefinitely the question of attempting to deal with arbitrary external dependency support.

= Blocking issues =

Even if we agree this is a good way forward in the near term, there are a couple of technical issues that would need to be resolved before making it official in the user guide.

There is currently at least one key blocking issue on the conda side, where it breaks if run inside a symlinked virtual environment (which both virtualenv and pyvenv create by default in POSIX environments): https://github.com/ContinuumIO/conda/issues/360

I believe conda also currently conflicts with virtualenvwrapper regarding the meaning of activate/deactivate at the command line.

There aren't any blockers I am aware of on the pip/virtualenv side, but the fact that --always-copy is currently broken in virtualenv prevents a possible workaround for the conda issue above: https://github.com/pypa/virtualenv/issues/495

(pyvenv doesn't offer an --always-copy option, just the option to use symlinks on Windows, where they're not used by default due to the associated permissions restrictions, so the primary resolution still needs to be for conda to correctly handle that situation and convert the venv Python to a copy rather than failing)

Regards,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
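[Editor's illustration] The hash based dependency mechanism described above can be sketched with a toy example. This is not conda's actual on-disk format, just the general idea of pinning dependents to an exact build of a dependency:

```python
import hashlib

# Toy sketch of hash based dependency pinning (illustrative only).
# A dependent package records the exact hash of the build of its
# dependency that it was compiled and tested against.
def artifact_hash(build: bytes) -> str:
    return hashlib.sha256(build).hexdigest()[:12]

openssl_build_1 = b"libssl 1.0.1 build 1"
pin = artifact_hash(openssl_build_1)  # recorded by everything linked against it

# A security update produces a new build, and therefore a new hash, so
# the pin no longer matches: every dependent must be rebuilt too.
openssl_build_2 = b"libssl 1.0.1 build 2 (security fix)"
print(artifact_hash(openssl_build_2) == pin)  # False
```

This is why the same mechanism that gives reliable cross-platform binaries (you always get the precise build you were tested against) also rules out the low impact security updates described in restriction 1.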
On 1 December 2013 04:15, Nick Coghlan <ncoghlan@gmail.com> wrote:
2. For cross-platform handling of external binary dependencies, we recommend bootstrapping the open source conda toolchain, and using that to install pre-built binaries (currently administered by the Continuum Analytics folks). Specifically, commands like the following should work on POSIX systems without needing any local build machinery, and without needing all the projects in the chain to publish wheels: "pip install conda && conda init && conda install ipython"
Hmm, this is a somewhat surprising change of direction.

You mention POSIX here - but do you intend this to be the standard approach on Windows too? Just as a test, I tried the above, on Python 3.3 on Windows 64-bit. This is python.org python, installed in a virtualenv. I'm just going off what you said above - if there are more explicit docs, I can try using them (but I *don't* want to follow the "official" Anaconda docs, as they talk about using Anaconda python, and about using conda to manage environments, rather than virtualenv).

pip install conda worked OK, but it installed a pure-Python version of PyYAML (presumably because the C accelerator needs libyaml, so it can't be built without a bit of extra work - that's a shame, but see below). conda init did something - no idea what, but it seemed to be fine. conda install ipython then worked, and it seems to have installed a binary version of pyyaml.

Then, however, conda install numpy fails:

>conda install numpy
failed to create process.

It looks like the binary yaml module is broken. Doing "import yaml" in a python session gives a runtime error: "An application has made an attempt to load the C runtime library incorrectly".

I can report this as a bug to conda, I guess (I won't, because I don't know where to report conda bugs, and I don't expect to have time to find out or help diagnose the issues when the developers investigate - it was something I tried purely for curiosity). But I wouldn't be happy to see this as the recommended approach until it's more robust than this.

Paul
On Dec 1, 2013 1:10 PM, "Paul Moore" <p.f.moore@gmail.com> wrote:
On 1 December 2013 04:15, Nick Coghlan <ncoghlan@gmail.com> wrote:
2. For cross-platform handling of external binary dependencies, we recommend bootstrapping the open source conda toolchain, and using that to install pre-built binaries (currently administered by the Continuum Analytics folks). Specifically, commands like the following should work on POSIX systems without needing any local build machinery, and without needing all the projects in the chain to publish wheels: "pip install conda && conda init && conda install ipython"
Hmm, this is a somewhat surprising change of direction.
Indeed it is. Can you clarify a little more how you've come to this conclusion, Nick, and perhaps explain what conda is? I looked at conda some time ago and it seemed to be aimed at HPC (high performance computing) clusters, which is a niche use case where you have large networks of computation nodes containing identical hardware (unless I'm conflating it with something else).

Oscar
On 2 Dec 2013 01:04, "Vinay Sajip" <vinay_sajip@yahoo.co.uk> wrote:
On Sun, 1/12/13, Nick Coghlan <ncoghlan@gmail.com> wrote:
(pyvenv doesn't offer an --always-copy option, just the option to use symlinks on
It does - you should be able to run pyvenv with --copies to force copying, even on POSIX.
Hmm, I didn't see that in --help. I'll take another look.

Cheers,
Nick.
Regards,
Vinay Sajip
For arbitrary binary dependencies, however, I contend that reconciling the two different use cases is simply infeasible, as pip and venv have to abide by the following two restrictions:
To be clear, what's a good example of a common non-science PyPI package that has an "arbitrary binary dependency"? psycopg2?
For many end users just running things locally (especially beginners and non-developers), using conda will be the quickest and easiest way to get up and running.
Conda/Anaconda is an alien world right now to most non-science people (including me). Working in an alien world is never the "quickest" or "easiest" way at first, but I'm curious to try. Some PyPA people actually need to try using it for real, and get comfortable with it.
sometimes mean needing to build components with external dependencies from source
you mean build once (or maybe after system updates for wheels with external binary deps), and cache as a local wheel, right?
On 1 December 2013 19:21, Marcus Smith <qwcode@gmail.com> wrote:
sometimes mean needing to build components with external dependencies from source
you mean build once (or maybe after system updates for wheels with external binary deps), and cache as a local wheel, right?
Note that it is possible to convert binary packages from other formats to wheel. We already support egg and bdist_wininst. If conda has built packages that are worth supporting, I doubt that a conda-to-wheel converter would be hard to write. I'd be willing to write one if someone can point me at a spec for the conda package format (I couldn't find one from a brief look through the docs). Conversely, if I have a wininst installer, a wheel or an egg, is there a converter to conda format? I can't see having to use conda for some things and pip for others as being a move towards lessening user confusion.

I see conda (and enthought) as more like distributions - they sit in the same space as rpms and debs. I don't see them as alternatives to pip. Am I wrong? After all, both conda and enthought supply Python interpreter builds as well as package repositories. And both feel like unified ecosystems (you manage everything Python-related with conda) rather than tools like pip (that works with existing Python standards and interoperates with easy_install, distutils, etc).

If the issue is simply around defining compatibility tags that better describe the various environments around, then let's just get on with that - we're going to have to do it in the end anyway, why temporarily promote an alternative solution just to change our recommendation later?

Paul
On Sun, 1/12/13, Paul Moore <p.f.moore@gmail.com> wrote:
If the issue is simply around defining compatibility tags that better describe the various environments around, then let's just get on with that - we're going to have to do it in the end anyway, why temporarily promote an alternative solution just to change our recommendation later?
This makes sense to me. We should refine the compatibility tags as much as is required. It would be nice if there were some place (on PyPI, or elsewhere) where users could request binary distributions for specific packages for particular environments, and then some kind people with those environments might be able to build those wheels and upload them ... a bit like Christoph Gohlke does for Windows.

Regards,
Vinay Sajip
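[Editor's illustration] For concreteness, the compatibility tags under discussion are embedded directly in the wheel filename defined by PEP 425/427; a minimal parsing sketch (the filenames are just examples):

```python
# A wheel filename has the form:
#   {name}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl
# The last three dash-separated components are the PEP 425 compatibility tags.
def wheel_tags(filename: str):
    stem = filename[: -len(".whl")]
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag

print(wheel_tags("lxml-3.2.4-cp33-cp33m-win_amd64.whl"))
# -> ('cp33', 'cp33m', 'win_amd64')
print(wheel_tags("six-1.4.1-py2.py3-none-any.whl"))
# -> ('py2.py3', 'none', 'any')
```

"Refining the tags" means making that third component expressive enough to distinguish the environments being discussed here, rather than the current coarse values like win_amd64 or linux_x86_64.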
On 12/01/2013 05:07 PM, Vinay Sajip wrote:
On Sun, 1/12/13, Paul Moore <p.f.moore@gmail.com> wrote:
If the issue is simply around defining compatibility tags that better describe the various environments around, then let's just get on with that - we're going to have to do it in the end anyway, why temporarily promote an alternative solution just to change our recommendation later?
This makes sense to me. We should refine the compatibility tags as much as is required. It would be nice if there was some place (on PyPI, or elsewhere) where users could request binary distributions for specific packages for particular environments, and then some kind people with those environments might be able to build those wheels and upload them ... a bit like Christoph Gohlke does for Windows.
The issue is combinatorial explosion in the compatibility tag space. There is basically zero chance that even Linux users (even RedHat users across RHEL versions) would benefit from pre-built binary wheels (as opposed to packages from their distribution). Wheels on POSIX allow caching of the build process for deployment across a known set of hosts: they won't insulate you from the need to build in the first place.

Wheels *might* be in play in the for-pay market, where a vendor supports a limited set of platforms, but those solutions will use separate indexes anyway.

Tres.

--
===================================================================
Tres Seaver          +1 540-429-0999          tseaver@palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
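[Editor's illustration] A back-of-the-envelope count shows how quickly that tag space grows once the platform tag has to distinguish Linux variants (the axis values below are illustrative, not an exhaustive list):

```python
# Illustrative only: each axis multiplies the number of wheels a single
# project would need to publish for one release.
python_versions = ["cp26", "cp27", "cp32", "cp33"]
architectures = ["i686", "x86_64"]
linux_variants = ["rhel_6", "fedora_19", "fedora_20",
                  "debian_7", "ubuntu_12_04", "ubuntu_13_10"]

wheels_needed = len(python_versions) * len(architectures) * len(linux_variants)
print(wheels_needed)  # 48 binary wheels, before counting ABI flags or libc versions
```

And that count still assumes every external shared library on each of those platforms is ABI-identical, which is exactly the assumption that fails for arbitrary binary dependencies.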
On Mon, 2/12/13, Tres Seaver <tseaver@palladion.com> wrote:
The issue is combinatorial explosion in the compatibility tag space. There is basically zero chance that even Linux users (even RedHat users across RHEL version) would benefit from pre-built binary wheels (as opposed to packages from their distribution). Wheels on POSIX allow caching of the build process for deployment across a known set of hosts: they won't insulate you from the need to build in the first place.
The combinations are number of Python X.Y versions x the no. of platform architectures/ABI variants, or do you mean something more than this? The wheel format is supposed to be a cross-platform binary package format; are you saying it is completely useless for POSIX except as a cache for identical hosts? What about for the cases like simple C extensions which have no external dependencies, but are only for speedups? What about POSIX environments where compilers aren't available (e.g. restricted/embedded environments, or due to security policies)? Regards, Vinay Sajip
On 12/02/2013 12:23 PM, Vinay Sajip wrote:
On Mon, 2/12/13, Tres Seaver <tseaver@palladion.com> wrote:
The issue is combinatorial explosion in the compatibility tag space. There is basically zero chance that even Linux users (even RedHat users across RHEL version) would benefit from pre-built binary wheels (as opposed to packages from their distribution). Wheels on POSIX allow caching of the build process for deployment across a known set of hosts: they won't insulate you from the need to build in the first place.
The combinations are number of Python X.Y versions x the no. of platform architectures/ABI variants, or do you mean something more than this?
Trying to mark up wheels so that they can be safely shared with unknown POSIXy systems seems like a halting problem to me: the chance I can build a wheel on my machine that you can use on yours (the only reason to distribute a wheel, rather than the sdist, in the first place) drops off sharply as the wheel's "binariness" comes into play. I'm arguing that wheel is not an interesting *distribution* format for POSIX systems (at least, for non-Mac ones). It could still play out in *deployment* scenarios (as you note below).

Note that wheel's main deployment advantage over a binary egg (installable by pip) is exactly reversed if you use 'easy_install' or 'zc.buildout'. Otherwise, in a controlled deployment, they are pretty much equivalent.
The wheel format is supposed to be a cross-platform binary package format; are you saying it is completely useless for POSIX except as a cache for identical hosts? What about for the cases like simple C extensions which have no external dependencies, but are only for speedups?
I have a lot of packages on PyPI which have such optimization-only speedups. The time difference to build such extensions is trivial (e.g., for zope.interface, ~1 second on my old slow laptop, versus 0.4 seconds without the extension). Even for lxml (Daniel's original motivating case), the difference is ~45 seconds to build from source vs. 1 second to install a wheel (or an egg). The instant I have to think about whether the binary form might be subtly incompatible, that 1 second *loses* to the 45 seconds I spend over here arguing with you guys while it builds again from source. :)
What about POSIX environments where compilers aren't available (e.g. restricted/embedded environments, or due to security policies)?
Such environments are almost certainly driven by development teams who can build wheels specifically for deployment to them (assuming the policies allow anything other than distro-package-managed software). This is still really a "cache the build" optimization for known platforms (w/ all binary dependencies the same), rather than distribution.

Tres.

--
===================================================================
Tres Seaver          +1 540-429-0999          tseaver@palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
On 2 Dec 2013 06:48, "Paul Moore" <p.f.moore@gmail.com> wrote:
On 1 December 2013 19:21, Marcus Smith <qwcode@gmail.com> wrote:
sometimes mean needing to build components with external dependencies from source
you mean build once (or maybe after system updates for wheels with external binary deps), and cache as a local wheel, right?
Right. The advantage of conda is that the use of hash based dependencies lets them distribute arbitrary binary dependencies, whether they're Python libraries or not.
Note that it is possible to convert binary packages from other formats to wheel. We already support egg and bdist_wininst. If conda has built packages that are worth supporting, I doubt that a conda-to-wheel converter would be hard to write. I'd be willing to write one if someone can point me at a spec for the conda package format (I couldn't find one from a brief look through the docs).
Conversely, if I have a wininst installer, a wheel or an egg, is there a converter to conda format? I can't see having to use conda for some things and pip for others as being a move towards lessening user confusion.
I see conda as existing at a similar level to apt and yum from a packaging point of view, with zc.buildout as a DIY equivalent at that level.

For example, I installed Nikola into a virtualenv last night. That required installing the development headers for libxml2 and libxslt, but the error that tells you that is a C compiler one. I've been a C programmer longer than I have been a Python one, but I still had to resort to Google to try to figure out what dev libraries I needed. Outside the scientific space, crypto libraries are also notoriously hard to build, as are game engines and GUI toolkits. (I guess database bindings could also be a problem in some cases)

We have the option to leave handling the arbitrary binary dependency problem to platforms, and I think we should take it. This is why I suspect there will be a better near term effort/reward trade-off in helping the conda folks improve the usability of their platform than there is in trying to expand the wheel format to cover arbitrary binary dependencies.

Longer term, I expect we can expand pip's parallel install capabilities to support a better multi-version experience (after Python 3.4 is out the door, I intend to devote some quality time to trying to improve on the existing pkg_resources multi-version API, now that I've been using it long enough to understand its strengths and limitations) and potentially come up with a scheme for publishing supporting binaries inside prebuilt wheel files, but there are plenty of problems ahead of those on the todo list.

By contrast, conda exists now and is being used happily by many members of the scientific and data analysis communities. It's not vapourware - it already works for its original target audience, and they love it.
I see conda (and enthought) as more like distributions - they sit in the same space as rpms and debs. I don't see them as alternatives to pip. Am I wrong? After all, both conda and enthought supply Python interpreter builds as well as package repositories. And both feel like unified ecosystems (you manage everything Python-related with conda) rather than tools like pip (that works with existing Python standards and interoperates with easy_install, distutils, etc).
Exactly correct - in my view, the binary dependency management problem is exactly what defines the distinction between a cross-platform toolkit like pip and virtualenv and a new platform like conda. It's precisely the fact that conda defines a new platform that mostly ignores the underlying system that:

- would make it a terrible choice for the core cross-platform software distribution toolkit
- makes it much easier for them to consistently support a compiler free experience for end users

Hence the layering proposal: if you're willing to build binary extensions from source occasionally, pip and virtualenv are already all you need. If you want someone else to handle the builds for you, rely on a platform instead. Like all platforms, conda has gaps in what it provides, but streamlining the PyPI to conda pipeline is going to be easier than streamlining inclusion in Linux distros, and Windows and Mac OS X have no equivalent binary dependency management system in the first place.
If the issue is simply around defining compatibility tags that better describe the various environments around, then let's just get on with that - we're going to have to do it in the end anyway, why temporarily promote an alternative solution just to change our recommendation later?
That will get us to the point of correctly supporting self-contained extensions that only rely on the Python ABI and the platform C/C++ runtime, and I agree that's a problem we need to solve at the cross-platform toolkit level.

Conda is about solving the arbitrary binary dependency problem, in the same way other platforms do: by creating a pre-integrated collection of built libraries to cut down on the combinatorial explosion of possibilities. It isn't a coincidence that the "cross-platform platform" approach grew out of the scientific community - a solution like that is essential given the combination of ancient software with arcane build systems and end users that just want to run their data analysis rather than mess about with making the software work.

My key point is that if you substitute "beginner" for "scientist", the desired end user experience is similar, but if you substitute "professional software developer" or "system integrator", it isn't - we often *want* (or need) to build from source and do our own integration, so a pre-integrated approach like conda is inappropriate to our use cases.

I hadn't previously brought this up because I'm not entirely convinced conda is mature enough yet either, but as noted in the original post, it's become clear to me lately that people are *already* confused about how conda relates to pip, and that the lack of a shared understanding of how they differ has been causing friction on both sides. That friction is unnecessary - conda is no more a potential replacement for pip and virtualenv than zc.buildout is, but like zc.buildout, imposing additional constraints allows conda to solve problems that are difficult, perhaps even impossible, to handle at the core toolkit level.
This also came up when Donald asked people to tell him about the problems they saw in packaging:

Different tools: https://github.com/pypa/packaging-problems/issues/23
Build problems with standard tools: https://github.com/pypa/packaging-problems/issues/35

By being clear that we understand *why* the scientific community found a need to create their own tools, we can also start to make it clearer why their solutions to their specific problems don't extend to the full spectrum of use cases that need to be supported at the cross-platform toolkit level.

Cheers,
Nick.
Paul
On 1 December 2013 22:17, Nick Coghlan <ncoghlan@gmail.com> wrote:
For example, I installed Nikola into a virtualenv last night. That required installing the development headers for libxml2 and libxslt, but the error that tells you that is a C compiler one.
I've been a C programmer longer than I have been a Python one, but I still had to resort to Google to try to figure out what dev libraries I needed.
But that's a *build* issue, surely? How does that relate to installing Nikola from a set of binary wheels? I understand you are thinking about non-Python libraries, but all I can say is that this has *never* been an issue to my knowledge in the Windows world. People either ship DLLs with the Python extension, or build statically. I understand that things are different in the Unix world, but to be blunt why should Windows users care?
Outside the scientific space, crypto libraries are also notoriously hard to build, as are game engines and GUI toolkits. (I guess database bindings could also be a problem in some cases)
Build issues again...
We have the option to leave handling the arbitrary binary dependency problem to platforms, and I think we should take it.
Again, can we please be clear here? On Windows, there is no issue that I am aware of. Wheels solve the binary distribution issue fine in that environment (I know this is true, I've been using wheels for months now - sure there may be specialist areas that need some further work because they haven't had as much use yet, but that's details)
This is why I suspect there will be a better near term effort/reward trade-off in helping the conda folks improve the usability of their platform than there is in trying to expand the wheel format to cover arbitrary binary dependencies.
Excuse me if I'm feeling a bit negative towards this announcement. I've spent many months working on, and promoting, the wheel + pip solution, to the point where it is now part of Python 3.4. And now you're saying that you expect us to abandon that effort and work on conda instead?

I never saw wheel as a pure-Python solution; installs from source were fine for me in that area. The only reason I worked so hard on wheel was to solve the Windows binary distribution issue. If the new message is that people should not distribute wheels for (for example) lxml, pyyaml, pyzmq, numpy, scipy, pandas, gmpy, and pyside (to name a few that I use in wheel format relatively often) then effectively the work I've put in has been wasted.

I'm hoping I've misunderstood here. Please clarify. Preferably with specifics for Windows (as "conda is a known stable platform" simply isn't true for me...) - I accept you're not a Windows user, so a pointer to already-existing documentation is fine (I couldn't find any myself).

Paul.
On Mon, Dec 2, 2013 at 12:38 AM, Paul Moore <p.f.moore@gmail.com> wrote:
On 1 December 2013 22:17, Nick Coghlan <ncoghlan@gmail.com> wrote:
For example, I installed Nikola into a virtualenv last night. That required installing the development headers for libxml2 and libxslt, but the error that tells you that is a C compiler one.
I've been a C programmer longer than I have been a Python one, but I still had to resort to Google to try to figure out what dev libraries I needed.
But that's a *build* issue, surely? How does that relate to installing Nikola from a set of binary wheels?
I understand you are thinking about non-Python libraries, but all I can say is that this has *never* been an issue to my knowledge in the Windows world. People either ship DLLs with the Python extension, or build statically. I understand that things are different in the Unix world, but to be blunt why should Windows users care?
Outside the scientific space, crypto libraries are also notoriously hard to build, as are game engines and GUI toolkits. (I guess database bindings could also be a problem in some cases)
Build issues again...
We have the option to leave handling the arbitrary binary dependency problem to platforms, and I think we should take it.
Again, can we please be clear here? On Windows, there is no issue that I am aware of. Wheels solve the binary distribution issue fine in that environment (I know this is true, I've been using wheels for months now - sure there may be specialist areas that need some further work because they haven't had as much use yet, but that's details)
This is why I suspect there will be a better near term effort/reward trade-off in helping the conda folks improve the usability of their platform than there is in trying to expand the wheel format to cover arbitrary binary dependencies.
Excuse me if I'm feeling a bit negative towards this announcement. I've spent many months working on, and promoting, the wheel + pip solution, to the point where it is now part of Python 3.4. And now you're saying that you expect us to abandon that effort and work on conda instead? I never saw wheel as a pure-Python solution; installs from source were fine for me in that area. The only reason I worked so hard on wheel was to solve the Windows binary distribution issue. If the new message is that people should not distribute wheels for (for example) lxml, pyyaml, pyzmq, numpy, scipy, pandas, gmpy, and pyside (to name a few that I use in wheel format relatively often) then effectively the work I've put in has been wasted.
Hi, scipy developer here. In the scientific python community people are definitely interested in and intending to standardize on wheels. Your work on wheel + pip is much appreciated.

The problems above that you say are "build issues" aren't really build issues (where build means what distutils/bento do to build a package). Maybe the following concepts, shamelessly stolen from the thread linked below, help:

- *build systems* handle the actual building of software, eg Make, CMake, distutils, Bento, autotools, etc
- *package managers* handle the distribution and installation of built (or source) software, eg pip, apt, brew, ports
- *build managers* are separate from the above and handle the automatic(?) preparation of packages from the results of build systems

Conda is a package manager to the best of my understanding, but because it controls the whole stack it can also already do parts of the job of a build manager. This is not something that pip aims to do. Conda is fairly new and not well understood in our community either, but maybe this (long) thread helps: https://groups.google.com/forum/#!searchin/numfocus/build$20managers/numfocu....

Regards,
Ralf

I'm hoping I've misunderstood here. Please clarify. Preferably with
specifics for Windows (as "conda is a known stable platform" simply isn't true for me...) - I accept you're not a Windows user, so a pointer to already-existing documentation is fine (I couldn't find any myself).
Paul. _______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
On 2 December 2013 09:38, Paul Moore <p.f.moore@gmail.com> wrote:
On 1 December 2013 22:17, Nick Coghlan <ncoghlan@gmail.com> wrote:
For example, I installed Nikola into a virtualenv last night. That required installing the development headers for libxml2 and libxslt, but the error that tells you that is a C compiler one.
I've been a C programmer longer than I have been a Python one, but I still had to resort to Google to try to figure out what dev libraries I needed.
But that's a *build* issue, surely? How does that relate to installing Nikola from a set of binary wheels?
Because libxml2 and libxslt aren't Python programs - they're external shared libraries. You would have to build statically for a wheel to work (which is, to be fair, exactly what we do in many cases - CPython itself bundles all its dependencies on Windows)
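[Editorial note: the failure mode described above can be made concrete with a small sketch. This is not part of the original thread or any pip feature; it's a hypothetical check, and the library names assume a typical Linux setup.]

```python
import ctypes.util

def missing_native_deps(libs):
    """Return the library names the system's loader cannot locate.

    Note: finding the runtime shared library doesn't guarantee the
    development headers are installed, so this is only a rough probe.
    """
    return [name for name in libs if ctypes.util.find_library(name) is None]

# lxml compiles against libxml2 and libxslt; probing for them up front
# would give a clearer hint than the C compiler error pip surfaces mid-build.
missing = missing_native_deps(["xml2", "xslt"])
if missing:
    print("Missing native libraries:", ", ".join(missing))
```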
I understand you are thinking about non-Python libraries, but all I can say is that this has *never* been an issue to my knowledge in the Windows world. People either ship DLLs with the Python extension, or build statically. I understand that things are different in the Unix world, but to be blunt why should Windows users care?
Outside the scientific space, crypto libraries are also notoriously hard to build, as are game engines and GUI toolkits. (I guess database bindings could also be a problem in some cases)
Build issues again...
Yes, that's the point. wheels can solve the problem for cases where all external dependencies can be statically linked, or otherwise bundled inside the wheel. They *don't* work if you need multiple Python projects binding to the *same* copy of the external dependency.
We have the option to leave handling the arbitrary binary dependency problem to platforms, and I think we should take it.
Again, can we please be clear here? On Windows, there is no issue that I am aware of. Wheels solve the binary distribution issue fine in that environment (I know this is true, I've been using wheels for months now - sure there may be specialist areas that need some further work because they haven't had as much use yet, but that's details)
Wheels work fine if they're self contained (whether through static linking or bundling), or only depending on other projects within the PyPI ecosystem, or if you're building custom wheels from source that only need to work in an environment you control. They *don't* work as soon as multiple components need to share a common external binary dependency that isn't part of the PyPI ecosystem, and you want to share your wheels with someone else - at that point you need to have a mechanism for providing the external dependencies as well. It's that second problem that some members of the scientific community have solved through conda, and that's the one I am saying we should postpone trying to solve at the pip level indefinitely (because it's damn hard, and we don't need to - the people that care about that feature generally won't care about the system integration features that pip offers over conda).
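[Editorial note: "self contained" here can be checked mechanically to a first approximation. A wheel is just a zip archive, so the sketch below lists the compiled artifacts it bundles. This is an illustrative heuristic, not a real pip or wheel feature, and it deliberately cannot see the other half of the problem: dynamic links from those artifacts to external libraries outside the wheel.]

```python
import io
import zipfile

BINARY_SUFFIXES = (".so", ".dll", ".dylib", ".pyd")

def bundled_binaries(wheel):
    """List compiled artifacts shipped inside a wheel (a zip archive)."""
    with zipfile.ZipFile(wheel) as wf:
        return [name for name in wf.namelist() if name.endswith(BINARY_SUFFIXES)]

# Build a toy wheel in memory: one extension module, one pure-Python file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("demo/__init__.py", "")
    z.writestr("demo/_speedups.so", "\x7fELF...")
print(bundled_binaries(buf))  # ['demo/_speedups.so']
```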
This is why I suspect there will be a better near term effort/reward trade-off in helping the conda folks improve the usability of their platform than there is in trying to expand the wheel format to cover arbitrary binary dependencies.
Excuse me if I'm feeling a bit negative towards this announcement. I've spent many months working on, and promoting, the wheel + pip solution, to the point where it is now part of Python 3.4. And now you're saying that you expect us to abandon that effort and work on conda instead?
No, conda doesn't work for most of our use cases - it only works for the "I want this pre-integrated stack of software on this system, and I don't care about the details" use case. However, this is an area where pip/virtualenv can fall short because of the external shared binary dependency problem, and because there is a lot of bad software out there with appallingly unreliable build systems. If things are statically linked or bundled instead, then they fall into the scope that I agree pip/virtualenv *should* be able to handle (i.e. wheels that are self-contained aside from the dependencies declared in their metadata). It's just the "arbitrary external shared binary dependency" problem that I want to put into the "will not solve" bucket. What that means is that remotely built wheels would *not* try to use system libraries, they would always use static linking and/or bundling for external binary dependencies. If you're already doing that on Windows, then *great*, that's working sensibly within the constraints of the design. Anyone that wanted dynamic linking of external shared dependencies would then need to either build from source (for integration with their local environment), or use a third party pre-integrated stack (like conda).
I never saw wheel as a pure-Python solution, installs from source were fine for me in that area. The only reason I worked so hard on wheel was to solve the Windows binary distribution issue. If the new message is that people should not distribute wheels for (for example) lxml, pyyaml, pyzmq, numpy, scipy, pandas, gmpy, and pyside (to name a few that I use in wheel format relatively often) then effectively the work I've put in has been wasted.
If a distribution can be sensibly published as a self-contained wheel with no external binary dependencies, then great, it makes sense to do that (e.g. an lxml wheel containing or statically linked to libxml2 and libxslt). The only problem I want to take off the table is the one where multiple wheel files try to share a dynamically linked external binary dependency.
I'm hoping I've misunderstood here. Please clarify. Preferably with specifics for Windows (as "conda is a known stable platform" simply isn't true for me...) - I accept you're not a Windows user, so a pointer to already-existing documentation is fine (I couldn't find any myself).
Windows already has a culture of bundling all its dependencies, so what I'm suggesting isn't the least bit radical there. It *is* radical in other environments, as is the suggestion that the bundling philosophy of the CPython Windows installer be extended to all Windows targeted wheel files (at least for folks that mostly work on Linux). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 2 December 2013 07:31, Nick Coghlan <ncoghlan@gmail.com> wrote:
The only problem I want to take off the table is the one where multiple wheel files try to share a dynamically linked external binary dependency.
OK. Thanks for the clarification. Can I suggest that we need to be very careful how any recommendation in this area is stated? I certainly didn't get that impression from your initial posting, and from the other responses it doesn't look like I was the only one. We're only just starting to get real credibility for wheel as a distribution format, and we need to get a very strong message out that wheel is the future, and people should be distributing wheels as their primary binary format. My personal litmus test is the scientific community - when Christoph Gohlke is distributing his (Windows) binary builds as wheels, and projects like numpy, ipython, scipy etc are distributing wheels on PyPI, rather than bdist_wininst, I'll feel like we have got to the point where wheels are "the norm". The problem is, of course, that with conda being a scientific distribution at heart, any message we issue that promotes conda in any context will risk confusion in that community. My personal interest is as a non-scientific user who does a lot of data analysis, and finds IPython, Pandas, matplotlib, numpy etc useful. At the moment I can pip install the tools I need (with a quick wheel convert from wininst format). I don't want to find that in the future I can't do that, but instead have to build from source or learn a new tool (conda). Paul
On 2 December 2013 09:19, Paul Moore <p.f.moore@gmail.com> wrote:
On 2 December 2013 07:31, Nick Coghlan <ncoghlan@gmail.com> wrote:
The only problem I want to take off the table is the one where multiple wheel files try to share a dynamically linked external binary dependency.
OK. Thanks for the clarification.
Can I suggest that we need to be very careful how any recommendation in this area is stated? I certainly didn't get that impression from your initial posting, and from the other responses it doesn't look like I was the only one.
I understood what Nick meant but I still don't understand how he's come to this conclusion.
We're only just starting to get real credibility for wheel as a distribution format, and we need to get a very strong message out that wheel is the future, and people should be distributing wheels as their primary binary format. My personal litmus test is the scientific community - when Christoph Gohlke is distributing his (Windows) binary builds as wheels, and projects like numpy, ipython, scipy etc are distributing wheels on PyPI, rather than bdist_wininst, I'll feel like we have got to the point where wheels are "the norm". The problem is, of course, that with conda being a scientific distribution at heart, any message we issue that promotes conda in any context will risk confusion in that community.
Nick's proposal is basically incompatible with allowing Christoph Gohlke to use pip and wheels. Christoph provides a bewildering array of installers for prebuilt packages that are interchangeable with other builds at the level of Python code but not necessarily at the binary level. So, for example, his scipy is incompatible with the "official" (from SourceForge) Windows numpy build because it links with the non-free Intel MKL library and it needs numpy to link against the same. Installing his scipy over the other numpy results in this: https://mail.python.org/pipermail//python-list/2013-September/655669.html So Christoph can provide wheels and people can manually download them and install from them but would beginners find that any easier than running the .exe installers? The .exe installers are more powerful and can do things like the numpy super-pack that distributes binaries for different levels of SSE support (as discussed previously on this list the wheel format cannot currently achieve this). Beginners will also find .exe installers more intuitive than running pip on the command line and will typically get better error messages etc. than pip provides. So I don't really see why Christoph should bother switching formats (as noted by Paul before anyone who wants a wheel cache can easily convert his installers into wheels). AFAICT what Nick is saying is that it's not possible for pip and PyPI to guarantee the compatibility of different binaries because unlike apt-get and friends only part of the software stack is controlled. However I think this is not the most relevant difference between pip and apt-get here. The crucial difference is that apt-get communicates with repositories where all code and all binaries are under control of a single organisation. Pip (when used normally) communicates with PyPI and no single organisation controls the content of PyPI.
So there's no way for pip/PyPI to guarantee *anything* about the compatibility of the code that they distribute/install, whether the problems are to do with binary compatibility or just compatibility of pure Python code. For pure Python distributions package authors are expected to solve the compatibility problems and pip provides version specifiers etc that they can use to do this. For built distributions they could do the same - except that pip/PyPI don't provide a mechanism for them to do so. Because PyPI is not a centrally controlled single software stack it needs a different model for ensuring compatibility - one driven by the community. People in the Python community are prepared to spend a considerable amount of time, effort and other resources solving this problem. Consider how much time Christoph Gohlke must spend maintaining such a large internally consistent set of built packages. He has created a single compatible binary software stack for scientific computation. It's just that PyPI doesn't give him any way to distribute it. If perhaps he could own a tag like "cgohlke" and upload numpy:cgohlke and scipy:cgohlke then his scipy:cgohlke wheel could depend on numpy:cgohlke and numpy:cgohlke could somehow communicate the fact that it is incompatible with any other scipy distribution. This is one way in which pip/PyPI could facilitate the Python community to solve the binary compatibility problems. [As an aside I don't know whether Christoph's Intel license would permit distribution via PyPI.] Another way would be to allow the community to create compatibility tags so that projects like numpy would have mechanisms to indicate e.g. Fortran ABI compatibility. In this model no one owns a particular tag but projects that depend on one another could simply use them in a consistent way that pip could understand.
The impression I got from Nick's initial post is that, having discovered that the compatibility tags used in the wheel format are insufficient for the needs of the Python community and that it's not possible to enumerate the tags needed, pip/PyPI should just give up on the problem of binary compatibility. I think it would be better to think about simple mechanisms that the authors of the concerned packages could use so that people in the Python community can solve these problems for each of the packages they contribute to. There is enough will out there to make this work for all the big packages and problematic operating systems if only PyPI will allow it. Oscar
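[Editorial note: the "owned tag" idea above can be sketched as follows. The `variant` field and the `:cgohlke` naming are purely illustrative - nothing like this exists in pip or the wheel spec; the sketch just shows how a resolver could treat the label as part of binary compatibility.]

```python
from collections import namedtuple

# Hypothetical build metadata: a variant label rides alongside name/version.
Build = namedtuple("Build", "name version variant")

def compatible(installed, candidate):
    """Builds of cooperating projects must share a variant to be mixed."""
    return installed.variant == candidate.variant

numpy_official = Build("numpy", "1.8.0", "default")
numpy_cgohlke = Build("numpy", "1.8.0", "cgohlke")
scipy_cgohlke = Build("scipy", "0.13.1", "cgohlke")

# The MKL mismatch case: scipy:cgohlke refuses the SourceForge numpy build.
assert not compatible(numpy_official, scipy_cgohlke)
assert compatible(numpy_cgohlke, scipy_cgohlke)
```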
On 2 December 2013 10:45, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Nick's proposal is basically incompatible with allowing Christoph Gohlke to use pip and wheels. Christoph provides a bewildering array of installers for prebuilt packages that are interchangeable with other builds at the level of Python code but not necessarily at the binary level. So, for example, his scipy is incompatible with the "official" (from SourceForge) Windows numpy build because it links with the non-free Intel MKL library and it needs numpy to link against the same. Installing his scipy over the other numpy results in this: https://mail.python.org/pipermail//python-list/2013-September/655669.html
Ah, OK. I had not seen this issue as I've always either used Christoph's builds or not used them. I've never tried or needed to mix builds. This is probably because I'm very much only a casual user of the scientific stack, so my needs are pretty simple.
So Christoph can provide wheels and people can manually download them and install from them but would beginners find that any easier than running the .exe installers? The .exe installers are more powerful and can do things like the numpy super-pack that distributes binaries for different levels of SSE support (as discussed previously on this list the wheel format cannot currently achieve this). Beginners will also find .exe installers more intuitive than running pip on the command line and will typically get better error messages etc. than pip provides. So I don't really see why Christoph should bother switching formats (as noted by Paul before anyone who wants a wheel cache can easily convert his installers into wheels).
The crucial answer here is that exe installers don't recognise virtualenvs. Again, I can imagine that a scientific user would naturally install Python and put all the scientific modules into the system Python - but precisely because I'm a casual user, I want to keep big dependencies like numpy/scipy out of my system Python, and so I use virtualenvs. The big improvement pip/wheel give over wininst is a consistent user experience, whether installing into the system Python, a virtualenv, or a Python 3.3+ venv. (I used to use wininsts in preference to pip, so please excuse a certain level of the enthusiasm of a convert here :-))
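[Editorial note: the isolation Paul is relying on can be shown with the stdlib `venv` module (Python 3.3+); a wininst installer, by contrast, only targets the registry-registered system Python. The temporary location below is just for illustration.]

```python
import sys
import tempfile
import venv
from pathlib import Path

# Create an isolated environment; pip/wheel installs into it leave the
# system Python untouched, which is the workflow wininst can't support.
env_dir = Path(tempfile.mkdtemp()) / "env"
venv.EnvBuilder(with_pip=False).create(env_dir)  # with_pip=True also bootstraps pip

python = env_dir / ("Scripts/python.exe" if sys.platform == "win32" else "bin/python")
print(python.exists())  # True
```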
AFAICT what Nick is saying is that it's not possible for pip and PyPI to guarantee the compatibility of different binaries because unlike apt-get and friends only part of the software stack is controlled. However I think this is not the most relevant difference between pip and apt-get here. The crucial difference is that apt-get communicates with repositories where all code and all binaries are under control of a single organisation. Pip (when used normally) communicates with PyPI and no single organisation controls the content of PyPI. So there's no way for pip/PyPI to guarantee *anything* about the compatibility of the code that they distribute/install, whether the problems are to do with binary compatibility or just compatibility of pure Python code. For pure Python distributions package authors are expected to solve the compatibility problems and pip provides version specifiers etc that they can use to do this. For built distributions they could do the same - except that pip/PyPI don't provide a mechanism for them to do so.
Agreed. Expecting the same level of compatibility guarantees from PyPI as is provided by RPM/apt is unrealistic, in my view. Heck, even pure Python packages don't give any indication as to whether they are Python 3 compatible in some cases (I just hit this today with the binstar package, as an example). This is a fact of life with a repository that doesn't QA uploads.
Because PyPI is not a centrally controlled single software stack it needs a different model for ensuring compatibility - one driven by the community. People in the Python community are prepared to spend a considerable amount of time, effort and other resources solving this problem. Consider how much time Christoph Gohlke must spend maintaining such a large internally consistent set of built packages. He has created a single compatible binary software stack for scientific computation. It's just that PyPI doesn't give him any way to distribute it. If perhaps he could own a tag like "cgohlke" and upload numpy:cgohlke and scipy:cgohlke then his scipy:cgohlke wheel could depend on numpy:cgohlke and numpy:cgohlke could somehow communicate the fact that it is incompatible with any other scipy distribution. This is one way in which pip/PyPI could facilitate the Python community to solve the binary compatibility problems.
Exactly.
[As an aside I don't know whether Christoph's Intel license would permit distribution via PyPI.]
Yes, I'd expect Christoph's packages would likely always have to remain off PyPI (if for no other reason than the fact that he isn't the owner of the packages he's providing distributions for). But if he did provide a PyPI style index, compatibility could be handled manually via `pip install --index-url <Christoph's repo>`. And even if they are not on PyPI, your custom tag suggestion above would still be beneficial.
Another way would be to allow the community to create compatibility tags so that projects like numpy would have mechanisms to indicate e.g. Fortran ABI compatibility. In this model no one owns a particular tag but projects that depend on one another could simply use them in a consistent way that pip could understand.
The impression I got from Nick's initial post is that, having discovered that the compatibility tags used in the wheel format are insufficient for the needs of the Python community and that it's not possible to enumerate the tags needed, pip/PyPI should just give up on the problem of binary compatibility. I think it would be better to think about simple mechanisms that the authors of the concerned packages could use so that people in the Python community can solve these problems for each of the packages they contribute to. There is enough will out there to make this work for all the big packages and problematic operating systems if only PyPI will allow it.
Agreed - completely giving up on the issue in favour of a separately curated solution seems wrong to me, at least in the sense that it abandons people who don't fit cleanly into either solution (e.g., someone who needs to use virtualenv rather than conda environments, but needs numpy, and a package that Christoph distributes, but conda doesn't - who should that person turn to for help?). I don't see any problem with admitting we can't solve every aspect of the problem automatically, but that doesn't preclude providing extensibility to allow communities to solve the parts of the issue that impact their own area. As a quick sanity check question - what is the long-term advice for Christoph (and others like him)? Continue distributing wininst installers? Move to wheels? Move to conda packages? Do whatever you want, we don't care? We're supposedly pushing pip as "the officially supported solution to package management" - how can that be reconciled with *not* advising builders[1] to produce pip-compatible packages? Paul [1] At least, those who are not specifically affiliated with a group offering a curated solution (like conda or ActiveState).
On 2 Dec 2013 21:57, "Paul Moore" <p.f.moore@gmail.com> wrote:
On 2 December 2013 10:45, Oscar Benjamin <oscar.j.benjamin@gmail.com>
wrote:
Nick's proposal is basically incompatible with allowing Christoph Gohlke to use pip and wheels. Christoph provides a bewildering array of installers for prebuilt packages that are interchangeable with other builds at the level of Python code but not necessarily at the binary level. So, for example, his scipy is incompatible with the "official" (from SourceForge) Windows numpy build because it links with the non-free Intel MKL library and it needs numpy to link against the same. Installing his scipy over the other numpy results in this:
https://mail.python.org/pipermail//python-list/2013-September/655669.html
Ah, OK. I had not seen this issue as I've always either used Christoph's builds or not used them. I've never tried or needed to mix builds. This is probably because I'm very much only a casual user of the scientific stack, so my needs are pretty simple.
So Christoph can provide wheels and people can manually download them and install from them but would beginners find that any easier than running the .exe installers? The .exe installers are more powerful and can do things like the numpy super-pack that distributes binaries for different levels of SSE support (as discussed previously on this list the wheel format cannot currently achieve this). Beginners will also find .exe installers more intuitive than running pip on the command line and will typically get better error messages etc. than pip provides. So I don't really see why Christoph should bother switching formats (as noted by Paul before anyone who wants a wheel cache can easily convert his installers into wheels).
The crucial answer here is that exe installers don't recognise virtualenvs. Again, I can imagine that a scientific user would naturally install Python and put all the scientific modules into the system Python - but precisely because I'm a casual user, I want to keep big dependencies like numpy/scipy out of my system Python, and so I use virtualenvs.
The big improvement pip/wheel give over wininst is a consistent user experience, whether installing into the system Python, a virtualenv, or a Python 3.3+ venv. (I used to use wininsts in preference to pip, so please excuse a certain level of the enthusiasm of a convert here :-))
And the conda folks are working on playing nice with virtualenv - I don't think we'll see a similar offer from Microsoft for MSI any time soon :)
AFAICT what Nick is saying is that it's not possible for pip and PyPI to guarantee the compatibility of different binaries because unlike apt-get and friends only part of the software stack is controlled. However I think this is not the most relevant difference between pip and apt-get here. The crucial difference is that apt-get communicates with repositories where all code and all binaries are under control of a single organisation. Pip (when used normally) communicates with PyPI and no single organisation controls the content of PyPI. So there's no way for pip/PyPI to guarantee *anything* about the compatibility of the code that they distribute/install, whether the problems are to do with binary compatibility or just compatibility of pure Python code. For pure Python distributions package authors are expected to solve the compatibility problems and pip provides version specifiers etc that they can use to do this. For built distributions they could do the same - except that pip/PyPI don't provide a mechanism for them to do so.
Agreed. Expecting the same level of compatibility guarantees from PyPI as is provided by RPM/apt is unrealistic, in my view. Heck, even pure Python packages don't give any indication as to whether they are Python 3 compatible in some cases (I just hit this today with the binstar package, as an example). This is a fact of life with a repository that doesn't QA uploads.
Exactly, this is the difference between pip and conda - conda is a solution for installing from curated *collections* of packages. It's somewhat related to the tagging system people are speculating about for PyPI, but instead of being purely hypothetical, it already exists. Because it uses hash based dependencies, there's no chance of things getting mixed up. That design has other problems which limit the niche where a tool like conda is the right answer, but within that niche, hash based dependency management helps bring the combinatorial explosion of possible variations under control.
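[Editorial note: the hash-based dependency idea can be reduced to a toy model - each built artifact is identified by a digest of its exact contents, and a dependent only accepts the precise build it was pinned against. This is a sketch of the general approach, not conda's actual metadata format.]

```python
import hashlib

def artifact_id(payload: bytes) -> str:
    """Identify a built artifact by a digest of its exact contents."""
    return hashlib.sha256(payload).hexdigest()[:12]

def check_link(dep_payload: bytes, pinned_id: str) -> bool:
    """A dependent accepts only the precise build it was linked against."""
    return artifact_id(dep_payload) == pinned_id

# A curated stack pins scipy to the exact numpy build it was compiled with;
# any other build of "the same" numpy, however similar, is rejected.
numpy_mkl = b"numpy-1.8.0 built against MKL"
pinned = artifact_id(numpy_mkl)

assert check_link(numpy_mkl, pinned)
assert not check_link(b"numpy-1.8.0 reference BLAS build", pinned)
```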
Because PyPI is not a centrally controlled single software stack it needs a different model for ensuring compatibility - one driven by the community. People in the Python community are prepared to spend a considerable amount of time, effort and other resources solving this problem. Consider how much time Christoph Gohlke must spend maintaining such a large internally consistent set of built packages. He has created a single compatible binary software stack for scientific computation. It's just that PyPI doesn't give him any way to distribute it. If perhaps he could own a tag like "cgohlke" and upload numpy:cgohlke and scipy:cgohlke then his scipy:cgohlke wheel could depend on numpy:cgohlke and numpy:cgohlke could somehow communicate the fact that it is incompatible with any other scipy distribution. This is one way in which pip/PyPI could facilitate the Python community to solve the binary compatibility problems.
Exactly.
[As an aside I don't know whether Christoph's Intel license would permit distribution via PyPI.]
Yes, I'd expect Christoph's packages would likely always have to remain off PyPI (if for no other reason than the fact that he isn't the owner of the packages he's providing distributions for). But if he did provide a PyPI style index, compatibility could be handled manually via `pip install --index-url <Christoph's repo>`. And even if they are not on PyPI, your custom tag suggestion above would still be beneficial.
Yes, I think a "variant" tag would be a useful feature for wheels as well. It's not a substitute for hash identified fully curated stacks, though.
Another way would be to allow the community to create compatibility tags so that projects like numpy would have mechanisms to indicate e.g. Fortran ABI compatibility. In this model no one owns a particular tag but projects that depend on one another could simply use them in a consistent way that pip could understand.
The impression I got from Nick's initial post is that, having discovered that the compatibility tags used in the wheel format are insufficient for the needs of the Python community and that it's not possible to enumerate the tags needed, pip/PyPI should just give up on the problem of binary compatibility. I think it would be better to think about simple mechanisms that the authors of the concerned packages could use so that people in the Python community can solve these problems for each of the packages they contribute to. There is enough will out there to make this work for all the big packages and problematic operating systems if only PyPI will allow it.
Agreed - completely giving up on the issue in favour of a separately curated solution seems wrong to me, at least in the sense that it abandons people who don't fit cleanly into either solution (e.g., someone who needs to use virtualenv rather than conda environments, but needs numpy, and a package that Christoph distributes, but conda doesn't - who should that person turn to for help?). I don't see any problem with admitting we can't solve every aspect of the problem automatically, but that doesn't preclude providing extensibility to allow communities to solve the parts of the issue that impact their own area.
We already have more than enough work to do in the packaging space - why come up with a new solution for publication of curated stacks when the conda folks already have one, and it works cross-platform for the stack with the most complex external dependencies? If static linking or bundling is an option, then wheels can already handle it. If targeting a particular controlled environment (or a user base prepared to get the appropriate bits in place themselves), wheels can also handle shared external binary dependencies. What they can't easily do is share a fully curated stack of software, including external binary dependencies, in a way that works across platforms and isn't trivially easy to screw up by accidentally installing from the wrong place. Just as a while ago I started thinking it made more sense to rehabilitate setuptools than it did to try to replace it any time soon, and we mostly let the TUF folks run with the end-to-end security problem, I now think it makes sense to start working on collaborating with the conda folks on the "distribution of curated software stacks" problem, as that reduces the pressure on the core tools to address that concern. PyPI wheels would then be about publishing "default" versions of components, with the broadest compatibility, while conda would be a solution for getting access to alternate builds that may be faster, but require external shared dependencies.
As a quick sanity check question - what is the long-term advice for Christoph (and others like him)? Continue distributing wininst installers? Move to wheels? Move to conda packages? Do whatever you want, we don't care? We're supposedly pushing pip as "the officially supported solution to package management" - how can that be reconciled with *not* advising builders[1] to produce pip-compatible packages?
What Christoph is doing is producing a cross-platform curated binary software stack, including external dependencies. That's precisely the problem I'm suggesting we *not* try to solve in the core tools any time soon, but instead support bootstrapping conda to solve the problem at a different layer. So the pip compatible builds for those tools would likely miss out on some of the external acceleration features, while curated stacks built with other options could be made available through conda (using hash based dependencies to ensure consistency). Revising the wheel spec to include variant tags, or improve the way platforms are identified, is a long way down the todo list, and that's without allowing for the subsequent time needed to update pip and other tools. By ceding the "distribution of cross-platform curated software stacks with external binary dependencies" problem to conda, users would get a solution to that problem that they can use *now*, rather than in some indefinite future after the metadata 2.0 and end-to-end security changes in the core tools have been resolved. Now, it may be we take a closer look at conda and decide there are critical issues that need to be addressed before it can be recommended rather than just mentioned (e.g. I don't know off the top of my head if the server comms is appropriately secured). This thread was mostly about pointing out that there's a thorny subset of the cross-platform software distribution problem that I think we can offload to someone else and have something that mostly works *today*, rather than only getting to it in some speculative future after a bunch of additional currently hand-wavey design work has been completed. Cheers, Nick.
Paul
[1] At least, those who are not specifically affiliated with a group offering a curated solution (like conda or ActiveState).
On 2 December 2013 13:22, Nick Coghlan <ncoghlan@gmail.com> wrote:
As a quick sanity check question - what is the long-term advice for Christoph (and others like him)? Continue distributing wininst installers? Move to wheels? Move to conda packages? Do whatever you want, we don't care? We're supposedly pushing pip as "the officially supported solution to package management" - how can that be reconciled with *not* advising builders[1] to produce pip-compatible packages?
What Christoph is doing is producing a cross-platform curated binary software stack, including external dependencies. That's precisely the problem I'm suggesting we *not* try to solve in the core tools any time soon, but instead support bootstrapping conda to solve the problem at a different layer.
OK. From my perspective, that's *not* what Christoph is doing (I concede that it might be from his perspective, though). As far as I know, the only place where Christoph's builds are incompatible with standard builds is where numpy is involved (where he uses Intel compiler extensions). But what he does *for me* is to provide binary builds of lxml, pyyaml, matplotlib, pyside and a number of other packages that I haven't got the infrastructure set up locally to build. [He also provides apparently-incompatible binary builds of scientific packages like numpy/scipy/pandas, but that's a side-issue, and as I get *all* of my scientific packages from him, the incompatibility is not a visible problem for me.]

If the named projects provided Windows binaries, then there would be no issue with Christoph's stuff. But AFAIK, there is no place I can get binary builds of matplotlib *except* from Christoph. And lxml provides limited sets of binaries - there's no Python 3.3 version, for example. I could continue :-)

Oh, and by the way, in what sense do you mean "cross-platform" here? Win32 and Win64? Maybe I'm being narrow minded, but I tend to view "cross platform" as meaning "needs to think about at least two of Unix, Windows and OSX". The *platform* issues on Windows (and OSX, I thought) are solved - it's the ABI issues that we've ignored thus far (successfully till now :-))

But Christoph's site won't go away because of this debate, and as long as I can find wininst, egg or wheel binaries somewhere, I can maintain my own personal wheel index. So I don't really care much, and I'll stop moaning for now. I'll focus my energies on building that personal index instead.

Paul
On 2 December 2013 13:54, Paul Moore <p.f.moore@gmail.com> wrote:
If the named projects provided Windows binaries, then there would be no issue with Christoph's stuff. But AFAIK, there is no place I can get binary builds of matplotlib *except* from Christoph. And lxml provides limited sets of binaries - there's no Python 3.3 version, for example. I could continue :-)
The matplotlib folks provide a list of binaries for Windows and OSX hosted by SourceForge: http://matplotlib.org/downloads.html So do numpy and scipy.
Oh, and by the way, in what sense do you mean "cross-platform" here? Win32 and Win64? Maybe I'm being narrow minded, but I tend to view "cross platform" as meaning "needs to think about at least two of Unix, Windows and OSX". The *platform* issues on Windows (and OSX, I thought) are solved - it's the ABI issues that we've ignored thus far (successfully till now :-))
Exactly. A Python extension that uses Fortran needs to indicate which of the two Fortran ABIs it uses. Scipy must use the same ABI as the BLAS/LAPACK library that numpy was linked with. This is core compatibility data, but there's no way to communicate it to pip. There's no need to actually provide downloadable binaries for both ABIs, but there is a need to be able to detect incompatibilities.

Basically, if:
1) there is at least one single consistent set of built wheels for Windows/OSX for any popular set of binary-interdependent packages, and
2) there is a way to automatically detect incompatibilities and to automatically find compatible built wheels,
then *a lot* of packaging problems have been solved.

Part 1 already exists. There are multiple consistent sets of built installers (not wheels yet) for many hard to build packages. Part 2 requires at least some changes in pip/PyPI.

I read somewhere that numpy is the most frequently cited dependency on PyPI. It can be built in multiple binary-incompatible ways. If there is at least a way for the installer to know that it was built in "the standard way" (for Windows/OSX), then there can be a set of binaries built to match that. There's no need for a combinatorial explosion of compatibility tags - just a single set of compatibility tags that has complete binaries (where the definition of complete obviously depends on your field). People who want to build in different incompatible ways can do so themselves, although it would still be nice to get an install time error message when you subsequently try to install something incompatible.

For Linux this problem is basically solved as far as beginners are concerned, because they can just use apt.

Oscar
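[Editor's note] Oscar's part 2 - automatically detecting incompatible builds - is essentially an extension of the PEP 425 compatibility tag matching that installers already do for wheel filenames. A simplified sketch of that mechanism (it ignores the optional build tag and is not pip's actual implementation): an installer accepts a wheel only if its expanded tag set intersects the tags the target interpreter supports, and a hypothetical ABI-variant marker could be folded into the ABI tag in the same way.

```python
# Simplified sketch of PEP 425-style wheel tag matching (ignores the
# optional build tag; not pip's actual implementation).

def parse_wheel_tags(filename):
    """Expand a wheel filename into its set of (python, abi, platform) tags."""
    name, version, py_tag, abi_tag, plat_tag = filename[:-len(".whl")].split("-")
    # Each tag component may be a '.'-compressed set, e.g. "py2.py3".
    return {
        (py, abi, plat)
        for py in py_tag.split(".")
        for abi in abi_tag.split(".")
        for plat in plat_tag.split(".")
    }

def is_compatible(filename, supported_tags):
    """A wheel is installable if any of its tags is supported locally."""
    return bool(parse_wheel_tags(filename) & set(supported_tags))

# Tags a hypothetical CPython 3.3 on 64-bit Windows might support:
supported = [("cp33", "none", "win_amd64"), ("py3", "none", "any")]

print(is_compatible("lxml-3.2.3-cp33-none-win_amd64.whl", supported))    # True
print(is_compatible("lxml-3.2.3-cp27-none-linux_x86_64.whl", supported)) # False
```

The missing piece Oscar describes is that nothing in the tag triple currently records which BLAS/LAPACK or Fortran ABI a build assumed, so two wheels with identical tags can still be binary-incompatible.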
On 2 December 2013 14:19, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Basically if 1) There is at least one single consistent set of built wheels for Windows/OSX for any popular set of binary-interdependent packages. 2) A way to automatically detect incompatibilities and to automatically find compatible built wheels. then *a lot* of packaging problems have been solved.
Part 1 already exists. There are multiple consistent sets of built installers (not wheels yet) for many hard to build packages. Part 2 requires at least some changes in pip/PyPI.
Precisely. But isn't part 2 at least sort-of solved by users manually pointing at "the right" index? The only files on PyPI are compatible with each other, and externally hosted files (thanks for the pointer to the matplotlib binaries, BTW) won't get picked up automatically by pip, so users have to set up their own index (possibly converting wininst->wheel) and so can manually manage the compatibility process if they are careful. If people start uploading incompatible binaries to PyPI, I expect a rash of bug reports followed very quickly by people settling down to a community-agreed standard (in fact, that's probably already happened). Incompatible builds will remain on external hosts like Christoph's. It's not perfect, certainly, but it's no worse than currently.

For any sort of better solution to part 2, you need *installed metadata* recording the ABI / shared library details for the installed files. So this is a Metadata 2.0 question, and not a compatibility tag / wheel issue (except that when Metadata 2.0 gets such information, Wheel 2.0 probably needs to be specified to validate against it or something). And on that note, I agree with Nick that we don't want to be going there at the moment, if ever. I just disagree with what I thought he was saying, that we should be so quick to direct people to conda (at some point we could debate why conda rather than ActiveState or Enthought, but tbh I really don't care...)

I'd go with something along the lines of:

"""
Wheels don't attempt to solve the issue of one package depending on another one that has been built with specific options/compilers, or links to specific external libraries. The binaries on PyPI should always be compatible with each other (although nothing checks this, it's simply a matter of community standardisation), but if you use distributions hosted outside of PyPI or build your own, you need to manage such compatibility yourself. Most of the time, outside of specialised areas, it should not be an issue[1]. If you want guaranteed compatibility, you should use a distribution that validates and guarantees compatibility of all hosted files. This might be your platform package manager (apt or RPM) or a bundled Python distribution like Enthought, Conda or ActiveState.
"""

[1] That statement is based on *my* experience. If problems are sufficiently widespread, we can tone it down, but let's not reach the point of FUD.

Paul
hash based dependencies
In the conda build guide, the yaml spec files reference dependencies by name/version (and the type of conda environment you're in will determine the rest) http://docs.continuum.io/conda/build.html#specifying-versions-in-requirement... Where does the hash come in? what do you mean? publication of curated stacks when the conda folks already have one,
so, I see the index: http://repo.continuum.io/pkgs/index.html Is there a way to contribute to this index yet? or is that what would need to be worked out. otherwise, I guess the option is you have to build out "recipes" for anything else you need from pypi, right? or is it easier than that?
In the conda build guide, the yaml spec files reference dependencies by name/version (and the type of conda environment you're in will determine the rest)
http://docs.continuum.io/conda/build.html#specifying-versions-in-requirement... Where does the hash come in? what do you mean?
e.g. here's the requirement section from the spec file for their recipe for fabric: https://github.com/ContinuumIO/conda-recipes/blob/master/fabric/meta.yaml#L2...

requirements:
  build:
    - python
    - distribute
    - paramiko
  run:
    - python
    - distribute
    - paramiko
publication of curated stacks when the conda folks already have one,
so, I see the index: http://repo.continuum.io/pkgs/index.html Is there a way to contribute to this index yet? or is that what would need to be worked out.
probably a dumb question, but would it be possible to convert all the anaconda packages to wheels? even the non-python ones, like qt-4.7.4-0.tar.bz2 (http://repo.continuum.io/pkgs/free/linux-64/qt-4.7.4-0.tar.bz2). certainly not the intent of wheels, but just wondering if it could be made to work? but I'm guessing there's pieces in the core anaconda distribution itself, that makes it all work? the point here being to provide a way to use the effort of conda in any kind of "normal" python environment, as long as you consistently point at an index that just contains the "conda" wheels.
On 3 Dec 2013 08:17, "Marcus Smith" <qwcode@gmail.com> wrote:
publication of curated stacks when the conda folks already have one,
so, I see the index: http://repo.continuum.io/pkgs/index.html Is there a way to contribute to this index yet? or is that what would need to be worked out.
probably a dumb question, but would it be possible to convert all the anaconda packages to wheels? even the non-python ones, like qt-4.7.4-0.tar.bz2. certainly not the intent of wheels, but just wondering if it could be made to work? but I'm guessing there's pieces in the core anaconda distribution itself, that makes it all work?
the point here being to provide a way to use the effort of conda in any kind of "normal" python environment, as long you consistently point at an index that just contains the "conda" wheels.
I'm not sure about the conda -> wheel direction, but "pip install conda && conda init" mostly works already if you're in a virtualenv that owns its copy of Python (this is also the answer to "why not ActiveState or Enthought" - the Continuum Analytics software distribution stuff is truly open source, and able to be used completely independently of their services).

Their docs aren't that great in terms of explaining the *why* of conda - I'm definitely influenced by spending time talking about how it works with Travis and some of the other Continuum Analytics folks at PyCon US and the Austin Python user group. However, their approach to distribution of fully curated stacks seems basically sound, the scientific and data analysis users I know that have tried it have loved it, the devs have expressed a willingness to work on improving their interoperability with the standard tools (and followed through on that at least once by creating the "conda init" command), and they're actively interested in participating in the broader community (hence the presentation at the packaging mini-summit at PyCon US, as well as assorted presentations at SciPy and PyData conferences).

People are already confused about the differences between pip and conda and when they should use each, and unless we start working with the conda devs to cleanly define the different use cases, that's going to remain the case. POSIX users need ready access to a prebuilt scientific stack just as much as (or more than) Mac OS X and Windows users (there's a reason "Scientific Linux" is a distribution in its own right), and that space is moving fast enough that the Linux distros (even SL) end up being too slow to update. conda solves that problem, and it solves it in a way that works on Windows as well.
On the wheel side of things we haven't even solved the POSIX platform tagging problem yet, and I don't believe we should make users wait until we have figured that out when there's an existing solution to that particular problem that already works. Cheers, Nick.
_______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
I'm not sure about the conda -> wheel direction, but "pip install conda && conda init" mostly works already if you're in a virtualenv that owns its copy of Python
ok, I just tried conda in a throw-away altinstall of py2.7. I was thinking I would have to "conda create" new isolated environments from there, but there literally is a "conda init" (*not* documented on the website) like you mentioned that gets conda going in the current environment. pip and conda were both working, except that pip didn't know about everything conda had installed, like sqlite, which is expected. and I found all the conda metadata, which was helpful to look at. I still don't know what you mean by "hash based dependencies". I'm not seeing any requirements being locked by hashes in the metadata? what do you mean?
On Mon, Dec 2, 2013 at 5:22 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
And the conda folks are working on playing nice with virtualenv - I don't think we'll see a similar offer from Microsoft for MSI any time soon :)
nice to know...
a single organisation. Pip (when used normally) communicates with PyPI and no single organisation controls the content of PyPI.
can't you point pip to a "wheelhouse"? How is that different?
For built distributions they could do the same - except that pip/PyPI don't provide a mechanism for them to do so.
I'm still confused as to what conda provides here -- as near as I can tell, conda has a nice hash-based way to ensure binary compatibility -- which is a good thing. But the "curated set of packages" is an independent issue. What's stopping anyone from creating a nice curated set of packages with binary wheels (like the Gohlke repo....) And wouldn't it be better to make wheel a bit more robust in this regard than add yet another recommended tool to the mix?
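[Editor's note] The "hash based" idea being referenced can be illustrated with a small sketch. This shows the general technique only - it is not conda's actual metadata format, and the package name and contents below are made up: a curated stack locks each dependency to an exact artifact via its SHA-256 digest, so every install of the stack gets byte-identical, mutually compatible builds.

```python
# Illustrative sketch of hash-pinned dependencies (the general technique
# under discussion, not conda's actual metadata format): lock each
# dependency to an exact artifact by its SHA-256 digest.
import hashlib

def sha256_digest(data):
    return hashlib.sha256(data).hexdigest()

# The curator publishes name/version plus the expected digest of the build
# that the rest of the stack was tested against (contents made up here).
pinned = {
    "examplepkg-1.0-linux-64.tar.bz2": sha256_digest(b"pretend archive bytes"),
}

def verify(filename, data):
    """Refuse artifacts whose digest doesn't match the curated pin."""
    return pinned.get(filename) == sha256_digest(data)

print(verify("examplepkg-1.0-linux-64.tar.bz2", b"pretend archive bytes"))  # True
print(verify("examplepkg-1.0-linux-64.tar.bz2", b"tampered bytes"))         # False
```

Pinning by digest rather than by name/version is what lets a curator guarantee that two packages in the stack were built against the same ABI: the digest identifies one specific build, not just a release.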
Exactly, this is the difference between pip and conda - conda is a solution for installing from curated *collections* of packages. It's somewhat related to the tagging system people are speculating about for PyPI, but instead of being purely hypothetical, it already exists.
Does it? I only know of one repository of conda packages -- and it provides poor support for some things (like wxPython -- does it support any desktop GUI on OS-X?) So why do we think that conda is a better option for these unknown curated repos? Also, I'm not sure I WANT any more curated repos -- I'd rather a standard set by python.org that individual package maintainers can choose to support.

PyPI wheels would then be about publishing "default" versions of components, with the broadest compatibility, while conda would be a solution for getting access to alternate builds that may be faster, but require external shared dependencies.
I'm still confused as to why packages need to share external dependencies (though I can see why it's nice...). But what's the new policy here? Anaconda and Canopy exist already? Do we need to endorse them? Why? If you want "PyPI wheels would then be about publishing "default" versions of components, with the broadest compatibility" -- then we still need to improve things a bit, but we can't say "we're done".

What Christoph is doing is producing a cross-platform curated binary software stack, including external dependencies. That's precisely the problem I'm suggesting we *not* try to solve in the core tools any time soon, but instead support bootstrapping conda to solve the problem at a different layer.
So we are advocating that others, like Christoph, create curated stacks with conda? Aside from whether conda really provides much more than wheel to support doing this, I think it's a BAD idea to encourage it: I'd much rather encourage package maintainers to build "standard" packages, so we can get some extra interoperability.

Example: you can't use wxPython with Anaconda (on the Mac, anyway). At least not without figuring out how to build it yourself, and I'm not sure it will even work then (and it is a fricking nightmare to build). But it's getting harder to find "standard" packages for the Mac for the SciPy stack, so people are really stuck.

So the pip compatible builds for those tools would likely miss out on some of the external acceleration features,
that's fine -- but we still need those pip compatible builds .... and the nice thing about pip-compatible builds (really python.org-compatible builds...) is that they play well with the other binary installers --
By ceding the "distribution of cross-platform curated software stacks with external binary dependencies" problem to conda, users would get a solution to that problem that they can use *now*,
Well, to be fair, I've been starting a project to provide binaries for various packages for OS X and did intend to give conda a good look-see, but I really had hoped that wheels were "the way" now... oh well.

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@noaa.gov
Thanks for the robust feedback folks - it's really helping me to clarify what I think, and why I consider this an important topic :) On 3 Dec 2013 10:36, "Chris Barker" <chris.barker@noaa.gov> wrote:
On Mon, Dec 2, 2013 at 5:22 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
And the conda folks are working on playing nice with virtualenv - I don't think we'll see a similar offer from Microsoft for MSI any time soon :)
nice to know...
a single organisation. Pip (when used normally) communicates with PyPI, and no single organisation controls the content of PyPI.
can't you point pip to a "wheelhouse"? How is that different?
Right, you can do integrated environments with wheels, that's one of the use cases they excel at.
For built distributions they could do the same - except that pip/PyPI don't provide a mechanism for them to do so.
I'm still confused as to what conda provides here -- as near as I can tell, conda has a nice hash-based way to ensure binary compatibility -- which is a good thing. But the "curated set of packages" is an independent issue. What's stopping anyone from creating a nice curated set of packages with binary wheels (like the Gohlke repo....)

Hmm, has anyone tried running devpi on a PaaS? :)

And wouldn't it be better to make wheel a bit more robust in this regard than add yet another recommended tool to the mix?

Software that works today is generally more useful to end users than software that might possibly handle their use case at some currently unspecified point in the future :)

Exactly, this is the difference between pip and conda - conda is a solution for installing from curated *collections* of packages. It's somewhat related to the tagging system people are speculating about for PyPI, but instead of being purely hypothetical, it already exists.

Does it? I only know of one repository of conda packages -- and it provides poor support for some things (like wxPython -- does it support any desktop GUI on OS-X?) So why do we think that conda is a better option for these unknown curated repos?

Because it already works for the scientific stack, and if we don't provide any explicit messaging around where conda fits into the distribution picture, users are going to remain confused about it for a long time.
Also, I'm not sure I WANT any more curated repos -- I'd rather a standard set by python.org that individual package maintainers can choose to support.
PyPI wheels would then be about publishing "default" versions of components, with the broadest compatibility, while conda would be a solution for getting access to alternate builds that may be faster, but require external shared dependencies.
I'm still confused as to why packages need to share external dependencies (though I can see why it's nice...) .
Because they reference shared external data, communicate through shared memory, or otherwise need compatible memory layouts. It's exactly the same reason all C extensions need to be using the same C runtime as CPython on Windows: because things like file descriptors break if they don't.

But what's the new policy here? Anaconda and Canopy exist already? Do we need to endorse them? Why? If you want "PyPI wheels would then be about publishing "default" versions of components, with the broadest compatibility" -- then we still need to improve things a bit, but we can't say "we're done".

Conda solves a specific problem for the scientific community, but in their enthusiasm, the developers are pitching it as a general purpose packaging solution. It isn't, but in the absence of a clear explanation of its limitations from us, both its developers and other Python users are likely to remain confused about the matter.
What Christoph is doing is producing a cross-platform curated binary software stack, including external dependencies. That's precisely the problem I'm suggesting we *not* try to solve in the core tools any time soon, but instead support bootstrapping conda to solve the problem at a different layer.
So we are advocating that others, like Christoph, create curated stacks with conda? Aside from whether conda really provides much more than wheel to support doing this, I think it's a BAD idea to encourage it: I'd much rather encourage package maintainers to build "standard" packages, so we can get some extra interoperability.
Example: you can't use wxPython with Anaconda (on the Mac, anyway). At least not without figuring out how to build it yourself, and I'm not sure it will even work then (and it is a fricking nightmare to build). But it's getting harder to find "standard" packages for the Mac for the SciPy stack, so people are really stuck.
So the pip compatible builds for those tools would likely miss out on some of the external acceleration features,
that's fine -- but we still need those pip compatible builds ....
and the nice thing about pip-compatible builds (really python.org-compatible builds...) is that they play well with the other binary installers --
By ceding the "distribution of cross-platform curated software stacks with external binary dependencies" problem to conda, users would get a solution to that problem that they can use *now*,
Well, to be fair, I've been starting a project to provide binaries for various packages for OS X and did intend to give conda a good look-see, but I really had hoped that wheels were "the way" now... oh well.
Wheels *are* the way if one or both of the following conditions hold:
- you don't need to deal with build variants
- you're building for a specific target environment

That covers an awful lot of ground, but there's one thing it definitely doesn't cover: distributing multiple versions of NumPy built with different options, and cohesive ecosystems on top of that. Now, there are various ideas for potentially making wheels handle that, from the scientific community all agreeing on a common set of build settings and publishing consistent wheels, to a variant tagging system, to having pip remember the original source of a software distribution, but they're all vapourware at this point, with no concrete plans to change that, and plenty of higher priority problems to deal with.

By contrast, conda already exists, and already works, as it was designed *specifically* to handle the scientific Python stack. Unfortunately, the folks making conda don't quite grasp the full breadth of the use cases that pip handles, so their docs do a lousy job of explaining conda's limitations. It meets their needs perfectly though, along with the needs of many other people, so they're correspondingly enthusiastic in wanting to share its benefits with others.

This means that one key reason I want to recommend it for the cases where it is a good fit (i.e. the scientific Python stack) is so we can explicitly advise *against* using it in other cases where it will just add complexity without adding value. Saying nothing is not an option, since people are already confused. Saying to never use it isn't an option either, since bootstrapping conda first *is* a substantially simpler cross-platform way to get up to date scientific Python software on to your system. The alternatives are platform specific and (at least in the Linux distro case) slower to get updates.

Cheers,
Nick.
--
Christopher Barker, Ph.D. Oceanographer
Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov
On 3 December 2013 08:48, Nick Coghlan <ncoghlan@gmail.com> wrote:
And wouldn't it be better to make wheel a bit more robust in this regard than add yet another recommended tool to the mix?
Software that works today is generally more useful to end users than software that might possibly handle their use case at some currently unspecified point in the future :)
See my experience with conda under Windows. While I'm not saying that conda "doesn't work", being directed to software that turns out to have its own set of bugs, different to the ones you're used to, is a pretty frustrating experience. (BTW, I raised a bug report. Let's see what the response is like...) Paul
On 3 December 2013 09:11, Paul Moore <p.f.moore@gmail.com> wrote:
On 3 December 2013 08:48, Nick Coghlan <ncoghlan@gmail.com> wrote:
And wouldn't it be better to make wheel a bit more robust in this regard than add yet another recommended tool to the mix?
Software that works today is generally more useful to end users than software that might possibly handle their use case at some currently unspecified point in the future :)
See my experience with conda under Windows. While I'm not saying that conda "doesn't work", being directed to software that turns out to have its own set of bugs, different to the ones you're used to, is a pretty frustrating experience. (BTW, I raised a bug report. Let's see what the response is like...)
Looks like the conda stack is built around msvcr90, whereas python.org Python 3.3 is built around msvcr100. So conda is not interoperable *at all* with standard python.org Python 3.3 on Windows :-( Paul
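[Editor's note] Paul's diagnosis can be checked mechanically: Windows CPython builds embed the compiler version as "MSC v.XXXX" in sys.version, and that maps onto the C runtime the interpreter links against. A rough illustrative sketch follows - only the two compiler versions relevant to this thread are mapped, and guess_crt is a made-up helper, not part of any real tool.

```python
# Rough sketch: infer which Microsoft C runtime a Windows CPython build
# links against from the "MSC v.XXXX" marker embedded in sys.version.
# Only the two compilers relevant to this thread are mapped; guess_crt
# is a made-up illustrative helper, not part of any real tool.
import re

_MSC_TO_RUNTIME = {
    1500: "msvcr90.dll",   # Visual Studio 2008 (python.org 2.6-3.2 builds)
    1600: "msvcr100.dll",  # Visual Studio 2010 (python.org 3.3 builds)
}

def guess_crt(version_string):
    match = re.search(r"MSC v\.(\d+)", version_string)
    if match is None:
        return None  # not an MSVC build (e.g. Linux or OS X)
    return _MSC_TO_RUNTIME.get(int(match.group(1)), "unknown")

print(guess_crt("3.3.3 [MSC v.1600 64 bit (AMD64)]"))  # msvcr100.dll
print(guess_crt("2.7.6 [MSC v.1500 32 bit (Intel)]"))  # msvcr90.dll
```

Extensions linked against msvcr90 and an interpreter linked against msvcr100 each get their own copies of runtime state such as file descriptors, which is why mixing the two in one process is unsafe.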
On 3 December 2013 19:11, Paul Moore <p.f.moore@gmail.com> wrote:
On 3 December 2013 08:48, Nick Coghlan <ncoghlan@gmail.com> wrote:
And wouldn't it be better to make wheel a bit more robust in this regard than add yet another recommended tool to the mix?
Software that works today is generally more useful to end users than software that might possibly handle their use case at some currently unspecified point in the future :)
See my experience with conda under Windows. While I'm not saying that conda "doesn't work", being directed to software that turns out to have its own set of bugs, different to the ones you're used to, is a pretty frustrating experience.
Yeah, I hit the one where it tries to upgrade the symlinked Python in a virtualenv on POSIX systems and fails: https://github.com/ContinuumIO/conda/issues/360
(BTW, I raised a bug report.
For anyone else that is curious: https://github.com/ContinuumIO/conda/issues/396 In looking for a clear explanation of the runtime compatibility requirements for extensions, I realised that such a thing doesn't appear to exist. And then I realised I wasn't aware of the existence of *any* good overview of C extensions for Python, their benefits, their limitations, alternatives to creating them by hand, and that such a thing might be a good addition to the "Advanced topics" section of the packaging user guide: https://bitbucket.org/pypa/python-packaging-user-guide/issue/36/add-a-sectio...
Let's see what the response is like...)
Since venv in Python 3.4 has a working --copies option, I bashed away at the conda+venv combination a bit more, and filed another couple of conda bugs: Gets shebang lines wrong in a virtual environment: https://github.com/ContinuumIO/conda/issues/397 Doesn't currently support "python -m conda": https://github.com/ContinuumIO/conda/issues/398 Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 3 December 2013 08:48, Nick Coghlan <ncoghlan@gmail.com> wrote:
This means that one key reason I want to recommend it for the cases where it is a good fit (i.e. the scientific Python stack) is so we can explicitly advise *against* using it in other cases where it will just add complexity without adding value.
Saying nothing is not an option, since people are already confused. Saying to never use it isn't an option either, since bootstrapping conda first *is* a substantially simpler cross-platform way to get up to date scientific Python software on to your system. The alternatives are platform specific and (at least in the Linux distro case) slower to get updates.
But you're not saying "use conda for the scientific Python stack". You're saying to use it "when you have binary external dependencies", which is a phrase that I (and I suspect many Windows users) don't really understand, and will take to mean "C extensions, or at least ones that interface to another library, such as pyyaml, lxml, ...".

Also, this presumes an either/or situation. What about someone who just wants to use matplotlib to display a graph of some business data? Is matplotlib part of "the scientific stack"? Should I use conda *just* to get matplotlib in an otherwise wheel-based application? Or how about a scientist that wants wxPython (to use Chris' example)? Apparently the conda repo doesn't include wxPython, so do they need to learn how to install pip into a conda environment? (Note that there's no wxPython wheel, so this isn't a good example yet, but I'd hope it will be in due course...)

Reducing confusion is good, I'm all for that. But we need to have a clear picture of what we're saying before we can state it clearly...

Paul
On 3 December 2013 19:22, Paul Moore <p.f.moore@gmail.com> wrote:
On 3 December 2013 08:48, Nick Coghlan <ncoghlan@gmail.com> wrote:
This means that one key reason I want to recommend it for the cases where it is a good fit (i.e. the scientific Python stack) is so we can explicitly advise *against* using it in other cases where it will just add complexity without adding value.
Saying nothing is not an option, since people are already confused. Saying to never use it isn't an option either, since bootstrapping conda first *is* a substantially simpler cross-platform way to get up to date scientific Python software on to your system. The alternatives are platform specific and (at least in the Linux distro case) slower to get updates.
But you're not saying "use conda for the scientific Python stack". You're saying to use it "when you have binary external dependencies", which is a phrase that I (and I suspect many Windows users) don't really understand, and will take to mean "C extensions, or at least ones that interface to another library, such as pyyaml, lxml, ...".
That's not what I meant though - I only mean the case where there's a binary dependency that's completely outside the Python ecosystem and can't be linked or bundled because it needs to be shared between multiple components on the Python side. However, there haven't been any compelling examples presented other than the C runtime (which wheel needs to handle as part of the platform tag and/or the ABI tag) and the scientific stack, so I agree limiting the recommendation to the scientific stack is a reasonable approach. Only folks that actually understand the difference between static and dynamic linking and wrapper modules vs self-contained accelerator modules are likely to understand what "shared external binary dependency" means, so I agree it's not a useful phrase to use in a recommendation aimed at folks that aren't already experienced developers. If Windows and Mac OS X users have alternatives they strongly favour over conda that are virtualenv compatible, then sure, we can consider those as well, but I'm not aware of any (as the "virtualenv compatible" bit rules out anything based on platform installers).
Also, this presumes an either/or situation. What about someone who just wants to use matplotlib to display a graph of some business data? Is matplotlib part of "the scientific stack"? Should I use conda *just* to get matplotlib in an otherwise wheel-based application?
Ultimately, it depends on whether matplotlib is coupled to the NumPy build options or not. However, I think the more practical recommendation would be to say:

- if there's no wheel
- and you can't build it from source yourself
- then you can try "pip install conda && conda init && conda install <pkg>" as a fallback option.

And then we encourage the conda devs to follow the installation database standard properly (if they aren't already), so things installed with conda play nice with things installed with pip. It sounds like we also need to get them to ensure they're using the right compiler/C runtime on Windows so their packages are interoperable with the standard python.org installers.
Or how about a scientist that wants wxPython (to use Chris' example)? Apparently the conda repo doesn't include wxPython, so do they need to learn how to install pip into a conda environment? (Note that there's no wxPython wheel, so this isn't a good example yet, but I'd hope it will be in due course...)
No, it's the other way around - for cases where wheels aren't yet available, but conda provides it, then we should try to ensure that "pip install conda && conda init && conda install <package>" does the right thing (including conda upgrading previously pip installed packages when necessary, as well as bailing out gracefully when it needs to). At the moment, we're getting people trying to use conda as the base, and stuff falling apart at a later stage, since conda isn't structured properly to handle use cases other than the scientific one, where simplicity and repeatability for people that aren't primarily developers trumps platform integration and easier handling of security updates.
Reducing confusion is good, I'm all for that. But we need to have a clear picture of what we're saying before we can state it clearly...
Agreed, that's a large part of why I started this thread. It's definitely clarified several points for me. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
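[Editor's note: as an aside on the "installation database standard" mentioned above - any installer that writes standard ``*.dist-info`` metadata (PEP 376) becomes visible to every other tool. A minimal sketch of reading that database, using the modern ``importlib.metadata`` module for illustration (the 2013-era equivalent would have been ``pkg_resources``):]

```python
# Sketch: enumerate installed distributions via the PEP 376
# installation database (*.dist-info directories on sys.path).
# Any installer that writes this metadata -- pip, or a conda that
# follows the standard -- shows up here for every other tool.
from importlib import metadata

def installed_versions():
    """Map each installed distribution name to its version."""
    return {
        dist.metadata["Name"]: dist.version
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip broken/nameless entries
    }

versions = installed_versions()
```

This is exactly the interoperability point being made: if conda recorded its installs this way, pip would see them, and vice versa.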
On 3 December 2013 20:19, Nick Coghlan <ncoghlan@gmail.com> wrote:
Only folks that actually understand the difference between static and dynamic linking and wrapper modules vs self-contained accelerator modules are likely to understand what "shared external binary dependency" means, so I agree it's not a useful phrase to use in a recommendation aimed at folks that aren't already experienced developers.
"... aren't already experienced C/C++/etc developers". There are lots of higher level languages (including Python itself) that people can be experienced in and still have never had the "pleasure" of learning the ins and outs of dynamic linking and binary ABIs. Foundations made of sand - it isn't surprising that software sometimes fails, it's a miracle that it ever works at all :) Cheers, Nick.
On 3 December 2013 10:19, Nick Coghlan <ncoghlan@gmail.com> wrote:
Or how about a scientist that wants wxPython (to use Chris' example)? Apparently the conda repo doesn't include wxPython, so do they need to learn how to install pip into a conda environment? (Note that there's no wxPython wheel, so this isn't a good example yet, but I'd hope it will be in due course...)
No, it's the other way around - for cases where wheels aren't yet available, but conda provides it, then we should try to ensure that "pip install conda && conda init && conda install <package>" does the right thing (including conda upgrading previously pip installed packages when necessary, as well as bailing out gracefully when it needs to).
Perhaps it would help if there were wheels for conda and its dependencies. "pycosat" (whatever that is) breaks when I pip install conda:

$ pip install conda
Downloading/unpacking pycosat (from conda)
  Downloading pycosat-0.6.0.tar.gz (58kB): 58kB downloaded
  Running setup.py egg_info for package pycosat
Downloading/unpacking pyyaml (from conda)
  Downloading PyYAML-3.10.tar.gz (241kB): 241kB downloaded
  Running setup.py egg_info for package pyyaml
Installing collected packages: pycosat, pyyaml
  Running setup.py install for pycosat
    building 'pycosat' extension
    q:\tools\MinGW\bin\gcc.exe -mdll -O -Wall -Iq:\tools\Python27\include -IQ:\venv\PC -c pycosat.c -o build\temp.win32-2.7\Release\pycosat.o
    In file included from pycosat.c:18:0:
    picosat.c: In function 'picosat_stats':
    picosat.c:8179:4: warning: unknown conversion type character 'l' in format [-Wformat]
    picosat.c:8179:4: warning: too many arguments for format [-Wformat-extra-args]
    picosat.c:8180:4: warning: unknown conversion type character 'l' in format [-Wformat]
    picosat.c:8180:4: warning: too many arguments for format [-Wformat-extra-args]
    In file included from pycosat.c:18:0:
    picosat.c: At top level:
    picosat.c:8210:26: fatal error: sys/resource.h: No such file or directory
    compilation terminated.
    error: command 'gcc' failed with exit status 1
    Complete output from command Q:\venv\Scripts\python.exe -c "import setuptools;__file__='Q:\\venv\\build\\pycosat\\setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record c:\docume~1\enojb\locals~1\temp\pip-lobu76-record\install-record.txt --single-version-externally-managed --install-headers Q:\venv\include\site\python2.7:
    running install
    running build
    running build_py
    creating build
    creating build\lib.win32-2.7
    copying test_pycosat.py -> build\lib.win32-2.7
    running build_ext
    building 'pycosat' extension
    creating build\temp.win32-2.7
    creating build\temp.win32-2.7\Release
    q:\tools\MinGW\bin\gcc.exe -mdll -O -Wall -Iq:\tools\Python27\include -IQ:\venv\PC -c pycosat.c -o build\temp.win32-2.7\Release\pycosat.o
    In file included from pycosat.c:18:0:
    picosat.c: In function 'picosat_stats':
    picosat.c:8179:4: warning: unknown conversion type character 'l' in format [-Wformat]
    picosat.c:8179:4: warning: too many arguments for format [-Wformat-extra-args]
    picosat.c:8180:4: warning: unknown conversion type character 'l' in format [-Wformat]
    picosat.c:8180:4: warning: too many arguments for format [-Wformat-extra-args]
    In file included from pycosat.c:18:0:
    picosat.c: At top level:
    picosat.c:8210:26: fatal error: sys/resource.h: No such file or directory
    compilation terminated.
error: command 'gcc' failed with exit status 1
----------------------------------------
Cleaning up...
Command Q:\venv\Scripts\python.exe -c "import setuptools;__file__='Q:\\venv\\build\\pycosat\\setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record c:\docume~1\enojb\locals~1\temp\pip-lobu76-record\install-record.txt --single-version-externally-managed --install-headers Q:\venv\include\site\python2.7 failed with error code 1 in Q:\venv\build\pycosat
Storing complete log in c:/Documents and Settings/enojb\pip\pip.log

Oscar
On 3 December 2013 10:36, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Perhaps it would help if there were wheels for conda and its dependencies.
That may well be a good idea. One thing pip does is go to great lengths to *not* have any dependencies (by vendoring everything it needs, and relying only on pure Python code). It looks like the conda devs haven't (yet? ;-)) found the need to do that. So a suitable set of wheels would go a long way to improving the bootstrap experience. Having to have MSVC (or gcc, I guess, if they can get your build issues fixed) if you want to bootstrap conda is a pretty significant roadblock... Paul
El 03/12/2013 10:22, Paul Moore escribió:
On 3 December 2013 08:48, Nick Coghlan <ncoghlan@gmail.com> wrote:
This means that one key reason I want to recommend it for the cases where it is a good fit (i.e. the scientific Python stack) is so we can explicitly advise *against* using it in other cases where it will just add complexity without adding value.
Saying nothing is not an option, since people are already confused. Saying to never use it isn't an option either, since bootstrapping conda first *is* a substantially simpler cross-platform way to get up to date scientific Python software on to your system. The alternatives are platform specific and (at least in the Linux distro case) slower to get updates. But you're not saying "use conda for the scientific Python stack". You're saying to use it "when you have binary external dependencies", which is a phrase that I (and I suspect many Windows users) don't really understand and will take to mean "C extensions, or at least ones that interface to another library, such as pyyaml, lxml, ...".
Also, this presumes an either/or situation. What about someone who just wants to use matplotlib to display a graph of some business data? Is matplotlib part of "the scientific stack"? Should I use conda *just* to get matplotlib in an otherwise wheel-based application? Or how about a scientist that wants wxPython (to use Chris' example)? Apparently the conda repo doesn't include wxPython, so do they need to learn how to install pip into a conda environment? (Note that there's no wxPython wheel, so this isn't a good example yet, but I'd hope it will be in due course...)
Reducing confusion is good, I'm all for that. But we need to have a clear picture of what we're saying before we can state it clearly...
Paul

_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig

A first and non-native try to get a clearer wording for this:
"Some collections of Python packages may have further compatibility needs than those expressed by the current set of platform tags used in wheels. That is the case for the Python scientific stack, where interoperability depends on the choice of a shared binary data format that is decided at build time. This problem can be solved by packagers' consensus on a common choice of compatibility options, or by using curated indices. Also, package managers like conda do additional checks to ensure a coherent set of Python and non-Python packages, and may offer at this time a better user experience for package collections with such complex dependencies." Regards, -- Pachi
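[Editor's note: for context on the platform tags discussed above - PEP 425 derives a wheel's platform tag from the build platform string, with ``-`` and ``.`` replaced by ``_``. A rough sketch of that derivation (just the tag scheme, not any official tool):]

```python
import sysconfig

def wheel_platform_tag():
    # PEP 425: the platform tag is the distutils platform string
    # with '-' and '.' replaced by '_', e.g. 'win32' stays 'win32'
    # and 'macosx-10.6-intel' becomes 'macosx_10_6_intel'.
    return sysconfig.get_platform().replace("-", "_").replace(".", "_")

print(wheel_platform_tag())
```

Pachi's point is that this single string cannot express things like a Fortran ABI or a BLAS choice, which is where the "further compatibility needs" come from.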
On Tue, Dec 3, 2013 at 12:48 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Because it already works for the scientific stack, and if we don't provide any explicit messaging around where conda fits into the distribution picture, users are going to remain confused about it for a long time.
Do we have to have explicit messaging for every useful third-party package out there?
I'm still confused as to why packages need to share external dependencies (though I can see why it's nice...) .
Because they reference shared external data, communicate through shared memory, or otherwise need compatible memory layouts. It's exactly the same reason all C extensions need to be using the same C runtime as CPython on Windows: because things like file descriptors break if they don't.
OK -- maybe we need a better term than shared external dependencies -- that makes me think shared library. Also, even the scipy stack is not as dependent on build environment as we seem to think it is -- I don't think there is any reason you can't use the "standard" MPL with Gohlke's MKL-built numpy, for instance. And I'm pretty sure that even scipy and numpy don't need to share their build environment more than any other extension (i.e. they could use different BLAS implementations, etc.)... numpy version matters, but that's handled by the usual dependency handling. The reason Gohlke's repo, and Anaconda and Canopy all exist is because it's a pain to build some of this stuff, period, not complex compatibility issues -- and the real pain goes beyond the standard scipy stack (VTK is a killer!)
Conda solves a specific problem for the scientific community,
well, we are getting Anaconda, the distribution, and conda, the package manager, conflated here: Having a nice full distribution of all the packages you are likely to need is great, but you could do that with wheels, and Gohlke is already doing it with MSIs (which don't handle dependencies at all -- which is a problem).
but in their enthusiasm, the developers are pitching it as a general purpose packaging solution. It isn't,
It's not? Aside from momentum, and all that, could it not be a replacement for pip and wheel?
Wheels *are* the way if one or both of the following conditions hold:
- you don't need to deal with build variants - you're building for a specific target environment
That covers an awful lot of ground, but there's one thing it definitely doesn't cover: distributing multiple versions of NumPy built with different options and cohesive ecosystems on top of that.
hmm -- I'm not sure; you could have an Anaconda-like repo built with wheels, could you not? Granted, it would be easier to make a mistake and pull wheels from two different wheelhouses that are incompatible, so there is a real advantage to conda there.
By contrast, conda already exists, and already works, as it was designed *specifically* to handle the scientific Python stack.
I'm not sure how well it works -- it works for Anaconda, and good point about the scientific stack -- does it work equally well for other stacks? Or mixing and matching?
This means that one key reason I want to recommend it for the cases where it is a good fit (i.e. the scientific Python stack) is so we can explicitly advise *against* using it in other cases where it will just add complexity without adding value.
I'm actually pretty concerned about this: lately the scipy community has defined a core "scipy stack": http://www.scipy.org/stackspec.html

Along with this is a push to encourage users to just go with a scipy distribution to get that "stack": http://www.scipy.org/install.html and http://ipython.org/install.html

I think this is in response to years of pain of each package trying to build binaries for various platforms, and keeping it all in sync, etc. I feel their pain, and "just go with Anaconda or Canopy" is good advice for folks who want to get the "stack" up and running as easily as possible. But it does not serve everyone else well -- web developers that need MPL for some plotting, scientific users that need a desktop GUI toolkit, Python newbies that want IPython, but none of that other stuff...

What would serve all those folks well is a "standard build" of packages -- i.e. built to go with the python.org builds, that can be downloaded with: pip install the_package. And I think, with binary wheels, we have the tools to do that.
Saying nothing is not an option, since people are already confused. Saying to never use it isn't an option either, since bootstrapping conda first *is* a substantially simpler cross-platform way to get up to date scientific Python software on to your system.
again, it is Anaconda that helps here, not conda itself.

Or how about a scientist that wants wxPython (to use Chris' example)? Apparently the conda repo doesn't include wxPython, so do they need to learn how to install pip into a conda environment? (Note that there's no wxPython wheel, so this isn't a good example yet, but I'd hope it will be in due course...)
Actually, if only it were as simple as "install pip", but as you point out, there is no wxPython binary wheel; if there were, it would be compatible with the python.org Python, and maybe not Anaconda (would conda catch that?)

Looks like the conda stack is built around msvcr90, whereas python.org Python 3.3 is built around msvcr100. So conda is not interoperable *at all* with standard python.org Python 3.3 on Windows :-(
again, Anaconda the distribution, is not, but I assume conda, the package manager, is. And IIUC, conda would catch that incompatibility if you tried to install incompatible packages. That's the whole point, yes? And this would help the recent concerns from the Stackless folks about building a Python binary for Windows with a newer MSVC (see python-dev)

- if there's no wheel
- and you can't build it from source yourself
- then you can try "pip install conda && conda init && conda install <pkg>" as a fallback option.

And then we encourage the conda devs to follow the installation database standard properly (if they aren't already), so things installed with conda play nice with things installed with pip. It sounds like we also need to get them to ensure they're using the right compiler/C runtime on Windows so their packages are interoperable with the standard python.org installers.
maybe we should just have conda talk to PyPI? As it stands, one of the POINTS of Anaconda is that it ISN'T the standard python.org installer! But really, this just puts us back in the state that we want to avoid -- a bunch of binary-incompatible builds out there to get confused by -- again though, at least conda apparently won't let you install binary incompatible packages...

However, there haven't been any compelling examples presented other than the C runtime (which wheel needs to handle as part of the platform tag and/or the ABI tag) and the scientific stack,
Again, I'm pretty sure it doesn't even apply to the scientific stack in any special way...

At the moment, we're getting people trying to use conda as the base, and stuff falling apart at a later stage,

Still confused: conda the package manager, or Anaconda the distribution?

well, except that the anaconda index covers non-python projects like "qt", which a private wheel index wouldn't cover (at least with the normal intended use of wheels)
umm, why not? you couldn't have a pySide wheel???

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

Chris.Barker@noaa.gov
well, except that the anaconda index covers non-python projects like "qt",
which a private wheel index wouldn't cover (at least with the normal intended use of wheels)
umm, why not? you couldn't have a pySide wheel???
just saying that the anaconda index literally has packages for "qt" itself, the c++ library: http://repo.continuum.io/pkgs/free/linux-64/qt-4.8.5-0.tar.bz2 and its pyside packages require that. my understanding is that you could build a pyside wheel that was statically linked to qt. as to whether a wheel could just package "qt" -- that's what I don't know, and even if it could, the wheel spec doesn't cover that use case.
On Tue, Dec 3, 2013 at 3:50 PM, Marcus Smith <qwcode@gmail.com> wrote:
umm, why not? you couldn't have a pySide wheel???
just saying that the anaconda index literally has packages for "qt" itself, the c++ library. http://repo.continuum.io/pkgs/free/linux-64/qt-4.8.5-0.tar.bz2
and its pyside packages require that.
my understanding is that you could build a pyside wheel that was statically linked to qt.
which is how it's usually done for stuff like that -- or a lot of shared libs are bundled in.

That appears to be how Anaconda does things -- creates a conda package that holds just the shared libs, then another one that holds the python packages that use those libs. (I think there is a freetype one, for instance) It's a nice way to let multiple packages share the same dynamic libs, while still keeping them part of the packaging and dependency system. But I don't see why you couldn't do the same thing with wheels... (and I'm not sure there's a point if there is nothing else that uses the same libs...)
as to whether a wheel could just package "qt" -- that's what I don't know, and even if it could, the wheel spec doesn't cover that use case.
you'd have to make up a "fake" python package -- but it wouldn't have to have anything in it (Or not much...) -Chris
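[Editor's note: the "fake package" idea above can be sketched concretely - a wheel whose only payload is a shared library plus a helper that loads it. Everything here (the names, the layout, the loader function) is invented for illustration; the wheel spec doesn't bless this pattern:]

```python
# Hypothetical sketch of a "fake" Python package whose only job is
# to carry a shared library (e.g. qt's or freetype's) for other
# wheels to use. Names and layout are invented for illustration.
import ctypes
import os

def load_bundled(libname, package_dir):
    """Load a shared library shipped inside an installed package.

    A dependent extension wrapper would call something like
    load_bundled("libfreetype.so.6", os.path.dirname(shared_libs.__file__)),
    where shared_libs is the hypothetical carrier package.
    """
    path = os.path.join(package_dir, libname)
    if not os.path.exists(path):
        raise OSError("bundled library not found: %s" % path)
    return ctypes.CDLL(path)
```

This mirrors what Anaconda's split packages do, just expressed through the normal Python package/dependency machinery instead of a separate repo format.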
On 3 December 2013 22:18, Chris Barker <chris.barker@noaa.gov> wrote:
On Tue, Dec 3, 2013 at 12:48 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Because it already works for the scientific stack, and if we don't provide any explicit messaging around where conda fits into the distribution picture, users are going to remain confused about it for a long time.
Do we have to have explicit messaging for every useful third-party package out there?
I'm still confused as to why packages need to share external dependencies (though I can see why it's nice...) .
Because they reference shared external data, communicate through shared memory, or otherwise need compatible memory layouts. It's exactly the same reason all C extensions need to be using the same C runtime as CPython on Windows: because things like file descriptors break if they don't.
OK -- maybe we need a better term than shared external dependencies -- that makes me think shared library. Also, even the scipy stack is not as dependent on build environment as we seem to think it is -- I don't think there is any reason you can't use the "standard" MPL with Gohlke's MKL-built numpy, for instance. And I'm pretty sure that even scipy and numpy don't need to share their build environment more than any other extension (i.e. they could use different BLAS implementations, etc.)... numpy version matters, but that's handled by the usual dependency handling.
Sorry, I was being vague earlier. The BLAS information is not important but the Fortran ABI it exposes is: http://docs.scipy.org/doc/numpy/user/install.html#fortran-abi-mismatch MPL - matplotlib for those unfamiliar with the acronym - depends on the numpy C API/ABI but not the Fortran ABI. So it would be incompatible with, say, a pure Python implementation of numpy (or with numpypy) but it should work fine with any of the numpy binaries currently out there. (Numpy's C ABI has been unchanged from version 1.0 to 1.7 precisely because changing it has been too painful to contemplate).
The reason Gohlke's repo, and Anaconda and Canopy all exist is because it's a pain to build some of this stuff, period, not complex compatibility issues -- and the real pain goes beyond the standard scipy stack (VTK is a killer!)
I agree that the binary compatibility issues are not as complex as some are making out but it is a fact that his binaries are sometimes binary-incompatible with other builds. I have seen examples of it going wrong and he gives a clear warning at the top of his downloads page: http://www.lfd.uci.edu/~gohlke/pythonlibs/
but in their enthusiasm, the developers are pitching it as a general purpose packaging solution. It isn't,
It's not? Aside from momentum, and all that, could it not be a replacement for pip and wheel?
Conda/binstar could indeed be a replacement for pip and wheel and PyPI. It currently lacks many packages but less so than PyPI if you're mainly interested in binaries. For me pip+PyPI is a non-starter (as a complete solution) if I can't install numpy and matplotlib.
By contrast, conda already exists, and already works, as it was designed *specifically* to handle the scientific Python stack.
I'm not sure we how well it works -- it works for Anoconda, and good point about the scientifc stack -- does it work equally well for other stacks? or mixing and matching?
I don't even know how well it works for the "scientific stack". It didn't work for me! But I definitely know that pip+PyPI doesn't yet work for me, and working around that has caused me a lot more pain than it would be to diagnose and fix the problem I had with conda. They might even accept a one line, no-brainer pull request for my fix in less than 3 months :) https://github.com/pypa/pip/pull/1187
This means that one key reason I want to recommend it for the cases where it is a good fit (i.e. the scientific Python stack) is so we can explicitly advise *against* using it in other cases where it will just add complexity without adding value.
I'm actually pretty concerned about this: lately the scipy community has defined a core "scipy stack":
http://www.scipy.org/stackspec.html
Along with this is a push to encourage users to just go with a scipy distribution to get that "stack":
http://www.scipy.org/install.html
and
http://ipython.org/install.html
I think this is in response to years of pain of each package trying to build binaries for various platforms, and keeping it all in sync, etc. I feel their pain, and "just go with Anaconda or Canopy" is good advice for folks who want to get the "stack" up and running as easily as possible.
The scientific Python community are rightfully worried about potential users losing interest in Python because these installation problems occur for every noob who wants to use Python. In scientific usage, Python just isn't fully installed until numpy/scipy/matplotlib etc. are. It makes perfect sense to try and get people introduced to Python for scientific use in a way that minimises (delays?) their encounter with the packaging problems in the Python ecosystem.
But it does not serve everyone else well -- web developers that need MPL for some plotting, scientific users that need a desktop GUI toolkit, Python newbies that want IPython, but none of that other stuff...
What would serve all those folks well is a "standard build" of packages -- i.e. built to go with the python.org builds, that can be downloaded with:
pip install the_package.
And I think, with binary wheels, we have the tools to do that.
Yes, but there will never (rather, not any time soon) be a single universally compatible binary configuration, even for, say, Windows 32 bit. I think that binary wheels do need more compatibility information. You mentioned C runtimes, also Fortran ABIs. But I don't think the solution is for a PEP to enumerate the possibilities - although it might be worth having an official stance on C runtimes for Windows and for POSIX. I think the solution is to have community-extensible compatibility information. [snip]
maybe we should just have conda talk to PyPi?
As it stands, one of the POINTS of Anaconda is that it ISN'T the standard python.org installer!
Actually I think conda does (or will soon) just invoke pip under the hood for packages that aren't in the binstar/Anaconda channels but are on PyPI. Oscar
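[Editor's note: on Oscar's "more compatibility information" point - today a wheel's entire compatibility claim is the three PEP 425 tags embedded in its filename. A small sketch of pulling them out, using only the PEP 427 naming convention (filename parsing, not any official library):]

```python
def wheel_tags(filename):
    """Extract the (python, abi, platform) tags from a PEP 427
    wheel filename, e.g. 'numpy-1.7.1-cp27-none-win32.whl'.

    Wheel names are: name-version(-build)?-python-abi-platform.whl,
    so the last three dash-separated fields are the tags.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    pyver, abi, plat = filename[:-4].split("-")[-3:]
    return pyver, abi, plat
```

None of these three fields has anywhere to record a C runtime or a Fortran ABI, which is the gap being discussed.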
On 3 December 2013 22:18, Chris Barker <chris.barker@noaa.gov> wrote:
Looks like the conda stack is built around msvcr90, whereas python.org Python 3.3 is built around msvcr100. So conda is not interoperable *at all* with standard python.org Python 3.3 on Windows :-(
again, Anaconda the distribution, is not, but I assume conda, the package manager, is. And IIUC, conda would catch that incompatibility if you tried to install incompatible packages. That's the whole point, yes? And this would help the recent concerns from the Stackless folks about building a Python binary for Windows with a newer MSVC (see python-dev)
conda the installer only looks in the Anaconda repos (at the moment, and by default - you can add your own conda-format repos if you have any). So no, this *is* a problem with conda, not just Anaconda. And no, it doesn't catch the incompatibility, which says something about the robustness of their compatibility checking solution, I guess... Paul
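[Editor's note: a cheap way to see which Microsoft C runtime a given CPython links against is the "MSC v.NNNN" tag CPython embeds in ``sys.version`` on Windows builds - MSC 1500 is VS2008/msvcr90, 1600 is VS2010/msvcr100. A sketch, with the mapping hard-coded to just the two versions at issue in this thread:]

```python
import re
import sys

# MSC compiler version -> C runtime DLL (only the two relevant here)
_CRT = {1500: "msvcr90", 1600: "msvcr100"}

def crt_of(version_string):
    """Infer the MSVC runtime from a CPython sys.version string."""
    m = re.search(r"MSC v\.(\d+)", version_string)
    if m is None:
        return None  # not an MSVC build (gcc/clang on POSIX, MinGW, ...)
    return _CRT.get(int(m.group(1)), "unknown")

# On a python.org 3.3 Windows build, crt_of(sys.version) gives
# 'msvcr100'; on the Anaconda stack described above it would give
# 'msvcr90' -- which is exactly the mismatch Paul is pointing at.
```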
On 12/01/2013 06:38 PM, Paul Moore wrote:
I understand that things are different in the Unix world, but to be blunt why should Windows users care?
You're kidding, right? 90% or more of the reason for wheels in the first place is because Windows users can't build their own software from source. The amount of effort put in by non-Windows package owners to support them dwarfs whatever is bothering you here.

Tres.

--
===================================================================
Tres Seaver          +1 540-429-0999          tseaver@palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
On 2 December 2013 13:38, Tres Seaver <tseaver@palladion.com> wrote:
On 12/01/2013 06:38 PM, Paul Moore wrote:
I understand that things are different in the Unix world, but to be blunt why should Windows users care?
You're kidding, right? 90% or more of the reason for wheels in the first place is because Windows users can't build their own software from source. The amount of effort put in by non-Windows package owners to support them dwarfs whatever is bothering you here.
My point is that most of the complex binary compatibility problems seem to be Unix-related, and as you imply, Unix users don't seem to have much interest in using wheels except for local caching. So why build that complexity into the spec if the main users (Windows, and Unix users who won't ever publish wheels outside their own systems) don't have a need for it? Let's just stick with something simple that has limitations but works (practicality beats purity).

My original bdist_simple proposal was a pure-Windows replacement for wininst. Daniel developed that into wheels which cater for non-Windows systems (I believe, precisely because he had an interest in the local cache use case). We're now seeing the complexities of the Unix world affect the design of wheels, and it's turning out to be a hard problem.

All I'm trying to say is let's not give up on binary wheels for Windows, just because we have unsolved issues on Unix. Whether solving the Unix issues is worth it is the Unix users' call - I'll help solve the issues, if they choose to, but I won't support abandoning the existing Windows solution just because it can't be extended to cater for Unix as well.

I'm immensely grateful for the amount of work projects which are developed on Unix (and 3rd parties like Christoph) put into supporting Windows. Far from dismissing that, I want to avoid making things any harder than they already are for such people - current wheels are no more complex to distribute than wininst installers, and I want to keep the impact on non-Windows projects at that level. If I come across as ungrateful, I apologise.

Paul
On 3 Dec 2013 00:19, "Paul Moore" <p.f.moore@gmail.com> wrote:
On 2 December 2013 13:38, Tres Seaver <tseaver@palladion.com> wrote:
On 12/01/2013 06:38 PM, Paul Moore wrote:
I understand that things are different in the Unix world, but to be blunt why should Windows users care?
You're kidding, right? 90% or more of the reason for wheels in the first place is because Windows users can't build their own software from source. The amount of effort put in by non-Windows package owners to support them dwarfs whatever is bothering you here.
My point is that most of the complex binary compatibility problems seem to be Unix-related, and as you imply, Unix users don't seem to have much interest in using wheels except for local caching. So why build that complexity into the spec if the main users (Windows, and Unix users who won't ever publish wheels outside their own systems) don't have a need for it? Let's just stick with something simple that has limitations but works (practicality beats purity). My original bdist_simple proposal was a pure-Windows replacement for wininst. Daniel developed that into wheels which cater for non-Windows systems (I believe, precisely because he had an interest in the local cache use case). We're now seeing the complexities of the Unix world affect the design of wheels, and it's turning out to be a hard problem. All I'm trying to say is let's not give up on binary wheels for Windows, just because we have unsolved issues on Unix.
Huh? This is *exactly* what I am saying we should do - wheels *already* work so long as they're self-contained. They *don't* work (automatically) when they have an external dependency: users have to obtain the external dependency by other means, and ensure that everything is properly configured to find it, and that everything is compatible with the retrieved version. You're right that Christoph is doing two different things, though, so our advice to him (or anyone that wanted to provide the cross-platform equivalent of his current Windows-only stack) would be split:

- for all self-contained installers, also publish a wheel file on a custom index server (although having a "builder" role on PyPI where project owners can grant someone permission to upload binaries after the sdist is published could be interesting)

- for those installers which actually form an integrated stack with shared external binary dependencies, use the mechanisms provided by conda rather than getting users to manage the external dependencies by hand (as licensing permits, anyway)
Whether solving the Unix issues is worth it is the Unix users' call - I'll help solve the issues, if they choose to, but I won't support abandoning the existing Windows solution just because it can't be extended to cater for Unix as well.
You appear to still be misunderstanding my proposal, as we're actually in violent agreement. All that extra complexity you're worrying about is precisely what I'm saying we should *leave out* of the wheel spec. In most cases of accelerator and wrapper modules, the static linking and/or bundling solutions will work fine, and that's the domain I believe we should *deliberately* restrict wheels to, so we don't get distracted trying to solve an incredibly hard external dependency management problem that we don't actually need to solve at the wheel level, since anyone that actually needs it solved can just bootstrap conda instead.
I'm immensely grateful for the amount of work projects which are developed on Unix (and 3rd parties like Christoph) put into supporting Windows. Far from dismissing that, I want to avoid making things any harder than they already are for such people - current wheels are no more complex to distribute than wininst installers, and I want to keep the impact on non-Windows projects at that level. If I come across as ungrateful, I apologise.
The only problem I want to explicitly declare out of scope for wheel files is the one the wininst installers can't handle cleanly either: the subset of Christoph's installers which need a shared external binary dependency, and any other components in a similar situation. Using wheels or native Windows installers can get you in trouble in that case, since you may accidentally set up conflicts in your environment. The solution is curation of a software stack built around that external dependency (or dependencies), backed up by a packaging system that prevents conflicts within a given local installation.

The mainstream Linux distros approach this problem by mapping everything to platform-specific packages and trying to get parallel installation working cleanly (a part of the problem I plan to work on improving post Python 3.4), but that approach doesn't scale well and is one of the factors responsible for the notorious time lags between software being released on PyPI and it being available in the Linux system package managers (streamlining that conversion is one of my main goals for metadata 2.0).

The conda folks approach it differently by using hashes to lock in *exact* versions of dependencies. From a compatibility management perspective, it's functionally equivalent to bundling a complete stack into a single monolithic installer, but has the benefit of still making it easy to only install the pieces that you actually need locally.

Regards, Nick.
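Nick's point about hash-locked dependencies can be made concrete with a tiny sketch (the function names and data layout here are purely illustrative, not conda's actual mechanism):

```python
import hashlib

def sha256_digest(payload):
    """Hex digest of a build artifact's raw bytes."""
    return hashlib.sha256(payload).hexdigest()

def verify_pinned(payload, version, pinned_version, pinned_digest):
    """Only accept an artifact if both the exact version and the exact
    build hash match the curated stack's pin - functionally similar to
    shipping one monolithic installer, but installed piece by piece."""
    return version == pinned_version and sha256_digest(payload) == pinned_digest
```

The key property is that a pinned stack can never silently pick up an ABI-incompatible rebuild of a dependency, which is exactly the failure mode loose version ranges allow.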
Paul _______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
On 2 December 2013 22:26, Nick Coghlan <ncoghlan@gmail.com> wrote:
Whether solving the Unix issues is worth it is the Unix users' call - I'll help solve the issues, if they choose to, but I won't support abandoning the existing Windows solution just because it can't be extended to cater for Unix as well.
You appear to still be misunderstanding my proposal, as we're actually in violent agreement. All that extra complexity you're worrying about is precisely what I'm saying we should *leave out* of the wheel spec. In most cases of accelerator and wrapper modules, the static linking and/or bundling solutions will work fine, and that's the domain I believe we should *deliberately* restrict wheels to, so we don't get distracted trying to solve an incredibly hard external dependency management problem that we don't actually need to solve at the wheel level, since anyone that actually needs it solved can just bootstrap conda instead.
OK. I think I've finally seen what you're suggesting, and yes, it's essentially the same as I'd like to see (at least for now). I'd hoped that wheels could be more useful for Unix users than seems likely now - mainly because I really do think that a lot of the benefits of binary distributions are *not* restricted to Windows, and if Unix users could use them, it'd lessen the tendency to think that supporting anything other than source installs was purely "to cater for Windows users not having a compiler" :-) But if that's not a practical possibility (and I defer to the Unix users' opinions on that matter) then so be it.

On the other hand, I still don't see where the emphasis on conda in your original message came from. There are lots of "full stack" solutions available - I'd have thought system packages like RPM and apt are the obvious first suggestion for users that need a curated stack. If they are not appropriate, then there are Enthought, ActiveState and Anaconda/conda that I know of. Why single out conda to be blessed?

Also, I'd like the proposal to explicitly point out that 99% of the time, Windows is the simple case (because static linking and bundling DLLs is common). Getting Windows users to switch to wheels will be enough change to ask, without confusing the message. A key point here is that packages like lxml, matplotlib, or Pillow would have "arbitrary binary dependency issues" on Unix, but (because of static linking/bundling) be entirely appropriate for wheels on Windows. Let's make sure the developers don't miss this point!

Paul
On 3 Dec 2013 09:03, "Paul Moore" <p.f.moore@gmail.com> wrote:
On 2 December 2013 22:26, Nick Coghlan <ncoghlan@gmail.com> wrote:
Whether solving the Unix issues is worth it is the Unix users' call - I'll help solve the issues, if they choose to, but I won't support abandoning the existing Windows solution just because it can't be extended to cater for Unix as well.
You appear to still be misunderstanding my proposal, as we're actually in violent agreement. All that extra complexity you're worrying about is precisely what I'm saying we should *leave out* of the wheel spec. In most cases of accelerator and wrapper modules, the static linking and/or bundling solutions will work fine, and that's the domain I believe we should *deliberately* restrict wheels to, so we don't get distracted trying to solve an incredibly hard external dependency management problem that we don't actually need to solve at the wheel level, since anyone that actually needs it solved can just bootstrap conda instead.
OK. I think I've finally seen what you're suggesting, and yes, it's essentially the same as I'd like to see (at least for now). I'd hoped that wheels could be more useful for Unix users than seems likely now - mainly because I really do think that a lot of the benefits of binary distributions are *not* restricted to Windows, and if Unix users could use them, it'd lessen the tendency to think that supporting anything other than source installs was purely "to cater for Windows users not having a compiler" :-) But if that's not a practical possibility (and I defer to the Unix users' opinions on that matter) then so be it.
On the other hand, I still don't see where the emphasis on conda in your original message came from. There are lots of "full stack" solutions available - I'd have thought system packages like RPM and apt are the obvious first suggestion for users that need a curated stack. If they are not appropriate, then there are Enthought, ActiveState and Anaconda/conda that I know of. Why single out conda to be blessed?
Also, I'd like the proposal to explicitly point out that 99% of the time, Windows is the simple case (because static linking and bundling DLLs is common). Getting Windows users to switch to wheels will be enough change to ask, without confusing the message. A key point here is that packages like lxml, matplotlib, or Pillow would have "arbitrary binary dependency issues" on Unix, but (because of static linking/bundling) be entirely appropriate for wheels on Windows. Let's make sure the developers don't miss this point!
Once we solve the platform tagging problem, wheels will also work on any POSIX system for the simple cases of accelerator and wrapper modules. Long term the only persistent problem is with software stacks that need consistent build settings and offer lots of build options. That applies to Windows as well - the SSE build variants of NumPy were one of the original cases brought up as not being covered by the wheel compatibility tag format. Near term, platform independent stacks *also* serve as a workaround for the POSIX platform tagging issues and the fact there isn't yet a "default" build configuration for the scientific stack.

As for "Why conda?":

- open source
- cross platform
- can be installed with pip
- gets new releases of Python components faster than Linux distributions
- uses Continuum Analytics services by default, but can be configured to use custom servers
- created by the creator of NumPy

For ActiveState and Enthought, as far as I am aware, their package managers are closed source and tied fairly closely to their business model, while the Linux distros are not only platform specific, but have spotty coverage of PyPI packages, and even those which are covered often aren't reliably kept up to date (although I hope metadata 2.0 will help improve that situation by streamlining the conversion to policy compliant system packages).

Cheers, Nick.
Paul
Side note about naming: I'm no expert, but I'm pretty sure "Anaconda" is a python distribution -- python itself and a set of pre-built packages. "conda" is the package manager that is used by Anaconda -- kind of like rpm is used by RedHat -- conda is an open-source project, and thus could be used by any of us completely apart from the Anaconda distribution.

On Sun, Dec 1, 2013 at 3:38 PM, Paul Moore <p.f.moore@gmail.com> wrote:

had to resort to Google to try to figure out what dev libraries I needed.

But that's a *build* issue, surely? How does that relate to installing Nikola from a set of binary wheels?

Exactly -- I've mostly dealt with this for OS-X -- there are a cadre of users that want binaries, and want them to "just work" -- we've had mpkg packages for a good while, analogous to Windows installers. Binary eggs never worked quite right, 'cause setuptools didn't understand "universal" binaries -- but it wasn't that far from working. Not really tested much yet, but it looks like binary wheels should be just fine. The concern there is that someone will be running, say, a homebrew-built python, and accidentally install a binary wheel built for the python.org python -- we should address that with better platform tags (and making sure pip at least gives a warning if you try to install an incompatible wheel...)

So what problem are we trying to solve here?

1) It's still a pain to actually build the packages -- similarly to Windows, you really need to build the dependent libraries statically and link them in - and you need to make sure that you build them with the right SDK, and universally -- this is hard to do right. - does Conda help you do any of that???

2) non-python binary dependencies: As it turns out, a number of python packages depend on the same third-party non-python dependencies: I have quite a few that use libpng, libfreetype, libhdf, ??? currently if you want to distribute binary python packages, you need to statically link or supply the dlls, so we end up with multiple copies of the same lib -- is this a problem? Maybe not -- memory is pretty cheap these days, and maybe different packages actually rely on different versions of the dependencies -- this way, at least the package builder controls that.

Anaconda (the distribution) seems to address this by having conda packages that are essentially containers for the shared libs, and other packages that need those libs depend on them. I like this method, but it seems to me to be more a feature of the Anaconda distribution than the conda package manager -- in fact, I've been thinking of doing this exact same thing with binary wheels -- I haven't tried it yet, but don't see why it wouldn't work.

I understand you are thinking about non-Python libraries, but all I can say is that this has *never* been an issue to my knowledge in the Windows world.
yes, it's a HUGE issue in the Windows world -- in fact such a huge issue that almost no one ever tries to build things themselves, or build a different python distro -- so, in fact, when someone does make a binary, it's pretty likely to work. But those binaries are a major pain to build!

(by the way, over on python-dev there has been a recent discussion about stackless building a new python2.7 windows binary with a newer MS compiler -- which will then create exactly these issues...)
Outside the scientific space, crypto libraries are also notoriously hard to build, as are game engines and GUI toolkits. (I guess database bindings could also be a problem in some cases)
Build issues again...
Yes, major ones.

(another side note: you can't get wxPython for OS-X to work with Anaconda -- there is no conda binary package, and python itself is not built in a way that it can access the window manager ... so no, this stuff is NOT suddenly easier with conda.)

Again, can we please be clear here? On Windows, there is no issue that I am aware of. Wheels solve the binary distribution issue fine in that environment
They will if/when we make sure that the wheel contains meta-data about what compiler (really run-time version) was used for the python build and wheel build -- but we should, indeed, do that.
This is why I suspect there will be a better near term effort/reward trade-off in helping the conda folks improve the usability of their platform than there is in trying to expand the wheel format to cover arbitrary binary dependencies.
and have yet another way to do it? AARRG! I'm also absolutely unclear on what conda offers that isn't quite easy to address with binary wheels. And it seems to need help too, so it will play better with virtualenv.... If conda really is a better solution, then I suppose we could go deprecate wheel before it gets too much "traction"...;-) But let's please not add another one to the mix to confuse people. Excuse me if I'm feeling a bit negative towards this announcement.
I've spent many months working on, and promoting, the wheel + pip solution, to the point where it is now part of Python 3.4.
And I was really looking forward to it as a solution for OS-X too....

I'm hoping I've misunderstood here. Please clarify. Preferably with specifics for Windows (as "conda is a known stable platform" simply isn't true for me...) - I accept you're not a Windows user, so a pointer to already-existing documentation is fine (I couldn't find any myself).
I appreciate the Windows viewpoint -- and I think we really do want one solution for all. The linux distros already do all this - let them keep doing it....

-Chris

PS: a number of scipy-related packages have been promoting Anaconda and Canopy as a way to get their package without dealing with building (rather than, say, providing binary wheels). That works great for the SciPy Stack, but has NOT worked well for others. My example: I'm teaching an Intro to Python class -- I really like IPython, so have been using it for demos and recommend it to students. The IPython web site makes it look like you really need to go get Anaconda or Canopy if you want IPython -- so a number of my students went and did that. All well. Until I did the extra class on wxPython, and now I've got a bunch of students that have no way to install it. And they mostly had no idea that they were running a different Python at all...

Anyway -- this is a mess, particularly on OS-X (now we have python binaries from: Apple, python.org, fink, macports, homebrew, Anaconda, Canopy, ???) So yes, we need a solution, but I think binary wheels are a pretty good one, and I'm not sure what conda buys us...

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov
I think Wheels are the way forward for Python dependencies. Perhaps not for things like Fortran. I hope that the scientific community can start publishing wheels, at least in addition. I don't believe that Conda will gain the mindshare that pip has outside of the scientific community, so I hope we don't end up with two systems that can't interoperate.
On Dec 2, 2013, at 7:00 PM, Chris Barker <chris.barker@noaa.gov> wrote:
Side note about naming:
I'm no expert, but I'm pretty sure "Anaconda" is a python distribution -- python itself and a set of pre-built packages. "conda" is the package manager that is used by Anaconda -- kind of like rpm is used by RedHat -- conda is an open-source project, and thus could be used by any of us completely apart from the Anaconda distribution.
On Sun, Dec 1, 2013 at 3:38 PM, Paul Moore <p.f.moore@gmail.com> wrote:
had to resort to Google to try to figure out what dev libraries I needed.
But that's a *build* issue, surely? How does that relate to installing Nikola from a set of binary wheels? Exactly -- I've mostly dealt with this for OS-X -- there are a cadre of users that want binaries, and want them to "just work" -- we've had mpkg packages for a good while, analogous to Windows installers. Binary eggs never worked quite right, 'cause setuptools didn't understand "universal" binaries -- but it wasn't that far from working. Not really tested much yet, but it looks like binary wheels should be just fine. The concern there is that someone will be running, say, a homebrew-built python, and accidentally install a binary wheel built for the python.org python -- we should address that with better platform tags (and making sure pip at least gives a warning if you try to install an incompatible wheel...)
So what problem are we trying to solve here?
1) It's still a pain to actually build the packages -- similarly to Windows, you really need to build the dependent libraries statically and link them in - and you need to make sure that you build them with the right SDK, and universally -- this is hard to do right. - does Conda help you do any of that???
2) non-python binary dependencies: As it turns out, a number of python packages depend on the same third-party non-python dependencies: I have quite a few that use libpng, libfreetype, libhdf, ??? currently if you want to distribute binary python packages, you need to statically link or supply the dlls, so we end up with multiple copies of the same lib -- is this a problem? Maybe not -- memory is pretty cheap these days, and maybe different packages actually rely on different versions of the dependencies -- this way, at least the package builder controls that.
Anaconda (the distribution) seems to address this by having conda packages that are essentially containers for the shared libs, and other packages that need those libs depend on them. I like this method, but it seems to me to be more a feature of the Anaconda distribution than the conda package manager -- in fact, I've been thinking of doing this exact same thing with binary wheels -- I haven't tried it yet, but don't see why it wouldn't work.
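Chris's idea of shipping a shared library as its own package, which other binary packages then depend on, can be sketched roughly like this (the helper names and layout are hypothetical, not anything conda or the wheel spec actually defines):

```python
# Hypothetical helper for a wheel that ships only a shared library as
# package data; extension-module wheels would depend on this package
# and load the library from its install location instead of bundling
# their own copy.
import ctypes
import os

def find_bundled_lib(package_dir, name):
    """Locate a shared library shipped as package data, trying the
    common extension for each platform; returns None if absent."""
    for ext in (".so", ".dylib", ".dll"):
        candidate = os.path.join(package_dir, name + ext)
        if os.path.exists(candidate):
            return candidate
    return None

def load_bundled_lib(package_dir, name):
    """ctypes-load the bundled library so dependent extensions share one copy."""
    path = find_bundled_lib(package_dir, name)
    if path is None:
        raise OSError("bundled library %r not found in %r" % (name, package_dir))
    return ctypes.CDLL(path)
```

The open question this leaves, and the one the thread keeps circling, is whether wheel metadata can reliably express "depends on the package that carries libpng" across platforms.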
I understand you are thinking about non-Python libraries, but all I can say is that this has *never* been an issue to my knowledge in the Windows world.
yes, it's a HUGE issue in the Windows world -- in fact such a huge issue that almost no one ever tries to build things themselves, or build a different python distro -- so, in fact, when someone does make a binary, it's pretty likely to work. But those binaries are a major pain to build!
(by the way, over on python-dev there has been a recent discussion about stackless building a new python2.7 windows binary with a newer MS compiler -- which will then create exactly these issues...)
Outside the scientific space, crypto libraries are also notoriously hard to build, as are game engines and GUI toolkits. (I guess database bindings could also be a problem in some cases)
Build issues again...
Yes, major ones.
(another side note: you can't get wxPython for OS-X to work with Anaconda -- there is no conda binary package, and python itself is not built in a way that it can access the window manager ... so no, this stuff is NOT suddenly easier with conda.)
Again, can we please be clear here? On Windows, there is no issue that I am aware of. Wheels solve the binary distribution issue fine in that environment
They will if/when we make sure that the wheel contains meta-data about what compiler (really run-time version) was used for the python build and wheel build -- but we should, indeed, do that.
This is why I suspect there will be a better near term effort/reward trade-off in helping the conda folks improve the usability of their platform than there is in trying to expand the wheel format to cover arbitrary binary dependencies.
and have yet another way to do it? AARRG! I'm also absolutely unclear on what conda offers that isn't quite easy to address with binary wheels. And it seems to need help too, so it will play better with virtualenv....
If conda really is a better solution, then I suppose we could go deprecate wheel before it gets too much "traction"...;-) But let's please not add another one to the mix to confuse people.
Excuse me if I'm feeling a bit negative towards this announcement. I've spent many months working on, and promoting, the wheel + pip solution, to the point where it is now part of Python 3.4.
And I was really looking forward to it as a solution for OS-X too....
I'm hoping I've misunderstood here. Please clarify. Preferably with specifics for Windows (as "conda is a known stable platform" simply isn't true for me...) - I accept you're not a Windows user, so a pointer to already-existing documentation is fine (I couldn't find any myself).
I appreciate the Windows viewpoint -- and I think we really do want one solution for all.
The linux distros already do all this let them keep doing it....
-Chris
PS: a number of scipy-related packages have been promoting Anaconda and Canopy as a way to get their package without dealing with building (rather than, say, providing binary wheels). That works great for the SciPy Stack, but has NOT worked well for others. My example:
I'm teaching an Intro to Python class -- I really like iPython, so have been using it for demos and recommend it to Students. The IPython web site makes it look like you really need to go get Anaconda or Canopy if you want iPython -- so a number of my students went and did that. All well. Until I did the extra class on wxPython, and now I've got a bunch of students that have no way to install it. And they mostly had no idea that they were running a different Python at all...
Anyway -- this is a mess, particularly on OS-X (now we have python binaries from: Apple, python.org, fink, macports, homebrew, Anaconda, Canopy, ???) So yes, we need a solution, but I think binary wheels are a pretty good one, and I'm not sure what conda buys us...
--
Christopher Barker, Ph.D. Oceanographer
Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov
Anaconda (the distribution) seems to address this by having conda packages that are essentially containers for the shared libs, and other packages that need those libs depend on them. I like this method, but it seems to me to be more a feature of the Anaconda distribution than the conda package manager -- in fact, I've been thinking of doing this exact same thing with binary wheels -- I haven't tried it yet, but don't see why it wouldn't work.
3 or 4 of us now have mentioned curiosity in converting Anaconda packages to wheels (with specific interest in the non-python lib dependencies as wheels). Anyone who tries this, please post your success or lack thereof. I'm pretty curious.
The IPython web site makes it look like you really need to go get Anaconda or Canopy if you want iPython
interesting...
On 3 December 2013 21:34, Marcus Smith <qwcode@gmail.com> wrote:
Anaconda (the distribution) seems to address this by having conda packages that are essentially containers for the shared libs, and other packages that need those libs depend on them. I like this method, but it seems to me to be more a feature of the Anaconda distribution than the conda package manager -- in fact, I've been thinking of doing this exact same thing with binary wheels -- I haven't tried it yet, but don't see why it wouldn't work.
3 or 4 of us now have mentioned curiosity in converting Anaconda packages to wheels (with specific interest in the non-python lib dependencies as wheels). Anyone who tries this, please post your success or lack thereof. I'm pretty curious.
I couldn't find a spec for the conda format files. If it's documented somewhere I'd be happy to try writing a converter. But it'd be useless for Python 3.3 on Windows because the conda binaries are built against the wrong version of the C runtime. Might be interesting on other platforms, though. Paul
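For anyone experimenting with such a converter, a first step might look like the sketch below. The paths are assumptions based on conda packages of this era (a .tar.bz2 archive with metadata in info/index.json and the Python payload under a site-packages directory); the real layout should be checked against actual packages before relying on it:

```python
import json
import tarfile

def read_conda_metadata(path):
    """Pull the package name and version out of a conda archive.

    Assumes metadata lives in info/index.json, as it appears to in
    conda packages of this vintage.
    """
    with tarfile.open(path, "r:bz2") as tf:
        meta = json.load(tf.extractfile("info/index.json"))
    return meta["name"], meta["version"]

def list_site_packages(path):
    """List payload files that would map into a wheel's purelib tree."""
    marker = "site-packages/"
    with tarfile.open(path, "r:bz2") as tf:
        return sorted(m.name.split(marker, 1)[1]
                      for m in tf.getmembers()
                      if m.isfile() and marker in m.name)
```

A converter for pure-Python packages would then just re-zip those payload files into a wheel layout; compiled extensions raise exactly the C-runtime mismatch Paul mentions.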
On 3 December 2013 21:34, Marcus Smith <qwcode@gmail.com> wrote:
The IPython web site makes it look like you really need to go get Anaconda or Canopy if you want iPython
interesting...
You really don't. I've been running IPython on Windows from wheels for ages. And you can actually build from source easily if you don't want the Qt console - the only component that includes C code is pyzmq and that builds out of the box if you have a C compiler. Paul.
On Tue, Dec 3, 2013 at 4:13 PM, Donald Stufft <donald@stufft.io> wrote:
I think Wheels are the way forward for Python dependencies. Perhaps not for things like fortran. I hope that the scientific community can start publishing wheels at least in addition too.
I don't believe that Conda will gain the mindshare that pip has outside of the scientific community so I hope we don't end up with two systems that can't interoperate.
On Dec 2, 2013, at 7:00 PM, Chris Barker <chris.barker@noaa.gov> wrote:
Side note about naming:
I'm no expert, but I'm pretty sure "Anaconda" is a python distribution -- python itself and a set of pre-built packages. "conda" is the package manager that is used by Anaconda -- kind of like rpm is used by RedHat -- conda is an open-source project, and thus could be used by any of us completely apart from the Anaconda distribution.
On Sun, Dec 1, 2013 at 3:38 PM, Paul Moore <p.f.moore@gmail.com> wrote:
had to resort to Google to try to figure out what dev libraries I needed.
But that's a *build* issue, surely? How does that relate to installing Nikola from a set of binary wheels?
Exactly -- I've mostly dealt with this for OS-X -- there are a cadre of users that want binaries, and want them to "just work" -- we've had mpkg packages for a good while, analogous to Windows installers. Binary eggs never worked quite right, 'cause setuptools didn't understand "universal" binaries -- but it wasn't that far from working. Not really tested much yet, but it looks like binary wheels should be just fine. The concern there is that someone will be running, say, a homebrew-built python, and accidentally install a binary wheel built for the python.org python -- we should address that with better platform tags (and making sure pip at least gives a warning if you try to install an incompatible wheel...)
We are at least as worried about the homebrew user uploading a popular package as a binary wheel, and having it fail to work for the (more common?) non-homebrew user.
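The kind of guard being asked for here - pip refusing, or at least warning, when a wheel's PEP 425 platform tag doesn't match the running interpreter - could be sketched like this (a deliberately crude check; real tag matching has many more cases):

```python
import sysconfig

def wheel_platform_tag(wheel_filename):
    """Extract the platform tag: the last dash-separated component
    of a PEP 425 wheel file name, before the .whl suffix."""
    return wheel_filename[:-len(".whl")].split("-")[-1]

def platform_compatible(wheel_filename):
    """Crude check: pure-Python wheels ('any') always pass; binary
    wheels must match the local platform tag exactly. Real matching
    would also need to distinguish which Python *build* is running,
    which is the homebrew-vs-python.org problem discussed here."""
    tag = wheel_platform_tag(wheel_filename)
    local = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return tag == "any" or tag == local
```

As the thread notes, the hard part is that a homebrew CPython and a python.org CPython can report the same platform string while being binary-incompatible, so the tag alone can't catch every mismatch.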
So what problem are we trying to solve here?
1) It's still a pain to actually build the packages -- similarly to Windows, you really need to build the dependent libraries statically and link them in - and you need to make sure that you build them with the right SDK, and universally -- this is hard to do right. - does Conda help you do any of that???
2) non-python binary dependencies: As it turns out, a number of python packages depend on the same third-party non-python dependencies: I have quite a few that use libpng, libfreetype, libhdf, ??? currently if you want to distribute binary python packages, you need to statically link or supply the dlls, so we end up with multiple copies of the same lib -- is this a problem? Maybe not -- memory is pretty cheap these days, and maybe different packages actually rely on different versions of the dependencies -- this way, at least the package builder controls that.
Anaconda (the distribution) seems to address this by having conda packages that are essentially containers for the shared libs, and other packages that need those libs depend on them. I like this method, but it seems to me to be more a feature of the Anaconda distribution than the conda package manager -- in fact, I've been thinking of doing this exact same thing with binary wheels -- I haven't tried it yet, but don't see why it wouldn't work.
I understand you are thinking about non-Python libraries, but all I can say is that this has *never* been an issue to my knowledge in the Windows world.
yes, it's a HUGE issue in the Windows world -- in fact such a huge issue that almost no one ever tries to build things themselves, or build a different python distro -- so, in fact, when someone does make a binary, it's pretty likely to work. But those binaries are a major pain to build!
(by the way, over on python-dev there has been a recent discussion about stackless building a new python2.7 windows binary with a newer MS compiler -- which will then create exactly these issues...)
Outside the scientific space, crypto libraries are also notoriously hard to build, as are game engines and GUI toolkits. (I guess database bindings could also be a problem in some cases)
Build issues again...
Yes, major ones.
(Another side note: you can't get wxPython for OS X to work with Anaconda -- there is no conda binary package, and Python itself is not built in a way that it can access the window manager... so no, this stuff is NOT suddenly easier with conda.)
Again, can we please be clear here? On Windows, there is no issue that I am aware of. Wheels solve the binary distribution issue fine in that environment.
They will if/when we make sure that the wheel contains meta-data about what compiler (really run-time version) was used for the python build and wheel build -- but we should, indeed, do that.
This is why I suspect there will be a better near term effort/reward trade-off in helping the conda folks improve the usability of their platform than there is in trying to expand the wheel format to cover arbitrary binary dependencies.
and have yet another way to do it? AARRG! I'm also absolutely unclear on what conda offers that isn't quite easy to address with binary wheels. And it seems to need help too, so that it will play better with virtualenv...
If conda really is a better solution, then I suppose we could deprecate wheel before it gets too much "traction"... ;-) But let's please not add another one to the mix to confuse people.
Excuse me if I'm feeling a bit negative towards this announcement. I've spent many months working on, and promoting, the wheel + pip solution, to the point where it is now part of Python 3.4.
And I was really looking forward to it as a solution for OS X too...
I'm hoping I've misunderstood here. Please clarify. Preferably with specifics for Windows (as "conda is a known stable platform" simply isn't true for me...) - I accept you're not a Windows user, so a pointer to already-existing documentation is fine (I couldn't find any myself).
I appreciate the Windows viewpoint -- and I think we really do want one solution for all.
The Linux distros already do all this -- let them keep doing it...
-Chris
PS: a number of SciPy-related packages have been promoting Anaconda and Canopy as a way to get their package without dealing with building (rather than, say, providing binary wheels). That works great for the SciPy stack, but has NOT worked well for others. My example:
I'm teaching an Intro to Python class -- I really like IPython, so I have been using it for demos and recommending it to students. The IPython web site makes it look like you really need to go get Anaconda or Canopy if you want IPython -- so a number of my students went and did that. All went well, until I did the extra class on wxPython, and now I've got a bunch of students that have no way to install it. And they mostly had no idea that they were running a different Python at all...
Anyway -- this is a mess, particularly on OS X (now we have Python binaries from: Apple, python.org, fink, macports, homebrew, Anaconda, Canopy, ???). So yes, we need a solution, but I think binary wheels are a pretty good one, and I'm not sure what conda buys us...
The most striking difference may be that conda also installs and manages Python itself. For example, conda create -n py33 python=3.3 will download and install Python 3.3 into a new environment named py33. This is completely different than pip, which tends to run inside the same Python environment that it's installing into.

Wheels were only designed to hold Python libraries to be installed into site-packages, with package names registered on PyPI, and wheels were designed to allow the parts of the distribution to be relocated at install time, as in putting scripts into Scripts\ or bin/ depending on the platform.

If you wanted to interoperate you would probably install pip with "conda install pip" and then play with some wheels, or you could probably create a virtualenv from your conda environment and proceed. I would not recommend trying to package the CPython interpreter as a wheel. It wasn't designed for that. If you try, let us know what happened :-)

It's interesting that there is so much talk about compatibility with numpy-built-with-different-options. When I was thinking about the problem of binary compatibility I was more concerned about dynamically linking with different versions of system libraries (glibc) than the ones that the pre-built [wheel] package expects.

In summary conda is very different than pip+virtualenv.
Filed https://github.com/ContinuumIO/conda-recipes/issues/42 :(

On Dec 3, 2013, at 4:48 PM, Donald Stufft <donald@stufft.io> wrote:
On Dec 3, 2013, at 4:46 PM, Daniel Holth <dholth@gmail.com> wrote:
In summary conda is very different than pip+virtualenv.
Conda is a cross-platform Homebrew.
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
The most striking difference may be that conda also installs and manages Python itself. For example, conda create -n py33 python=3.3 will download and install Python 3.3 into a new environment named py33. This is completely different than pip which tends to run inside the same Python environment that it's installing into.
We've been talking about (and I've tried) "conda init", not "conda create". That sure seems to set up conda in your *current* Python. I had pip (the one that installed conda) and conda working in the same environment.
"conda init" isn't in the website docs. On Tue, Dec 3, 2013 at 2:00 PM, Marcus Smith <qwcode@gmail.com> wrote:
The most striking difference may be that conda also installs and manages Python itself. For example, conda create -n py33 python=3.3 will download and install Python 3.3 into a new environment named py33. This is completely different than pip which tends to run inside the same Python environment that it's installing into.
we've been talking about (and I've tried) "conda init" , not "conda create". that sure seems to setup conda in your *current* python. I had pip (the one that installed conda) and conda working in the same environment.
On 3 December 2013 21:13, Donald Stufft <donald@stufft.io> wrote:
I think Wheels are the way forward for Python dependencies. Perhaps not for things like Fortran. I hope that the scientific community can start publishing wheels, at least in addition.
The Fortran issue is not that complicated. Very few packages are affected by it. It can easily be fixed with some kind of compatibility tag that can be used by the small number of affected packages.
I don't believe that Conda will gain the mindshare that pip has outside of the scientific community so I hope we don't end up with two systems that can't interoperate.
Maybe conda won't gain mindshare outside the scientific community but wheel really needs to gain mindshare *within* the scientific community. The root of all this is numpy. It is the biggest dependency on PyPI, is hard to build well, and has the Fortran ABI issue. It is used by very many people who wouldn't consider themselves part of the "scientific community". For example matplotlib depends on it. The PyPy devs have decided that it's so crucial to the success of PyPy that numpy is basically being rewritten in their stdlib (along with the C API).

A few times I've seen Paul Moore refer to numpy as the "litmus test" for wheels. I actually think that it's more important than that. If wheels are going to fly then there *needs* to be wheels for numpy. As long as there isn't a wheel for numpy then there will be lots of people looking for a non-pip/PyPI solution to their needs.

One way of getting the scientific community more on board here would be to offer them some tangible advantages. So rather than saying "oh well, scientific use is a special case so they should just use conda or something", the message should be "the wheel system provides solutions to many long-standing problems and is even better than conda in (at least) some ways because it cleanly solves the Fortran ABI issue, for example".

Oscar
On Dec 3, 2013, at 7:36 PM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
I’d love to get Wheels to the point where they are more suitable for SciPy stuff than they are now. I’m not sure what the diff between the current state and what they need to be is, but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :)
On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft <donald@stufft.io> wrote:
I’d love to get Wheels to the point they are more suitable than they are for SciPy stuff,
That would indeed be a good step forward. I'm interested to try to help get to that point for Numpy and Scipy.

I’m not sure what the diff between the current state and what they need to be are but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :)
To start with, the SSE stuff. Numpy and scipy are distributed as "superpack" installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/.

How do I package those three builds into wheels and get the right one installed by ``pip install numpy``?

If this is too difficult at the moment, an easier (but much less important) one would be to get the result of ``paver bdist_wininst_simple`` as a wheel.

For now I think it's OK that the wheels would just target 32-bit Windows and python.org compatible Pythons (given that that's all we currently distribute). Once that works we can look at OS X and 64-bit Windows.

Ralf
On 4 December 2013 07:40, Ralf Gommers <ralf.gommers@gmail.com> wrote:
I’m not sure what the diff between the current state and what they need to be are but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :)
To start with, the SSE stuff. Numpy and scipy are distributed as "superpack" installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/.
How do I package those three builds into wheels and get the right one installed by ``pip install numpy``?
I think that needs a compatibility tag. Certainly it isn't immediately soluble now. Could you confirm how the correct one of the 3 builds is selected (i.e., what the code is to detect which one is appropriate)? I could look into what options we have here.
If this is too difficult at the moment, an easier (but much less important) one would be to get the result of ``paver bdist_wininst_simple`` as a wheel.
That I will certainly look into. Simple answer is "wheel convert <wininst>". But maybe it would be worth adding a "paver bdist_wheel" command. That should be doable in the same way setuptools added a bdist_wheel command.
For now I think it's OK that the wheels would just target 32-bit Windows and python.org compatible Pythons (given that that's all we currently distribute). Once that works we can look at OS X and 64-bit Windows.
Ignoring the SSE issue, I believe that simply wheel-converting Christoph Gohlke's repository gives you that right now. The only issues there are (1) the MKL license limitation, (2) hosting, and (3) whether Christoph would be OK with doing this (he goes to lengths on his site to prevent spidering his installers). I genuinely believe that "a scientific stack for non-scientists" is trivially solved in this way. For scientists, of course, we'd need to look deeper, but having a base to start from would be great. Paul
On 4 December 2013 08:13, Paul Moore <p.f.moore@gmail.com> wrote:
If this is too difficult at the moment, an easier (but much less important) one would be to get the result of ``paver bdist_wininst_simple`` as a wheel.
That I will certainly look into. Simple answer is "wheel convert <wininst>". But maybe it would be worth adding a "paver bdist_wheel" command. That should be doable in the same way setuptools added a bdist_wheel command.
Actually, I just installed paver and wheel into a virtualenv, converted a trivial project to use paver, and ran "paver bdist_wheel" and it worked out of the box. I don't know if there could be problems with more complex projects, but if you hit any issues, flag them up and I'll take a look. Paul
On Wed, Dec 4, 2013 at 9:13 AM, Paul Moore <p.f.moore@gmail.com> wrote:
Could you confirm how the correct one of the 3 builds is selected (i.e., what the code is to detect which one is appropriate)? I could look into what options we have here.
The stuff under tools/win32build I mentioned above. Specifically: https://github.com/numpy/numpy/blob/master/tools/win32build/cpuid/cpuid.c
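For the non-Windows case, much the same check can be approximated in pure Python. This is only an illustrative sketch under my own assumptions -- the function name is hypothetical and it parses a Linux /proc/cpuinfo dump rather than using CPUID as numpy's cpuid.c does (note that /proc/cpuinfo reports SSE3 as "pni"):

```python
# Illustrative sketch only: approximate the cpuid.c check by parsing a
# /proc/cpuinfo dump (Linux). Names here are hypothetical, not numpy's.

def sse_level(cpuinfo_text):
    """Return the highest SSE level found: 'sse3', 'sse2', 'sse' or 'none'."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.lower().startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
    if "pni" in flags or "sse3" in flags:  # /proc/cpuinfo spells SSE3 "pni"
        return "sse3"
    if "sse2" in flags:
        return "sse2"
    if "sse" in flags:
        return "sse"
    return "none"

sample = "processor\t: 0\nflags\t\t: fpu mmx sse sse2 pni\n"
print(sse_level(sample))  # -> sse3
```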
Ignoring the SSE issue, I believe that simply wheel converting Christoph Gohlke's repository gives you that right now. The only issues there are (1) the MKL license limitation, (2) hosting, and (3) whether Christoph would be OK with doing this (he goes to lengths on his site to prevent spidering his installers).
Besides the issues you mention, the problem is that it creates a single point of failure. I really appreciate everything Christoph does, but it's not appropriate as the default way to provide binary releases for a large number of projects. There needs to be a reproducible way that the devs of each project can build wheels - this includes the right metadata, but ideally also a good way to reproduce the whole build environment including compilers, blas/lapack implementations, dependencies etc. The latter part is probably out of scope for this list, but is discussed right now on the numfocus list.
I genuinely believe that "a scientific stack for non-scientists" is trivially solved in this way.
That would be nice, but no. The only thing you'd have achieved is to take a curated stack of .exe installers and converted it to the same stack of wheels. Which is nice and a step forward, but doesn't change much in the bigger picture. The problem is certainly nontrivial. Ralf
On 4 December 2013 21:13, Ralf Gommers <ralf.gommers@gmail.com> wrote:
Besides the issues you mention, the problem is that it creates a single point of failure. I really appreciate everything Christoph does, but it's not appropriate as the default way to provide binary releases for a large number of projects. There needs to be a reproducible way that the devs of each project can build wheels - this includes the right metadata, but ideally also a good way to reproduce the whole build environment including compilers, blas/lapack implementations, dependencies etc. The latter part is probably out of scope for this list, but is discussed right now on the numfocus list.
You're right - what I said ignored the genuine work being done by the rest of the scientific community to solve the real issues involved. I apologise, that wasn't at all fair. Paul
On Wed, Dec 4, 2013 at 10:59 PM, Paul Moore <p.f.moore@gmail.com> wrote:
On 4 December 2013 21:13, Ralf Gommers <ralf.gommers@gmail.com> wrote:

Besides the issues you mention, the problem is that it creates a single point of failure. I really appreciate everything Christoph does, but it's not appropriate as the default way to provide binary releases for a large number of projects. There needs to be a reproducible way that the devs of each project can build wheels - this includes the right metadata, but ideally also a good way to reproduce the whole build environment including compilers, blas/lapack implementations, dependencies etc. The latter part is probably out of scope for this list, but is discussed right now on the numfocus list.
You're right - what I said ignored the genuine work being done by the rest of the scientific community to solve the real issues involved. I apologise, that wasn't at all fair.
No need to apologize at all. Ralf
On 4 December 2013 07:40, Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft <donald@stufft.io> wrote:
I’d love to get Wheels to the point they are more suitable than they are for SciPy stuff,
That would indeed be a good step forward. I'm interested to try to help get to that point for Numpy and Scipy.
Thanks Ralf. Please let me know what you think of the following.
I’m not sure what the diff between the current state and what they need to be are but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :)
To start with, the SSE stuff. Numpy and scipy are distributed as "superpack" installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/.
How do I package those three builds into wheels and get the right one installed by ``pip install numpy``?
This was discussed previously on this list: https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html

Essentially the current wheel format and specification does not provide a way to do this directly. There are several different possible approaches.

One possibility is that the wheel spec can be updated to include a post-install script (I believe this will happen eventually - someone correct me if I'm wrong). Then the numpy for Windows wheel can just do the same as the superpack installer: ship all variants, then delete/rename in a post-install script so that the correct variant is in place after install.

Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a "variant" field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html

The variant field could be used to upload multiple variants, e.g.

numpy-1.7.1-cp27-cp27m-win32.whl
numpy-1.7.1-cp27-cp27m-win32-sse.whl
numpy-1.7.1-cp27-cp27m-win32-sse2.whl
numpy-1.7.1-cp27-cp27m-win32-sse3.whl

then if the user requests 'numpy:sse3' they will get the wheel with sse3 support.

Of course how would the user know if their CPU supports SSE3? I know roughly what SSE is but I don't know what level of SSE is available on each of the machines I use. There is a Python script/module in numexpr that can detect this: https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py

When I run that script on this machine I get:

$ python cpuinfo.py
CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2 is_32bit is_Core2 is_Intel is_i686

So perhaps someone could break that script out of numexpr and release it as a separate package on PyPI. Then the instructions for installing numpy could be something like

"""
You can install numpy with

$ pip install numpy

which will download the default version without any CPU-specific optimisations. If you know what level of SSE support your CPU has then you can download a more optimised numpy with either of:

$ pip install numpy:sse2
$ pip install numpy:sse3

To determine whether or not your CPU has SSE2 or SSE3 or no SSE support you can install and run the cpuinfo script. For example on this machine:

$ pip install cpuinfo
$ python -m cpuinfo --sse
This CPU supports the SSE3 instruction set.

That means we can install numpy:sse3.
"""

Of course it would be a shame to have a solution that is so close to automatic without quite being automatic. Also the problem is that having no SSE support in the default numpy means that lots of people would lose out on optimisations. For example if numpy is installed as a dependency of something else then the user would always end up with the unoptimised no-SSE binary.

Another possibility is that numpy could depend on the cpuinfo package so that it gets installed automatically before numpy. Then if the cpuinfo package has a traditional setup.py sdist (not a wheel) it could detect the CPU information at install time and store that in its package metadata. Then pip would be aware of this metadata and could use it to determine which wheel is appropriate. I don't quite know if this would work but perhaps the cpuinfo package could announce that it "Provides" e.g. cpuinfo:sse2. Then a numpy wheel could "Requires" cpuinfo:sse2 or something along these lines. Or perhaps this is better handled by the metadata extensions Nick suggested earlier in this thread.

I think it would be good to work out a way of doing this with e.g. a cpuinfo package. Many other packages beyond numpy could make good use of that metadata if it were available. Similarly having an extensible mechanism for selecting wheels based on additional information about the user's system could be used for many more things than just CPU architectures.

Oscar
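[Editorial note: the "Provides"/"Requires" capability idea in the message above could be modelled roughly as follows. This is a toy sketch under my own assumptions -- the `cpuinfo:sse2` spelling comes from the message, but the data structure and function are invented and are not real pip metadata handling.]

```python
# Toy model of the hypothetical "Provides: cpuinfo:sse2" idea: capabilities
# recorded at cpuinfo install time, checked against a wheel's requirements.
provided = {"cpuinfo": {"sse", "sse2"}}  # recorded at install time

def satisfies(requirement, provided):
    """requirement like 'cpuinfo:sse2' -> True if that capability was recorded."""
    name, _, cap = requirement.partition(":")
    return cap in provided.get(name, set())

print(satisfies("cpuinfo:sse2", provided))  # -> True
print(satisfies("cpuinfo:sse3", provided))  # -> False
```

An installer with access to such recorded capabilities could then prefer the most capable wheel whose requirements are all satisfied.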
On 4 December 2013 20:41, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
This was discussed previously on this list: https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html
Essentially the current wheel format and specification does not provide a way to do this directly. There are several different possible approaches.
One possibility is that the wheel spec can be updated to include a post-install script (I believe this will happen eventually - someone correct me if I'm wrong). Then the numpy for Windows wheel can just do the same as the superpack installer: ship all variants, then delete/rename in a post-install script so that the correct variant is in place after install.
Yes, export hooks in metadata 2.0 would support this approach. However, export hooks require allowing just-downloaded code to run with elevated privileges, so we're trying to minimise the number of cases where they're needed.
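[Editorial note: a ship-all-variants post-install hook of the kind described above could look something like this sketch. The `_core_<variant>` directory layout and the function are entirely hypothetical, invented for illustration -- this is not numpy's actual layout or any real wheel hook API.]

```python
# Hypothetical post-install step: keep the build matching the detected CPU,
# delete the others. Directory names (_core_nosse etc.) are invented.
import shutil
from pathlib import Path

def prune_variants(pkg_dir, detected):
    """Promote the build matching `detected` ('none', 'sse2' or 'sse3')
    into the package directory and remove the unused variants."""
    pkg = Path(pkg_dir)
    keep = "nosse" if detected == "none" else detected
    for variant in ("nosse", "sse2", "sse3"):
        vdir = pkg / ("_core_" + variant)
        if not vdir.is_dir():
            continue
        if variant == keep:
            for item in vdir.iterdir():
                shutil.move(str(item), str(pkg / item.name))
        shutil.rmtree(vdir, ignore_errors=True)
```

The security concern above is visible even in this toy: the hook runs arbitrary filesystem operations with whatever privileges the installer has.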
Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a "variant" field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html
The variant field could be used to upload multiple variants, e.g.

numpy-1.7.1-cp27-cp27m-win32.whl
numpy-1.7.1-cp27-cp27m-win32-sse.whl
numpy-1.7.1-cp27-cp27m-win32-sse2.whl
numpy-1.7.1-cp27-cp27m-win32-sse3.whl

then if the user requests 'numpy:sse3' they will get the wheel with sse3 support.
That was what I was originally thinking for the variant field, but I later realised it makes more sense to treat the "variant" marker as part of the *platform* tag, rather than being an independent tag in its own right: https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platfo...

Under that approach, pip would figure out all the variants that applied to the current system (with some default preference order between variants for platforms where one system may support multiple variants). Using the Linux distro variants (based on ID and VERSION_ID in /etc/os-release) as an example rather than the Windows SSE variants, this might look like:

cp33-cp33m-linux_x86_64_fedora_19
cp33-cp33m-linux_x86_64_fedora
cp33-cp33m-linux_x86_64

The Windows SSE variants might look like:

cp33-cp33m-win32_sse3
cp33-cp33m-win32_sse2
cp33-cp33m-win32_sse
cp33-cp33m-win32
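[Editorial note: to make the preference ordering concrete, here is a small sketch of how an installer might rank such tags and pick a wheel. The matching logic and function names are invented for illustration and are not how pip actually resolves tags.]

```python
# Illustrative sketch: expand a base platform tag into an ordered
# most-specific-first preference list, then pick the first wheel that
# matches. Tag spellings follow the examples in the message above.

def candidate_tags(base, variants):
    """Most-specific variant first, plain platform tag last."""
    return [base + "_" + v for v in variants] + [base]

def pick_wheel(available, base, variants):
    for tag in candidate_tags(base, variants):
        for whl in available:
            if whl.endswith(tag + ".whl"):
                return whl
    return None

wheels = [
    "numpy-1.7.1-cp33-cp33m-win32.whl",
    "numpy-1.7.1-cp33-cp33m-win32_sse2.whl",
]
# A CPU with SSE3 prefers sse3, then sse2, then sse, then the plain tag:
print(pick_wheel(wheels, "cp33-cp33m-win32", ["sse3", "sse2", "sse"]))
# -> numpy-1.7.1-cp33-cp33m-win32_sse2.whl
```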
Of course how would the user know if their CPU supports SSE3? I know roughly what SSE is but I don't know what level of SSE is available on each of the machines I use.
Asking this question is how I realised the variant tag should probably be part of the platform field and handled automatically by pip rather than users needing to request it explicitly. However, it's not without its problems (more on that below)
There is a Python script/module in numexpr that can detect this: https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py
When I run that script on this machine I get:

$ python cpuinfo.py
CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2 is_32bit is_Core2 is_Intel is_i686
So perhaps someone could break that script out of numexpr and release it as a separate package on PyPI. Then the instructions for installing numpy could be something like """ You can install numpy with
$ pip install numpy
which will download the default version without any CPU-specific optimisations.
If you know what level of SSE support your CPU has then you can download a more optimised numpy with either of:
$ pip install numpy:sse2
$ pip install numpy:sse3
To determine whether or not your CPU has SSE2 or SSE3 or no SSE support you can install and run the cpuinfo script. For example on this machine:
$ pip install cpuinfo
$ python -m cpuinfo --sse
This CPU supports the SSE3 instruction set.
That means we can install numpy:sse3. """
Of course it would be a shame to have a solution that is so close to automatic without quite being automatic. Also the problem is that having no SSE support in the default numpy means that lots of people would lose out on optimisations. For example if numpy is installed as a dependency of something else then the user would always end up with the unoptimised no-SSE binary.
The other question I asked that made me realise the SSE information should be an optional part of the platform tag :)
Another possibility is that numpy could depend on the cpuinfo package so that it gets installed automatically before numpy. Then if the cpuinfo package has a traditional setup.py sdist (not a wheel) it could detect the CPU information at install time and store that in its package metadata. Then pip would be aware of this metadata and could use it to determine which wheel is appropriate.
I don't quite know if this would work but perhaps the cpuinfo could announce that it "Provides" e.g. cpuinfo:sse2. Then a numpy wheel could "Requires" cpuinfo:sse2 or something along these lines. Or perhaps this is better handled by the metadata extensions Nick suggested earlier in this thread.
I think it would be good to work out a way of doing this with e.g. a cpuinfo package. Many other packages beyond numpy could make good use of that metadata if it were available. Similarly having an extensible mechanism for selecting wheels based on additional information about the user's system could be used for many more things than just CPU architectures.
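Oscar's "Provides"/"Requires" matching idea could be sketched as a toy resolver like the following (the capability-tag syntax and the "prefer the most specific wheel" rule are hypothetical, not existing metadata):

```python
# Toy matcher: each candidate wheel declares the capability tags it
# requires (e.g. "cpuinfo:sse2"); an installed cpuinfo package declares
# what the current machine provides. All names here are invented.
def pick_wheel(candidate_wheels, provides):
    """candidate_wheels: {wheel filename: set of required capability tags};
    provides: set of capability tags detected on this machine.
    Prefer wheels with more (i.e. more specific) requirements."""
    ranked = sorted(candidate_wheels.items(),
                    key=lambda item: len(item[1]), reverse=True)
    for name, requires in ranked:
        if requires <= provides:  # all requirements satisfied
            return name
    return None
```

With such a scheme, pip could pick the sse2 wheel automatically once cpuinfo had recorded `cpuinfo:sse2` at install time.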
Yes, the lack of extensibility is the one concern I have with baking the CPU SSE info into the platform tag. On the other hand, the CPU architecture info is already in there, so appending the vectorisation support isn't an obviously bad idea, is orthogonal to the "python.expects" consistency enforcement metadata and would cover the NumPy use case, which is the one we really care about at this point. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 4 December 2013 12:10, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 4 December 2013 20:41, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a "variant" field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html
The variant field could be used to upload multiple variants, e.g.

numpy-1.7.1-cp27-cp27m-win32.whl
numpy-1.7.1-cp27-cp27m-win32-sse.whl
numpy-1.7.1-cp27-cp27m-win32-sse2.whl
numpy-1.7.1-cp27-cp27m-win32-sse3.whl

then if the user requests 'numpy:sse3' they will get the wheel with sse3 support.
That was what I was originally thinking for the variant field, but I later realised it makes more sense to treat the "variant" marker as part of the *platform* tag, rather than being an independent tag in its own right: https://bitbucket.org/pypa/pypi-metadata-formats/issue/15/enhance-the-platfo...
Under that approach, pip would figure out all the variants that applied to the current system (with some default preference order between variants for platforms where one system may support multiple variants). Using the Linux distro variants (based on ID and VERSION_ID in /etc/os-release) as an example rather than the Windows SSE variants, this might look like:
cp33-cp33m-linux_x86_64_fedora_19
cp33-cp33m-linux_x86_64_fedora
cp33-cp33m-linux_x86_64
I find that a bit strange to look at since I expect it to be like a taxonomic hierarchy, like so:

cp33-cp33m-linux
cp33-cp33m-linux_fedora
cp33-cp33m-linux_fedora_19
cp33-cp33m-linux_fedora_19_x86_64

Really you always need the architecture information though, so:

cp33-cp33m-linux_x86_64
cp33-cp33m-linux_fedora_x86_64
cp33-cp33m-linux_fedora_19_x86_64
The Windows SSE variants might look like:
cp33-cp33m-win32_sse3
cp33-cp33m-win32_sse2
cp33-cp33m-win32_sse
cp33-cp33m-win32
I would have thought something like:

cp33-cp33m-win32
cp33-cp33m-win32_nt
cp33-cp33m-win32_nt_vista
cp33-cp33m-win32_nt_vista_sp2

Also CPU information isn't hierarchical, so what happens when e.g. pyfftw wants to ship wheels with and without MMX instructions?
I think it would be good to work out a way of doing this with e.g. a cpuinfo package. Many other packages beyond numpy could make good use of that metadata if it were available. Similarly having an extensible mechanism for selecting wheels based on additional information about the user's system could be used for many more things than just CPU architectures.
Yes, the lack of extensibility is the one concern I have with baking the CPU SSE info into the platform tag. On the other hand, the CPU architecture info is already in there, so appending the vectorisation support isn't an obviously bad idea, is orthogonal to the "python.expects" consistency enforcement metadata and would cover the NumPy use case, which is the one we really care about at this point.
An extensible solution would be a big win. Maybe there should be an explicit metadata option that says "to get this piece of metadata you should install the following package and then run this command (without elevated privileges?)". Oscar
On 04.12.2013 11:41, Oscar Benjamin wrote:
On 4 December 2013 07:40, Ralf Gommers <ralf.gommers@gmail.com> wrote:
How do I package those three builds into wheels and get the right one installed by ``pip install numpy``?
This was discussed previously on this list: https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html
Essentially the current wheel format and specification does not provide a way to do this directly. There are several different possible approaches.
One possibility is that the wheel spec can be updated to include a post-install script (I believe this will happen eventually - someone correct me if I'm wrong). Then the numpy for Windows wheel can just do the same as the superpack installer: ship all variants, then delete/rename in a post-install script so that the correct variant is in place after install.
Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a "variant" field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html
The variant field could be used to upload multiple variants, e.g.

numpy-1.7.1-cp27-cp27m-win32.whl
numpy-1.7.1-cp27-cp27m-win32-sse.whl
numpy-1.7.1-cp27-cp27m-win32-sse2.whl
numpy-1.7.1-cp27-cp27m-win32-sse3.whl

then if the user requests 'numpy:sse3' they will get the wheel with sse3 support.
Why does numpy not create a universal distribution, where the actual extensions used are determined at runtime? This would simplify the installation (all the stuff that you describe would not be required). Another benefit would be for users that create and distribute 'frozen' executables (py2exe, py2app, cx_freeze, pyinstaller): the exe would work on any machine, independent of the SSE level. Thomas
On Wed, Dec 4, 2013 at 11:41 AM, Oscar Benjamin <oscar.j.benjamin@gmail.com>wrote:
On 4 December 2013 07:40, Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft <donald@stufft.io> wrote:
I’d love to get Wheels to the point they are more suitable than they are for SciPy stuff,
That would indeed be a good step forward. I'm interested to try to help get to that point for Numpy and Scipy.
Thanks Ralf. Please let me know what you think of the following.
I’m not sure what the diff between the current state and what they need to be is, but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :)
To start with, the SSE stuff. Numpy and scipy are distributed as "superpack" installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/.
How do I package those three builds into wheels and get the right one installed by ``pip install numpy``?
This was discussed previously on this list: https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html
Thanks, I'll go read that.

Essentially the current wheel format and specification does not provide a way to do this directly. There are several different possible approaches.
One possibility is that the wheel spec can be updated to include a post-install script (I believe this will happen eventually - someone correct me if I'm wrong). Then the numpy for Windows wheel can just do the same as the superpack installer: ship all variants, then delete/rename in a post-install script so that the correct variant is in place after install.
Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a "variant" field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html
The variant field could be used to upload multiple variants, e.g.

numpy-1.7.1-cp27-cp27m-win32.whl
numpy-1.7.1-cp27-cp27m-win32-sse.whl
numpy-1.7.1-cp27-cp27m-win32-sse2.whl
numpy-1.7.1-cp27-cp27m-win32-sse3.whl

then if the user requests 'numpy:sse3' they will get the wheel with sse3 support.
Of course, how would the user know if their CPU supports SSE3? I know roughly what SSE is but I don't know what level of SSE is available on each of the machines I use. There is a Python script/module in numexpr that can detect this: https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py
When I run that script on this machine I get:

$ python cpuinfo.py
CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2 is_32bit is_Core2 is_Intel is_i686
So perhaps someone could break that script out of numexpr and release it as a separate package on PyPI.
That's similar to what numpy has - actually it's a copy from numpy.distutils.cpuinfo
Then the instructions for installing numpy could be something like:

"""
You can install numpy with
$ pip install numpy
which will download the default version without any CPU-specific optimisations.
If you know what level of SSE support your CPU has then you can download a more optimised numpy with either of:
$ pip install numpy:sse2
$ pip install numpy:sse3
To determine whether or not your CPU has SSE2 or SSE3 or no SSE support you can install and run the cpuinfo script. For example on this machine:
$ pip install cpuinfo
$ python -m cpuinfo --sse
This CPU supports the SSE3 instruction set.
That means we can install numpy:sse3. """
The problem with all of the above is indeed that it's not quite automatic. You don't want your user to have to know or care about what SSE is. Nor do you want to create a new package just to hack around a pip limitation. I like the post-install (or pre-install) option much better.
Of course it would be a shame to have a solution that is so close to automatic without quite being automatic. Also the problem is that having no SSE support in the default numpy means that lots of people would lose out on optimisations. For example if numpy is installed as a dependency of something else then the user would always end up with the unoptimised no-SSE binary.
Another possibility is that numpy could depend on the cpuinfo package so that it gets installed automatically before numpy. Then if the cpuinfo package has a traditional setup.py sdist (not a wheel) it could detect the CPU information at install time and store that in its package metadata. Then pip would be aware of this metadata and could use it to determine which wheel is appropriate.
I don't quite know if this would work but perhaps the cpuinfo could announce that it "Provides" e.g. cpuinfo:sse2. Then a numpy wheel could "Requires" cpuinfo:sse2 or something along these lines. Or perhaps this is better handled by the metadata extensions Nick suggested earlier in this thread.
I think it would be good to work out a way of doing this with e.g. a cpuinfo package. Many other packages beyond numpy could make good use of that metadata if it were available. Similarly having an extensible mechanism for selecting wheels based on additional information about the user's system could be used for many more things than just CPU architectures.
I agree extensibility is quite important. Whatever scheme you'd think of with pre-defined tags will fail the next time anyone has a new idea (random example: what if we start shipping parallel sets of binaries that only differ in whether they're linked against ATLAS, OpenBLAS or MKL). Ralf
On 5 Dec 2013 07:29, "Ralf Gommers" <ralf.gommers@gmail.com> wrote:
On Wed, Dec 4, 2013 at 11:41 AM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On 4 December 2013 07:40, Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft <donald@stufft.io> wrote:
I’d love to get Wheels to the point they are more suitable than they are for SciPy stuff,
That would indeed be a good step forward. I'm interested to try to help get to that point for Numpy and Scipy.
Thanks Ralf. Please let me know what you think of the following.
I’m not sure what the diff between the current state and what they need to be is, but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :)
To start with, the SSE stuff. Numpy and scipy are distributed as "superpack" installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/.
How do I package those three builds into wheels and get the right one installed by ``pip install numpy``?
This was discussed previously on this list: https://mail.python.org/pipermail/distutils-sig/2013-August/022362.html
Thanks, I'll go read that.
Essentially the current wheel format and specification does not provide a way to do this directly. There are several different possible approaches.
One possibility is that the wheel spec can be updated to include a post-install script (I believe this will happen eventually - someone correct me if I'm wrong). Then the numpy for Windows wheel can just do the same as the superpack installer: ship all variants, then delete/rename in a post-install script so that the correct variant is in place after install.
Another possibility is that the pip/wheel/PyPI/metadata system can be changed to allow a "variant" field for wheels/sdists. This was also suggested in the same thread by Nick Coghlan: https://mail.python.org/pipermail/distutils-sig/2013-August/022432.html
The variant field could be used to upload multiple variants, e.g.

numpy-1.7.1-cp27-cp27m-win32.whl
numpy-1.7.1-cp27-cp27m-win32-sse.whl
numpy-1.7.1-cp27-cp27m-win32-sse2.whl
numpy-1.7.1-cp27-cp27m-win32-sse3.whl

then if the user requests 'numpy:sse3' they will get the wheel with sse3 support.
Of course, how would the user know if their CPU supports SSE3? I know roughly what SSE is but I don't know what level of SSE is available on each of the machines I use. There is a Python script/module in numexpr that can detect this: https://github.com/eleddy/numexpr/blob/master/numexpr/cpuinfo.py
When I run that script on this machine I get:

$ python cpuinfo.py
CPU information: CPUInfoBase__get_nbits=32 getNCPUs=2 has_mmx has_sse2 is_32bit is_Core2 is_Intel is_i686
So perhaps someone could break that script out of numexpr and release it as a separate package on PyPI.
That's similar to what numpy has - actually it's a copy from numpy.distutils.cpuinfo
Then the instructions for installing numpy could be something like:

"""
You can install numpy with
$ pip install numpy
which will download the default version without any CPU-specific optimisations.
If you know what level of SSE support your CPU has then you can download a more optimised numpy with either of:
$ pip install numpy:sse2
$ pip install numpy:sse3
To determine whether or not your CPU has SSE2 or SSE3 or no SSE support you can install and run the cpuinfo script. For example on this machine:
$ pip install cpuinfo
$ python -m cpuinfo --sse
This CPU supports the SSE3 instruction set.
That means we can install numpy:sse3. """
The problem with all of the above is indeed that it's not quite automatic. You don't want your user to have to know or care about what SSE is. Nor do you want to create a new package just to hack around a pip limitation. I like the post-install (or pre-install) option much better.
Of course it would be a shame to have a solution that is so close to automatic without quite being automatic. Also the problem is that having no SSE support in the default numpy means that lots of people would lose out on optimisations. For example if numpy is installed as a dependency of something else then the user would always end up with the unoptimised no-SSE binary.
Another possibility is that numpy could depend on the cpuinfo package so that it gets installed automatically before numpy. Then if the cpuinfo package has a traditional setup.py sdist (not a wheel) it could detect the CPU information at install time and store that in its package metadata. Then pip would be aware of this metadata and could use it to determine which wheel is appropriate.
I don't quite know if this would work but perhaps the cpuinfo could announce that it "Provides" e.g. cpuinfo:sse2. Then a numpy wheel could "Requires" cpuinfo:sse2 or something along these lines. Or perhaps this is better handled by the metadata extensions Nick suggested earlier in this thread.
I think it would be good to work out a way of doing this with e.g. a cpuinfo package. Many other packages beyond numpy could make good use of that metadata if it were available. Similarly having an extensible mechanism for selecting wheels based on additional information about the user's system could be used for many more things than just CPU architectures.
I agree extensibility is quite important. Whatever scheme you'd think of with pre-defined tags will fail the next time anyone has a new idea (random example: what if we start shipping parallel sets of binaries that only differ in whether they're linked against ATLAS, OpenBLAS or MKL).

Hmm, rather than adding complexity most folks don't need directly to the base wheel spec, here's a possible "multiwheel" notion - embed multiple wheels with different names inside the multiwheel, along with a self-contained selector function for choosing which ones to actually install on the current system. This could be used not only for the NumPy use case, but also allow the distribution of external dependencies while allowing their installation to be skipped if they're already present on the target system. Cheers, Nick.
Ralf
_______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
On 4 December 2013 23:31, Nick Coghlan <ncoghlan@gmail.com> wrote:
Hmm, rather than adding complexity most folks don't need directly to the base wheel spec, here's a possible "multiwheel" notion - embed multiple wheels with different names inside the multiwheel, along with a self-contained selector function for choosing which ones to actually install on the current system.
That sounds like a reasonable approach. I'd be willing to try to put together a proof of concept implementation, if people think it's viable. What would we need to push this forward? A new PEP?
This could be used not only for the NumPy use case, but also allow the distribution of external dependencies while allowing their installation to be skipped if they're already present on the target system.
I'm not sure how this would work - wheels don't seem to me to be appropriate for installing "external dependencies", but as I'm not 100% clear on what you mean by that term I may be misunderstanding. Can you provide a concrete example? Paul
On 5 December 2013 19:40, Paul Moore <p.f.moore@gmail.com> wrote:
On 4 December 2013 23:31, Nick Coghlan <ncoghlan@gmail.com> wrote:
Hmm, rather than adding complexity most folks don't need directly to the base wheel spec, here's a possible "multiwheel" notion - embed multiple wheels with different names inside the multiwheel, along with a self-contained selector function for choosing which ones to actually install on the current system.
That sounds like a reasonable approach. I'd be willing to try to put together a proof of concept implementation, if people think it's viable. What would we need to push this forward? A new PEP?
This could be used not only for the NumPy use case, but also allow the distribution of external dependencies while allowing their installation to be skipped if they're already present on the target system.
I'm not sure how this would work - wheels don't seem to me to be appropriate for installing "external dependencies", but as I'm not 100% clear on what you mean by that term I may be misunderstanding. Can you provide a concrete example?
If you put stuff in the "data" scheme dir, it allows you to install files anywhere you like relative to the installation root. That means you can already use the wheel format to distribute arbitrary files, you may just have to build it via some mechanism other than bdist_wheel. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 5 December 2013 09:52, Nick Coghlan <ncoghlan@gmail.com> wrote:
I'm not sure how this would work - wheels don't seem to me to be appropriate for installing "external dependencies", but as I'm not 100% clear on what you mean by that term I may be misunderstanding. Can you provide a concrete example?
If you put stuff in the "data" scheme dir, it allows you to install files anywhere you like relative to the installation root. That means you can already use the wheel format to distribute arbitrary files, you may just have to build it via some mechanism other than bdist_wheel.
Ah, OK. I see. Paul
On Dec 5, 2013, at 1:40 AM, Paul Moore <p.f.moore@gmail.com> wrote:
I'm not sure how this would work - wheels don't seem to me to be appropriate for installing "external dependencies", but as I'm not 100% clear on what you mean by that term
One of the key features of conda is that it is not specifically tied to Python - it can manage any binary package for a system. This is a key reason for its existence: Continuum wants to support its users with one way to install all the stuff they need to do their work, with one cross-platform solution. This includes not just libraries that Python extensions require, but also non-Python stuff like Fortran compilers, other languages (like R), or who knows what? As wheels and conda packages are both just archives, there's no reason wheel couldn't grow that capability - but I'm not at all sure we want it to. -Chris
Ralf,

Great to have you on this thread!

Note: supporting "variants" in one way or another is a great idea, but for right now, maybe we can get pretty far without it. There are options for "serious" scipy users that need optimum performance, and newbies that want the full stack. So our primary audience for "default" installs and PyPI wheels are folks that need the core packages (maybe a web dev that wants some MPL plots) and need things to "just work" more than anything optimized. So a lowest common denominator wheel would be very, very useful.

As for what that would be: the superpack is great, but it's been around a while (a long while in computer years). How many non-SSE machines are there still out there? How many non-SSE2? And how big is the performance boost anyway? What I'm getting at is that we may well be able to build a reasonable win32 binary wheel that we can put up on PyPI right now, with currently available tools. Then MPL and pandas and IPython... Scipy is trickier - what with the Fortran and all - but I think we could do Win32 anyway. And what's the hold-up with win64? Is that Fortran and scipy? If so, then why not do win64 for the rest of the stack? (I, for one, have been a heavy numpy user since the Numeric days, and I still hardly use scipy.)

By the way, we can/should do OS X too - it seems easier in fact (fewer hardware options to support, and the Mac's universal binaries). -Chris

Note on OS X: how long has it been since Apple shipped a 32-bit machine? Can we dump default 32-bit support? I'm pretty sure we don't need to do PPC anymore...

On Dec 3, 2013, at 11:40 PM, Ralf Gommers <ralf.gommers@gmail.com> wrote:

On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft <donald@stufft.io> wrote:
On Dec 3, 2013, at 7:36 PM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On 3 December 2013 21:13, Donald Stufft <donald@stufft.io> wrote:
I think Wheels are the way forward for Python dependencies. Perhaps not for things like fortran. I hope that the scientific community can start publishing wheels at least in addition too.
The Fortran issue is not that complicated. Very few packages are affected by it. It can easily be fixed with some kind of compatibility tag that can be used by the small number of affected packages.
I don't believe that Conda will gain the mindshare that pip has outside of the scientific community so I hope we don't end up with two systems that can't interoperate.
Maybe conda won't gain mindshare outside the scientific community but wheel really needs to gain mindshare *within* the scientific community. The root of all this is numpy. It is the biggest dependency on PyPI, is hard to build well, and has the Fortran ABI issue. It is used by very many people who wouldn't consider themselves part of the "scientific community". For example matplotlib depends on it. The PyPy devs have decided that it's so crucial to the success of PyPy that numpy's basically being rewritten in their stdlib (along with the C API).
A few times I've seen Paul Moore refer to numpy as the "litmus test" for wheels. I actually think that it's more important than that. If wheels are going to fly then there *needs* to be wheels for numpy. As long as there isn't a wheel for numpy then there will be lots of people looking for a non-pip/PyPI solution to their needs.
One way of getting the scientific community more on board here would be to offer them some tangible advantages. So rather than saying "oh well scientific use is a special case so they should just use conda or something", the message should be "the wheel system provides solutions to many long-standing problems and is even better than conda in (at least) some ways because it cleanly solves the Fortran ABI issue for example".
Oscar
I’d love to get Wheels to the point they are more suitable than they are for SciPy stuff,
That would indeed be a good step forward. I'm interested to try to help get to that point for Numpy and Scipy.

I’m not sure what the diff between the current state and what they need to be is, but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :)
To start with, the SSE stuff. Numpy and scipy are distributed as "superpack" installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/.

How do I package those three builds into wheels and get the right one installed by ``pip install numpy``?

If this is too difficult at the moment, an easier (but much less important) one would be to get the result of ``paver bdist_wininst_simple`` as a wheel. For now I think it's OK that the wheels would just target 32-bit Windows and python.org-compatible Pythons (given that that's all we currently distribute). Once that works we can look at OS X and 64-bit Windows.

Ralf
On Wed, Dec 4, 2013 at 5:05 PM, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
Ralf,
Great to have you on this thread!
Note: supporting "variants" in one way or another is a great idea, but for right now, maybe we can get pretty far without it.
There are options for "serious" scipy users that need optimum performance, and newbies that want the full stack.
So our primary audience for "default" installs and pypi wheels are folks that need the core packages ( maybe a web dev that wants some MPL plots) and need things to "just work" more than anything optimized.
The problem is explaining to people what they want - no one reads docs before grabbing a binary. On the other hand, using wheels does solve the issue that people download 32-bit installers for 64-bit Windows systems.
So a lowest common denominator wheel would be very, very, useful.
As for what that would be: the superpack is great, but it's been around a while (long while in computer years)
How many non-sse machines are there still out there? How many non-sse2?
Hard to tell. Probably <2%, but that's still too much. Some older Athlon XPs don't have it, for example. And what if someone submits performance optimizations (there has been a focus on those recently) to numpy that use SSE4 or AVX, for example? You don't want to reject those based on the limitations of your distribution process.

And how big is the performance boost anyway?
Large. For a long time we've put a non-SSE installer for numpy on PyPI so that people would stop complaining that ``easy_install numpy`` didn't work. Then there were regular complaints about dot products being an order of magnitude slower than Matlab or R.

What I'm getting at is that we may well be able to build a reasonable win32 binary wheel that we can put up on PyPI right now, with currently available tools.
Then MPL and pandas and I python...
Scipy is trickier-- what with the Fortran and all, but I think we could do Win32 anyway.
And what's the hold up with win64? Is that fortran and scipy? If so, then why not do win64 for the rest of the stack?
Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install DLLs from the binary, long story). A few people including David C are working on this issue right now. Visual Studio + Intel Fortran would work, but going with only an expensive toolset like that is kind of a no-go - especially since I think you'd force everyone else that builds other Fortran extensions to then also use the same toolset.

(I, for one, have been a heavy numpy user since the Numeric days, and I still hardly use scipy)
By the way, we can/should do OS-X too-- it seems easier in fact (fewer hardware options to support, and the Mac's universal binaries)
-Chris
Note on OS-X : how long has it been since Apple shipped a 32 bit machine? Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC anymore...
I'd like to, but we decided to ship the exact same set of binaries as python.org - which means compiling on OS X 10.5/10.6 and including PPC + 32-bit Intel. Ralf
On Dec 3, 2013, at 11:40 PM, Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Wed, Dec 4, 2013 at 1:54 AM, Donald Stufft <donald@stufft.io> wrote:
On Dec 3, 2013, at 7:36 PM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On 3 December 2013 21:13, Donald Stufft <donald@stufft.io> wrote:
I think Wheels are the way forward for Python dependencies. Perhaps not for things like fortran. I hope that the scientific community can start publishing wheels, at least in addition, too.
The Fortran issue is not that complicated. Very few packages are affected by it. It can easily be fixed with some kind of compatibility tag that can be used by the small number of affected packages.
I don't believe that Conda will gain the mindshare that pip has outside of the scientific community so I hope we don't end up with two systems that can't interoperate.
Maybe conda won't gain mindshare outside the scientific community but wheel really needs to gain mindshare *within* the scientific community. The root of all this is numpy. It is the biggest dependency on PyPI, is hard to build well, and has the Fortran ABI issue. It is used by very many people who wouldn't consider themselves part of the "scientific community". For example matplotlib depends on it. The PyPy devs have decided that it's so crucial to the success of PyPy that numpy's basically being rewritten in their stdlib (along with the C API).
A few times I've seen Paul Moore refer to numpy as the "litmus test" for wheels. I actually think that it's more important than that. If wheels are going to fly then there *needs* to be wheels for numpy. As long as there isn't a wheel for numpy then there will be lots of people looking for a non-pip/PyPI solution to their needs.
One way of getting the scientific community more on board here would be to offer them some tangible advantages. So rather than saying "oh well scientific use is a special case so they should just use conda or something", the message should be "the wheel system provides solutions to many long-standing problems and is even better than conda in (at least) some ways because it cleanly solves the Fortran ABI issue for example".
Oscar
I’d love to get Wheels to the point where they are more suitable than they currently are for SciPy stuff,
That would indeed be a good step forward. I'm interested to try to help get to that point for Numpy and Scipy.
I’m not sure what the diff is between the current state and what they need to be, but if someone spells it out (I’ve only just skimmed your last email so perhaps it’s contained in that!) I’ll do the arguing for it. I just need someone who actually knows what’s needed to advise me :)
To start with, the SSE stuff. Numpy and scipy are distributed as "superpack" installers for Windows containing three full builds: no SSE, SSE2 and SSE3. Plus a script that runs at install time to check which version to use. These are built with ``paver bdist_superpack``, see https://github.com/numpy/numpy/blob/master/pavement.py#L224. The NSIS and CPU selector scripts are under tools/win32build/.
How do I package those three builds into wheels and get the right one installed by ``pip install numpy``?
If this is too difficult at the moment, an easier (but much less important one) would be to get the result of ``paver bdist_wininst_simple`` as a wheel.
For now I think it's OK that the wheels would just target 32-bit Windows and python.org compatible Pythons (given that that's all we currently distribute). Once that works we can look at OS X and 64-bit Windows.
Ralf
_______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
On Wed, Dec 4, 2013 at 12:56 PM, Ralf Gommers <ralf.gommers@gmail.com> wrote:
The problem is explaining to people what they want - no one reads docs before grabbing a binary.
right -- so we want a default "pip install" install that will work for most people. And I think "works for most people" is far more important than "optimized for your system" How many non-sse machines are there still out there? How many non-sse2?
Hard to tell. Probably <2%, but that's still too much.
I have no idea how to tell, but I agree 2% is too much, however, 0.2% would not be too much (IMHO) -- anyway, I'm just wondering how much we are making this hard for very little return. Anyway, best would be a select-at-runtime option -- I think that's what MKL does. If someone can figure that out, great, but I still think a numpy wheel that works for most would still be worth doing, and we can do it now. Some older Athlon XPs don't have it for example. And what if someone
submits performance optimizations (there has been a focus on those recently) to numpy that use SSE4 or AVX for example? You don't want to reject those based on the limitations of your distribution process.
No, but we also don't want to distribute nothing because we can't distribute the best thing. And how big is the performance boost anyway?
Large. For a long time we've put a non-SSE installer for numpy on pypi so that people would stop complaining that ``easy_install numpy`` didn't work. Then there were regular complaints about dot products being an order of magnitude slower than Matlab or R.
Does SSE buy you that? Or do you need a good BLAS? But same point, anyway. Though I think we lose more users by people not getting an install at all than we lose by people installing and then finding out they need to install an optimized version to get a good "dot".
Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls from the binary, long story). A few people including David C are working on this issue right now. Visual Studio + Intel Fortran would work, but going with only an expensive toolset like that is kind of a no-go -
too bad there is no MS-fortran-express... On the other hand, saying "no one can have a 64 bit scipy, because people that want to build fortran extensions that are compatible with it are out of luck" is less than ideal. Right now, we are giving the majority of potential scipy users nothing for Win64. You know what they say: "done is better than perfect".
[Side note: scipy really shouldn't be a monolithic package with everything and the kitchen sink in it -- this would all be a lot easier if it was a namespace package and people could get the non-Fortran stuff by itself... but I digress.]
Note on OS-X: how long has it been since Apple shipped a 32 bit machine?
Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC anymore...
I'd like to, but we decided to ship the exact same set of binaries as python.org - which means compiling on OS X 10.5/10.6 and including PPC + 32-bit Intel.
no it doesn't -- if we decide not to ship the 3.9 (PPC + 32-bit Intel) binary -- why should that mean that we can't ship the Intel 32+64 bit one? And as for that -- if someone gets a binary with only 64 bit in it, it will run fine with the 32+64 bit build, as long as it's run on a 64 bit machine. So if, in fact, no one has a 32 bit Mac anymore (I'm not saying that's the case) we don't need to build for it.
And maybe the next python.org builds could be 64 bit Intel only. Probably not yet, but we shouldn't be locked in forever....
-Chris
-- Christopher Barker, Ph.D. Oceanographer
Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov
On Thu, Dec 5, 2013 at 1:09 AM, Chris Barker <chris.barker@noaa.gov> wrote:
On Wed, Dec 4, 2013 at 12:56 PM, Ralf Gommers <ralf.gommers@gmail.com> wrote:
The problem is explaining to people what they want - no one reads docs before grabbing a binary.
right -- so we want a default "pip install" install that will work for most people. And I think "works for most people" is far more important than "optimized for your system"
How many non-sse machines are there still out there? How many non-sse2?
Hard to tell. Probably <2%, but that's still too much.
I have no idea how to tell, but I agree 2% is too much, however, 0.2% would not be too much (IMHO) -- anyway, I'm just wondering how much we are making this hard for very little return.
I also don't know.
Anyway, best would be a select-at-runtime option -- I think that's what MKL does. If someone can figure that out, great, but I still think a numpy wheel that works for most would still be worth doing, and we can do it now.
I'll start playing with wheels in the near future.
Some older Athlon XPs don't have it for example. And what if someone
submits performance optimizations (there has been a focus on those recently) to numpy that use SSE4 or AVX for example? You don't want to reject those based on the limitations of your distribution process.
No, but we also don't want to distribute nothing because we can't distribute the best thing.
And how big is the performance boost anyway?
Large. For a long time we've put a non-SSE installer for numpy on pypi so that people would stop complaining that ``easy_install numpy`` didn't work. Then there were regular complaints about dot products being an order of magnitude slower than Matlab or R.
Does SSE buy you that? Or do you need a good BLAS? But same point, anyway. Though I think we lose more users by people not getting an install at all than we lose by people installing and then finding out they need to install an optimized version to get a good "dot".
Yes, 64-bit MinGW + gfortran doesn't yet work (no place to install dlls from the binary, long story). A few people including David C are working on this issue right now. Visual Studio + Intel Fortran would work, but going with only an expensive toolset like that is kind of a no-go -
too bad there is no MS-fortran-express...
On the other hand, saying "no one can have a 64 bit scipy, because people that want to build fortran extensions that are compatible with it are out of luck" is less than ideal. Right now, we are giving the majority of potential scipy users nothing for Win64.
There are multiple ways to get a win64 install - Anaconda, EPD, WinPython, Christoph's installers. So there's no big hurry here.
You know what they say "done is better than perfect"
[Side note: scipy really shouldn't be a monolithic package with everything and the kitchen sink in it -- this would all be a lot easier if it was a namespace package and people could get the non-Fortran stuff by itself...but I digress.]
Namespace packages have been tried with scikits - there's a reason why scikit-learn and statsmodels spent a lot of effort dropping them. They don't work. Scipy, while monolithic, works for users.
Note on OS-X : how long has it been since Apple shipped a 32 bit
machine? Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC anymore...
I'd like to, but we decided to ship the exact same set of binaries as python.org - which means compiling on OS X 10.5/10.6 and including PPC + 32-bit Intel.
no it doesn't -- if we decide not to ship the 3.9 (PPC + 32-bit Intel) binary -- why should that mean that we can't ship the Intel 32+64 bit one?
But we do ship the 32+64-bit one (at least for Python 2.7 and 3.3). So there shouldn't be any issue here. Ralf
And as for that -- if someone gets a binary with only 64 bit in it, it will run fine with the 32+64 bit build, as long as it's run on a 64 bit machine. So if, in fact, no one has a 32 bit Mac anymore (I'm not saying that's the case) we don't need to build for it.
And maybe the next python.org builds could be 64 bit Intel only. Probably not yet, but we shouldn't be locked in forever....
-Chris
On 5 December 2013 17:35, Ralf Gommers <ralf.gommers@gmail.com> wrote:
Namespace packages have been tried with scikits - there's a reason why scikit-learn and statsmodels spent a lot of effort dropping them. They don't work. Scipy, while monolithic, works for users.
The namespace package emulation that was all that was available in versions prior to 3.3 can certainly be a bit fragile at times. The native namespace packages in 3.3+ should be more robust (although even one package erroneously including an __init__.py file can still cause trouble). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
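[Editor's note: a minimal, self-contained sketch of the native namespace packages (PEP 420, Python 3.3+) mentioned above. The package name "nspkg" and the module names are made up for illustration; two sys.path entries each hold one portion of the package, and neither directory contains an __init__.py file.]

```python
import importlib
import os
import sys
import tempfile

# Build two separate directories that each contribute a portion of the
# same "nspkg" package, with no __init__.py anywhere.
base = tempfile.mkdtemp()
for part, mod in (("part_a", "mod1"), ("part_b", "mod2")):
    pkg_dir = os.path.join(base, part, "nspkg")
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, mod + ".py"), "w") as f:
        f.write("value = %r\n" % mod)
    sys.path.insert(0, os.path.join(base, part))

importlib.invalidate_caches()

# Both portions are importable under the single "nspkg" namespace.
m1 = importlib.import_module("nspkg.mod1")
m2 = importlib.import_module("nspkg.mod2")
print(m1.value, m2.value)
```

As the message notes, a stray __init__.py in any portion turns it back into a regular package and hides the other portions.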
On Dec 4, 2013, at 11:35 PM, Ralf Gommers <ralf.gommers@gmail.com> wrote:
I'm just wondering how much we are making this hard for very little return. I also don't know.
I wonder if a poll on the relevant lists would be helpful...
I'll start playing with wheels in the near future.
Great! Thanks!
There are multiple ways to get a win64 install - Anaconda, EPD, WinPython, Christoph's installers. So there's no big hurry here.
Well, this discussion is about pip-installability, but yes, some of those are python.org compatible: I know I always point people to Christoph's repo.
[Side note: scipy really shouldn't be a monolithic package with everything and the kitchen sink in it -- this would all be a lot easier if it was a namespace package and people could get the non-Fortran stuff by itself...but I digress.]
Namespace packages have been tried with scikits - there's a reason why scikit-learn and statsmodels spent a lot of effort dropping them. They don't work. Scipy, while monolithic, works for users.
True -- I've been trying out namespace packages for some far easier problems, and you're right -- not a robust solution. That really should be fixed -- but a whole new topic!
Note on OS-X: how long has it been since Apple shipped a 32 bit machine?
Can we dump default 32 bit support? I'm pretty sure we don't need to do PPC anymore...
I'd like to, but we decided to ship the exact same set of binaries as python.org - which means compiling on OS X 10.5/10.6 and including PPC + 32-bit Intel.
no it doesn't -- if we decide not to ship the 3.9 (PPC + 32-bit Intel) binary -- why should that mean that we can't ship the Intel 32+64 bit one?
But we do ship the 32+64-bit one (at least for Python 2.7 and 3.3). So there shouldn't be any issue here.
Right -- we just need the wheel. Which should be trivial for numpy on OS-X -- not the same SSE issues. Thanks for working on this. - Chris
On 4 December 2013 20:56, Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Wed, Dec 4, 2013 at 5:05 PM, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
So a lowest common denominator wheel would be very, very, useful.
As for what that would be: the superpack is great, but it's been around a while (long while in computer years)
How many non-sse machines are there still out there? How many non-sse2?
Hard to tell. Probably <2%, but that's still too much. Some older Athlon XPs don't have it for example. And what if someone submits performance optimizations (there has been a focus on those recently) to numpy that use SSE4 or AVX for example? You don't want to reject those based on the limitations of your distribution process.
And how big is the performance boost anyway?
Large. For a long time we've put a non-SSE installer for numpy on pypi so that people would stop complaining that ``easy_install numpy`` didn't work. Then there were regular complaints about dot products being an order of magnitude slower than Matlab or R.
Yes, I wouldn't want that kind of bad PR getting around about scientific Python -- "Python is slower than Matlab" etc.
It seems as if there is a need to extend the pip+wheel+PyPI system before this can fully work for numpy. I'm sure that the people here who have been working on all of this would be very interested to know what kinds of solutions would work best for numpy and related packages.
You mentioned in another message that a post-install script seems best to you. I suspect there is a little reluctance to go this way because one of the goals of the wheel system is to reduce the situation where users execute arbitrary code from the internet with admin privileges, e.g. "sudo pip install X" will download and run the setup.py from X with root privileges. Part of the point about wheels is that they don't need to be "executed" for installation. I know that post-install scripts are common in .deb and .rpm packages but I think that the use case there is slightly different, as the files are downloaded from controlled repositories whereas PyPI has no quality assurance.
BTW, how do the distros handle e.g. SSE? My understanding is that they just strip out all the SSE and related non-portable extensions and ship generic 686 binaries. My experience is with Ubuntu and I know they're not very good at handling BLAS with numpy, and they don't seem to be able to compile fftpack as well as Christoph can.
Perhaps a good near-term plan might be to:
1) Add the bdist_wheel command to numpy - which may actually be almost automatic with new enough setuptools/pip and wheel installed.
2) Upload wheels for OSX to PyPI - for OSX, SSE support can be inferred from OS version, which wheels can currently handle.
3) Upload wheels for Windows to somewhere other than PyPI, e.g. SourceForge, pending a distribution solution that can detect SSE support on Windows.
I think it would be good to have a go at wheels even if they're not fully ready for PyPI (just in case some other issue surfaces in the process).
Oscar
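[Editor's note: a small sketch of why point (2) of the plan above can work without new machinery. A wheel's platform tag is derived from the build platform string, and on OS X that string embeds the deployment target version, so an installer can decline wheels built for a newer OS X than the one it is running on. The tag-mangling rule shown (dashes and dots to underscores) follows PEP 425.]

```python
import sysconfig

# The build platform string, e.g. 'macosx-10.6-intel' or 'linux-x86_64'.
plat = sysconfig.get_platform()

# PEP 425 platform tags replace '-' and '.' with '_'.
wheel_tag = plat.replace("-", "_").replace(".", "_")
print(wheel_tag)
```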
On Thu, Dec 5, 2013 at 10:12 PM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On 4 December 2013 20:56, Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Wed, Dec 4, 2013 at 5:05 PM, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
So a lowest common denominator wheel would be very, very, useful.
As for what that would be: the superpack is great, but it's been around a while (long while in computer years)
How many non-sse machines are there still out there? How many non-sse2?
Hard to tell. Probably <2%, but that's still too much. Some older Athlon XPs don't have it for example. And what if someone submits performance optimizations (there has been a focus on those recently) to numpy that use SSE4 or AVX for example? You don't want to reject those based on the limitations of your distribution process.
And how big is the performance boost anyway?
Large. For a long time we've put a non-SSE installer for numpy on pypi so that people would stop complaining that ``easy_install numpy`` didn't work. Then there were regular complaints about dot products being an order of magnitude slower than Matlab or R.
Yes, I wouldn't want that kind of bad PR getting around about scientific Python "Python is slower than Matlab" etc.
It seems as if there is a need to extend the pip+wheel+PyPI system before this can fully work for numpy. I'm sure that the people here who have been working on all of this would be very interested to know what kinds of solutions would work best for numpy and related packages.
You mentioned in another message that a post-install script seems best to you. I suspect there is a little reluctance to go this way because one of the goals of the wheel system is to reduce the situation where users execute arbitrary code from the internet with admin privileges e.g. "sudo pip install X" will download and run the setup.py from X with root privileges. Part of the point about wheels is that they don't need to be "executed" for installation. I know that post-install scripts are common in .deb and .rpm packages but I think that the use case there is slightly different as the files are downloaded from controlled repositories whereas PyPI has no quality assurance.
I don't think it's avoidable - anything that is transparent to the user will have to execute code. The multiwheel idea of Nick looks good to me.
BTW, how do the distros handle e.g. SSE?
I don't know exactly to be honest.
My understanding is that they just strip out all the SSE and related non-portable extensions and ship generic 686 binaries. My experience is with Ubuntu and I know they're not very good at handling BLAS with numpy and they don't seem to be able to compile fftpack as well as Cristoph can.
Perhaps a good near-term plan might be to:
1) Add the bdist_wheel command to numpy - which may actually be almost automatic with new enough setuptools/pip and wheel installed.
2) Upload wheels for OSX to PyPI - for OSX, SSE support can be inferred from OS version, which wheels can currently handle.
3) Upload wheels for Windows to somewhere other than PyPI, e.g. SourceForge, pending a distribution solution that can detect SSE support on Windows.
That's a reasonable plan. I have an OS X wheel already, which required only a minor change to numpy's setup.py.
I think it would be good to have a go at wheels even if they're not fully ready for PyPI (just in case some other issue surfaces in the process).
Agreed. Ralf
On Dec 5, 2013, at 1:12 PM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Yes, I wouldn't want that kind of bad PR getting around about scientific Python "Python is slower than Matlab" etc.
Well, is that better or worse than 2% or fewer people finding they can't run it on their old machines....
It seems as if there is a need to extend the pip+wheel+PyPI system before this can fully work for numpy.
Maybe, in this case, but with the whole fortran ABI thing, yes.
You mentioned in another message that a post-install script seems best to you.
What would really be best is run-time selection of the appropriate lib -- it would solve this problem, and allow users to re-distribute working binaries via py2exe, etc. And not require opening a security hole in wheels... Not sure how hard that would be to do, though.
3) Upload wheels for Windows to somewhere other than PyPI e.g. SourceForge pending a distribution solution that can detect SSE support on Windows.
The hard-core "I want to use python instead of matlab" users are being re-directed to Anaconda or Canopy anyway. So maybe sub-optimal binaries on pypi are OK. By the way, anyone know what Anaconda and Canopy do about SSE and a good BLAS?
I think it would be good to have a go at wheels even if they're not fully ready for PyPI (just in case some other issue surfaces in the process).
Absolutely! - Chris
On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
What would really be best is run-time selection of the appropriate lib -- it would solve this problem, and allow users to re-distribute working binaries via py2exe, etc. And not require opening a security hole in wheels...
Not sure how hard that would be to do, though.
Install time selectors probably aren’t a huge deal as long as there’s a way to force a particular variant to install and to disable the executing code. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
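[Editor's note: a sketch of the "force a particular variant" requirement above. "WHEEL_VARIANT" is a hypothetical environment variable, not a real pip feature; the point is only that any install-time selector needs an override path that skips the detection code entirely.]

```python
import os

def select_variant(detected, override_env="WHEEL_VARIANT"):
    """Return the user-forced variant if set, otherwise the detected one."""
    forced = os.environ.get(override_env)
    return forced if forced else detected

# Detection wins when no override is present...
os.environ.pop("WHEEL_VARIANT", None)
print(select_variant("_sse2"))

# ...and the override wins when the user sets it.
os.environ["WHEEL_VARIANT"] = "_nosse"
print(select_variant("_sse2"))
```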
On Thu, Dec 5, 2013 at 5:52 PM, Donald Stufft <donald@stufft.io> wrote:
On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal < chris.barker@noaa.gov> wrote:
What would really be best is run-time selection of the appropriate lib -- it would solve this problem, and allow users to re-distribute working binaries via py2exe, etc. And not require opening a security hole in wheels...
Not sure how hard that would be to do, though.
Install time selectors probably aren’t a huge deal as long as there’s a way to force a particular variant to install and to disable the executing code.
I was proposing run-time -- so the same package would work "right" when moved to another machine via py2exe, etc. I imagine that's harder, particularly with permissions issues... -Chris
On 6 December 2013 11:52, Donald Stufft <donald@stufft.io> wrote:
On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
What would really be best is run-time selection of the appropriate lib -- it would solve this problem, and allow users to re-distribute working binaries via py2exe, etc. And not require opening a security hole in wheels...
Not sure how hard that would be to do, though.
Install time selectors probably isn’t a huge deal as long as there’s a way to force a particular variant to install and to disable the executing code.
Hmm, I just had an idea for how to do the runtime selection thing. It actually shouldn't be that hard, so long as the numpy folks are OK with a bit of __path__ manipulation in package __init__ modules. Specifically, what could be done is this:
- all of the built SSE level dependent modules would move out of their current package directories into a suitably named subdirectory (say "_nosse", "_sse2", "_sse3")
- in the __init__.py file for each affected subpackage, you would have a snippet like:

    numpy._add_sse_subdir(__path__)

where _add_sse_subdir would be something like:

    def _add_sse_subdir(search_path):
        if len(search_path) > 1:
            return  # Assume the SSE dependent dir has already been added
        # Could likely do this SSE availability check once at import time
        if _have_sse3():
            sub_dir = "_sse3"
        elif _have_sse2():
            sub_dir = "_sse2"
        else:
            sub_dir = "_nosse"
        main_dir = search_path[0]
        search_path.append(os.path.join(main_dir, sub_dir))

With that approach, the existing wheel model would work (no need for a variant system), and numpy installations could be freely moved between machines (or shared via a network directory). To avoid having the implicit namespace packages in 3.3+ cause any problems with this approach, the SSE subdirectories should contain __init__.py files that explicitly raise ImportError.
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
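[Editor's note: a runnable distillation of the selection logic in the message above. The _have_sse* checks in the original sketch are hypothetical helpers, so here the detected CPU capabilities are passed in as plain booleans.]

```python
import os

def pick_sse_subdir(have_sse2, have_sse3):
    """Choose the variant subdirectory for the detected CPU."""
    if have_sse3:
        return "_sse3"
    if have_sse2:
        return "_sse2"
    return "_nosse"

def add_sse_subdir(search_path, have_sse2, have_sse3):
    """Append the variant dir to a package __path__ list, at most once."""
    if len(search_path) > 1:
        return search_path  # assume the SSE dependent dir is already there
    main_dir = search_path[0]
    variant = pick_sse_subdir(have_sse2, have_sse3)
    search_path.append(os.path.join(main_dir, variant))
    return search_path

print(add_sse_subdir(["numpy"], have_sse2=True, have_sse3=False))
```

Calling add_sse_subdir a second time on the same list is a no-op, mirroring the length guard in the original sketch.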
On 06.12.2013 06:47, Nick Coghlan wrote:
On 6 December 2013 11:52, Donald Stufft <donald@stufft.io> wrote:
On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
What would really be best is run-time selection of the appropriate lib -- it would solve this problem, and allow users to re-distribute working binaries via py2exe, etc. And not require opening a security hole in wheels...
Not sure how hard that would be to do, though.
Install time selectors probably isn’t a huge deal as long as there’s a way to force a particular variant to install and to disable the executing code.
Hmm, I just had an idea for how to do the runtime selection thing. It actually shouldn't be that hard, so long as the numpy folks are OK with a bit of __path__ manipulation in package __init__ modules.
Manipulation of __path__ at runtime usually makes it harder for modulefinder to find all the required modules. Thomas
On 6 December 2013 17:10, Thomas Heller <theller@ctypes.org> wrote:
On 06.12.2013 06:47, Nick Coghlan wrote:
Hmm, I just had an idea for how to do the runtime selection thing. It actually shouldn't be that hard, so long as the numpy folks are OK with a bit of __path__ manipulation in package __init__ modules.
Manipulation of __path__ at runtime usually makes it harder for modulefinder to find all the required modules.
Not usually, always. That's why http://docs.python.org/2/library/modulefinder#modulefinder.AddPackagePath exists :) However, the interesting problem in this case is that we want to package 3 different versions of the modules, choosing one of them at runtime, and modulefinder definitely *won't* cope with that. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
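[Editor's note: a short sketch of the stdlib API discussed above. AddPackagePath registers an extra directory for a package so modulefinder can see modules that would otherwise only be found via runtime __path__ manipulation; the numpy path given here is purely illustrative.]

```python
import modulefinder
import os
import tempfile

# Declare an extra package directory up front (illustrative path).
modulefinder.AddPackagePath("numpy", "/opt/numpy/_sse2")

# Basic ModuleFinder usage: analyse a tiny throwaway script.
fd, script = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "w") as f:
    f.write("import os\nimport json\n")

finder = modulefinder.ModuleFinder()
finder.run_script(script)
os.unlink(script)

print(sorted(name for name in finder.modules if name in ("os", "json")))
```

As Nick points out, this only covers a statically declared extra directory; it cannot express "one of three variant directories, chosen at runtime".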
On 06.12.2013 13:22, Nick Coghlan wrote:
On 6 December 2013 17:10, Thomas Heller <theller@ctypes.org> wrote:
On 06.12.2013 06:47, Nick Coghlan wrote:
Hmm, I just had an idea for how to do the runtime selection thing. It actually shouldn't be that hard, so long as the numpy folks are OK with a bit of __path__ manipulation in package __init__ modules.
Manipulation of __path__ at runtime usually makes it harder for modulefinder to find all the required modules.
Not usually, always. That's why http://docs.python.org/2/library/modulefinder#modulefinder.AddPackagePath exists :)
Well, as the py2exe author and the (inactive, I admit) modulefinder maintainer I already know this.
However, the interesting problem in this case is that we want to package 3 different versions of the modules, choosing one of them at runtime, and modulefinder definitely *won't* cope with that.
The new importlib implementation in python3.3 offers a lot of new possibilities, and probably not all of them have been explored yet. For example, I have written a ModuleMapper object that, when inserted into sys.meta_path, allows transparent mapping of module names between Python2 and Python3 - no need to use six. And the new modulefinder(*) that I've written works great with that. Thomas (*) which will be part of py2exe for python3, but it is too late for python3.4.
On Fri, Dec 6, 2013 at 5:16 AM, Thomas Heller <theller@ctypes.org> wrote:
On 06.12.2013 13:22, Nick Coghlan wrote:
Manipulation of __path__ at runtime usually makes it harder for
modulefinder to find all the required modules.
Not usually, always. That's why http://docs.python.org/2/library/modulefinder#modulefinder.AddPackagePath exists :)
Well, as the py2exe author and the (inactive, I admit) modulefinder maintainer I already know this.
modulefinder fails often enough that I've never been able to package a non-trivial app without a bit of "force-include all of this package" (and don't-include this other thing!). So while too bad, this should not be considered a deal breaker.
-Chris
On Fri, Dec 6, 2013 at 6:47 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 6 December 2013 11:52, Donald Stufft <donald@stufft.io> wrote:
On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
What would really be best is run-time selection of the appropriate lib -- it would solve this problem, and allow users to re-distribute working binaries via py2exe, etc. And not require opening a security hole in wheels...
Not sure how hard that would be to do, though.
Install time selectors probably aren’t a huge deal as long as there’s a way to force a particular variant to install and to disable the executing code.
Hmm, I just had an idea for how to do the runtime selection thing. It actually shouldn't be that hard, so long as the numpy folks are OK with a bit of __path__ manipulation in package __init__ modules.
Specifically, what could be done is this:
- all of the built SSE level dependent modules would move out of their current package directories into a suitably named subdirectory (say "_nosse", "_sse2", "_sse3")
- in the __init__.py file for each affected subpackage, you would have a snippet like:
numpy._add_sse_subdir(__path__)
where _add_sse_subdir would be something like:
    def _add_sse_subdir(search_path):
        if len(search_path) > 1:
            return  # Assume the SSE dependent dir has already been added
        # Could likely do this SSE availability check once at import time
        if _have_sse3():
            sub_dir = "_sse3"
        elif _have_sse2():
            sub_dir = "_sse2"
        else:
            sub_dir = "_nosse"
        main_dir = search_path[0]
        search_path.append(os.path.join(main_dir, sub_dir))
With that approach, the existing wheel model would work (no need for a variant system), and numpy installations could be freely moved between machines (or shared via a network directory).
Hmm, taking a compile flag and encoding it in the package layout seems like a fundamentally wrong approach. And in order to not litter the source tree and all installs with lots of empty dirs, the changes to __init__.py will have to be made at build time based on whether you're building Windows binaries or something else. Path manipulation is usually fragile as well. So I suspect this is not going to fly. Ralf
To avoid having the implicit namespace packages in 3.3+ cause any problems with this approach, the SSE subdirectories should contain __init__.py files that explicitly raise ImportError.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 6 December 2013 17:21, Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Fri, Dec 6, 2013 at 6:47 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
With that approach, the existing wheel model would work (no need for a variant system), and numpy installations could be freely moved between machines (or shared via a network directory).
Hmm, taking a compile flag and encoding it in the package layout seems like a fundamentally wrong approach. And in order to not litter the source tree and all installs with lots of empty dirs, the changes to __init__.py will have to be made at build time based on whether you're building Windows binaries or something else. Path manipulation is usually fragile as well. So I suspect this is not going to fly.
In the absence of the perfect solution (i.e. picking the right variant out of no SSE, SSE2, SSE3 automatically), would it be a reasonable compromise to standardise on SSE2 as "lowest acceptable common denominator"?

Users with no SSE capability at all, or that wanted to take advantage of the SSE3 optimisations, would need to grab one of the Windows installers or something from conda, but for a lot of users, a "pip install numpy" that dropped the SSE2 version onto their system would be just fine, and a much lower barrier to entry than "well, first install this other packaging system that doesn't interoperate with your OS package manager at all...".

Are we letting perfect be the enemy of better, here? (punting on the question for 6 months and seeing if we can deal with the install-time variant problem in pip 1.6 is certainly an option, but if we don't *need* to wait that long...)

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
How does conda handle SSE vs SSE2 vs SSE3? I'm digging through its source code and just installed numpy with it and I can't seem to find any handling of that? On Dec 6, 2013, at 7:33 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 6 December 2013 17:21, Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Fri, Dec 6, 2013 at 6:47 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
With that approach, the existing wheel model would work (no need for a variant system), and numpy installations could be freely moved between machines (or shared via a network directory).
Hmm, taking a compile flag and encoding it in the package layout seems like a fundamentally wrong approach. And in order to not litter the source tree and all installs with lots of empty dirs, the changes to __init__.py will have to be made at build time based on whether you're building Windows binaries or something else. Path manipulation is usually fragile as well. So I suspect this is not going to fly.
In the absence of the perfect solution (i.e. picking the right variant out of no SSE, SSE2, SSE3 automatically), would it be a reasonable compromise to standardise on SSE2 as "lowest acceptable common denominator"?
Users with no sse capability at all or that wanted to take advantage of the SSE3 optimisations, would need to grab one of the Windows installers or something from conda, but for a lot of users, a "pip install numpy" that dropped the SSE2 version onto their system would be just fine, and a much lower barrier to entry than "well, first install this other packaging system that doesn't interoperate with your OS package manager at all...".
Are we letting perfect be the enemy of better, here? (punting on the question for 6 months and seeing if we can deal with the install-time variant problem in pip 1.6 is certainly an option, but if we don't *need* to wait that long...)
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Fri, Dec 6, 2013 at 12:44 PM, Donald Stufft <donald@stufft.io> wrote:
How does conda handle SSE vs SSE2 vs SSE3? I'm digging through its source code and just installed numpy with it and I can't seem to find any handling of that?
I can't speak for conda, but @enthought, we solve it by using the MKL, which selects the right implementation at runtime. Linux distributions have a system to cope with it (the hwcap capability of ld), but even there few packages use it. ATLAS and libc are the ones I am aware of. And this breaks anyway when you use static linking, obviously. David
On Dec 6, 2013, at 7:33 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 6 December 2013 17:21, Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Fri, Dec 6, 2013 at 6:47 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
With that approach, the existing wheel model would work (no need for a variant system), and numpy installations could be freely moved between machines (or shared via a network directory).
Hmm, taking a compile flag and encoding it in the package layout seems like a fundamentally wrong approach. And in order to not litter the source tree and all installs with lots of empty dirs, the changes to __init__.py will have to be made at build time based on whether you're building Windows binaries or something else. Path manipulation is usually fragile as well. So I suspect this is not going to fly.
In the absence of the perfect solution (i.e. picking the right variant out of no SSE, SSE2, SSE3 automatically), would it be a reasonable compromise to standardise on SSE2 as "lowest acceptable common denominator"?
Users with no sse capability at all or that wanted to take advantage of the SSE3 optimisations, would need to grab one of the Windows installers or something from conda, but for a lot of users, a "pip install numpy" that dropped the SSE2 version onto their system would be just fine, and a much lower barrier to entry than "well, first install this other packaging system that doesn't interoperate with your OS package manager at all...".
Are we letting perfect be the enemy of better, here? (punting on the question for 6 months and seeing if we can deal with the install-time variant problem in pip 1.6 is certainly an option, but if we don't *need* to wait that long...)
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
_______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
On Fri, Dec 6, 2013 at 1:33 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 6 December 2013 17:21, Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Fri, Dec 6, 2013 at 6:47 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
With that approach, the existing wheel model would work (no need for a variant system), and numpy installations could be freely moved between machines (or shared via a network directory).
Hmm, taking a compile flag and encoding it in the package layout seems like a fundamentally wrong approach. And in order to not litter the source tree and all installs with lots of empty dirs, the changes to __init__.py will have to be made at build time based on whether you're building Windows binaries or something else. Path manipulation is usually fragile as well. So I suspect this is not going to fly.
In the absence of the perfect solution (i.e. picking the right variant out of no SSE, SSE2, SSE3 automatically), would it be a reasonable compromise to standardise on SSE2 as "lowest acceptable common denominator"?
Maybe, yes. It's hard to figure out the impact of this, but I'll bring it up on the numpy list. If no one has a good way to get some statistics on CPUs that don't support these instruction sets, it may be worth a try for one of the Python versions to see how many users run into the issue. By accident we've released an incorrect binary once before, by the way (scipy 0.8.0 for Python 2.5), and that became a problem fairly quickly: https://github.com/scipy/scipy/issues/1697. That was 2010 though.
Users with no sse capability at all or that wanted to take advantage of the SSE3 optimisations, would need to grab one of the Windows installers or something from conda, but for a lot of users, a "pip install numpy" that dropped the SSE2 version onto their system would be just fine, and a much lower barrier to entry than "well, first install this other packaging system that doesn't interoperate with your OS package manager at all...".
Well, for most Windows users grabbing a .exe and clicking on it is a lower barrier than opening a console and typing "pip install numpy" :)
Are we letting perfect be the enemy of better, here? (punting on the question for 6 months and seeing if we can deal with the install-time variant problem in pip 1.6 is certainly an option, but if we don't *need* to wait that long...)
Let's first get the OS X wheels up, that can be done now. And then see what is decided on the numpy list for the compromise you propose above. Ralf
On Fri, Dec 6, 2013 at 4:33 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
In the absence of the perfect solution (i.e. picking the right variant out of no SSE, SSE2, SSE3 automatically), would it be a reasonable compromise to standardise on SSE2 as "lowest acceptable common denominator"?
+1
Users with no sse capability at all or that wanted to take advantage of the SSE3 optimisations, would need to grab one of the Windows installers or something from conda, but for a lot of users, a "pip install numpy" that dropped the SSE2 version onto their system would be just fine, and a much lower barrier to entry than "well, first install this other packaging system that doesn't interoperate with your OS package manager at all...".
exactly -- for example, I work with a web dev that could really use Matplotlib for a little task -- if I could tell him to "pip install matplotlib", he'd do it, but he just sees it as too much hassle at this point...
Are we letting perfect be the enemy of better, here?
I think so, yes.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception
Chris.Barker@noaa.gov
On Thu, Dec 5, 2013 at 11:21 PM, Ralf Gommers <ralf.gommers@gmail.com>wrote:
Hmm, taking a compile flag and encoding it in the package layout seems like a fundamentally wrong approach.
well, it's a pretty ugly hack, but sometimes an ugly hack that does the job is better than nothing. IIUC, the Intel MKL libs do some sort of dynamic switching at run time too -- and that is a great feature.
And in order to not litter the source tree and all installs with lots of empty dirs,
where "lots" is what, 3? Is that so bad in a project the size of numpy?

the changes to __init__.py will have to be made at build time based on whether you're building Windows binaries or something else.
That might in fact be nicer than the "litter", but also may be a less robust and more annoying way to do it.
Path manipulation is usually fragile as well.
My first instinct was that you'd re-name directories on the fly, which might be more robust, but wouldn't work in any kind of secure environment, so a no-go. But could you elaborate on the fragile nature of sys.path manipulation? What might go wrong there? Also, it's not out of the question that once such a system was in place, it could be used on systems other than Windows.... -Chris
On Fri, Dec 6, 2013 at 5:47 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 6 December 2013 11:52, Donald Stufft <donald@stufft.io> wrote:
On Dec 5, 2013, at 8:48 PM, Chris Barker - NOAA Federal <
chris.barker@noaa.gov> wrote:
What would really be best is run-time selection of the appropriate lib -- it would solve this problem, and allow users to re-distribute working binaries via py2exe, etc. And not require opening a security hole in wheels...
Not sure how hard that would be to do, though.
Install time selectors probably aren't a huge deal as long as there's a way to force a particular variant to install and to disable the executing code.
Hmm, I just had an idea for how to do the runtime selection thing. It actually shouldn't be that hard, so long as the numpy folks are OK with a bit of __path__ manipulation in package __init__ modules.
As Ralf said, I think it is overkill. The problem of SSE vs non SSE is because of one library, ATLAS, which has IMO the design flaw of being arch specific. I always hoped we could get away from this when I built those special installers for numpy :) MKL does not have this issue, and now that openblas (under a BSD license) can be used as well, we can alleviate this for deployment. Building a deployment story for this is not justified. David
Specifically, what could be done is this:
- all of the built SSE level dependent modules would move out of their current package directories into a suitably named subdirectory (say "_nosse", "_sse2", "_sse3")
- in the __init__.py file for each affected subpackage, you would have a snippet like:
numpy._add_sse_subdir(__path__)
where _add_sse_subdir would be something like:
import os

def _add_sse_subdir(search_path):
    if len(search_path) > 1:
        return  # Assume the SSE dependent dir has already been added
    # Could likely do this SSE availability check once at import time
    if _have_sse3():
        sub_dir = "_sse3"
    elif _have_sse2():
        sub_dir = "_sse2"
    else:
        sub_dir = "_nosse"
    main_dir = search_path[0]
    search_path.append(os.path.join(main_dir, sub_dir))
With that approach, the existing wheel model would work (no need for a variant system), and numpy installations could be freely moved between machines (or shared via a network directory).
To avoid having the implicit namespace packages in 3.3+ cause any problems with this approach, the SSE subdirectories should contain __init__.py files that explicitly raise ImportError.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 6 December 2013 13:06, David Cournapeau <cournape@gmail.com> wrote:
As Ralf said, I think it is overkill. The problem of SSE vs non SSE is because of one library, ATLAS, which has IMO the design flaw of being arch specific. I always hoped we could get away from this when I built those special installers for numpy :)
MKL does not have this issue, and now that openblas (under a BSD license) can be used as well, we can alleviate this for deployment. Building a deployment story for this is not justified.
Oh, okay that's great. How hard would it be to get openblas numpy wheels up and running? Would they be compatible with the existing scipy etc. binaries? Oscar
On Fri, Dec 6, 2013 at 2:48 PM, Oscar Benjamin <oscar.j.benjamin@gmail.com>wrote:
On 6 December 2013 13:06, David Cournapeau <cournape@gmail.com> wrote:
As Ralf said, I think it is overkill. The problem of SSE vs non SSE is because of one library, ATLAS, which has IMO the design flaw of being arch specific. I always hoped we could get away from this when I built those special installers for numpy :)
MKL does not have this issue, and now that openblas (under a BSD license) can be used as well, we can alleviate this for deployment. Building a deployment story for this is not justified.
Oh, okay that's great. How hard would it be to get openblas numpy wheels up and running? Would they be compatible with the existing scipy etc. binaries?
OpenBLAS is still pretty buggy compared to ATLAS (although performance in many cases seems to be on par); I don't think that will be well received for the official releases. We actually did discuss it as an alternative for Accelerate on OS X, but there was quite a bit of opposition. Ralf
On Fri, Dec 6, 2013 at 5:06 AM, David Cournapeau <cournape@gmail.com> wrote:
As Ralf said, I think it is overkill. The problem of SSE vs non SSE is because of one library, ATLAS, which has IMO the design flaw of being arch specific.
yup -- really designed for the end user to build it themselves....
MKL does not have this issue, and now that openblas (under a BSD license) can be used as well, we can alleviate this for deployment. Building a deployment story for this is not justified.
So OpenBLAS has run-time selection of the right binary? very cool! So are we done here? -Chris
On Fri, Dec 6, 2013 at 5:50 PM, Chris Barker <chris.barker@noaa.gov> wrote:
On Fri, Dec 6, 2013 at 5:06 AM, David Cournapeau <cournape@gmail.com>wrote:
As Ralf said, I think it is overkill. The problem of SSE vs non SSE is because of one library, ATLAS, which has IMO the design flaw of being arch specific.
yup -- really designed for the end user to build it themselves....
MKL does not have this issue, and now that openblas (under a BSD license) can be used as well, we can alleviate this for deployment. Building a deployment story for this is not justified.
So OpenBLAS has run-time selection of the right binary? very cool! So are we done here?
Not that I know of, but you can easily build one for a given architecture, which is essentially impossible to do reliably with ATLAS. I did not know about openblas instabilities, though. I guess we will have to do some more testing. David
On 12/01/2013 05:17 PM, Nick Coghlan wrote:
I see conda as existing at a similar level to apt and yum from a packaging point of view, with zc.buildout as a DIY equivalent at that level.
FTR: zc.buildout does nothing to insulate you from the need for a compiler; it does allow you to create repeatable builds from source for non-Python components which would otherwise vary with the underlying platform. The actual recipes for such components often involve a *lot* of yak shaving. ;)

Tres.

--
Tres Seaver +1 540-429-0999 tseaver@palladion.com
Palladion Software "Excellence by Design" http://palladion.com
2. For cross-platform handling of external binary dependencies, we recommend bootstrapping the open source conda toolchain, and using that to install pre-built binaries (currently administered by the Continuum Analytics folks).
We already have a recommendation for conda: "If you’re looking for management of fully integrated cross-platform software stacks". That's admittedly pretty abstract. I certainly see no harm in expanding this so that it does a better job of attracting users that would seriously benefit. Maybe even link to a section on "Conda Environments" on the Advanced page. I don't have a good sense for the scope of what's currently available in anaconda to know how many users it could help. I guess I'd have to see the wording and placement of the recommendation, to judge it fully. My concern is just to prevent a horde of users trying to use conda environments that don't really need it.
On 1 December 2013 04:15, Nick Coghlan <ncoghlan@gmail.com> wrote:
conda has its own binary distribution format, using hash based dependencies. It's this mechanism which allows it to provide reliable cross platform binary dependency management, but it's also the same mechanism that prevents low impact security updates and interoperability with platform provided packages.
Nick can you provide a link to somewhere that explains the hash based dependency thing please? I've read the following...

http://docs.continuum.io/conda/
https://speakerdeck.com/teoliphant/packaging-and-deployment-with-conda
http://docs.continuum.io/anaconda/index.html
http://continuum.io/blog/new-advances-in-conda
http://continuum.io/blog/conda
http://docs.continuum.io/conda/build.html

...but I see no reference to hash-based dependencies. In fact the only place I have seen a reference to hash-based dependencies is your comment at the bottom of this github issue: https://github.com/ContinuumIO/conda/issues/292

AFAICT conda/binstar are alternatives for pip/PyPI that happen to host binaries for some packages that don't have binaries on PyPI. (conda also provides a different - incompatible - take on virtualenvs but that's not relevant to this proposal.)

Oscar
On 3 December 2013 21:22, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
AFAICT conda/binstar are alternatives for pip/PyPI that happen to host binaries for some packages that don't have binaries on PyPI. (conda also provides a different - incompatible - take on virtualenvs but that's not relevant to this proposal).
It sounds like I may have been confusing two presentations at the packaging mini-summit, as I would have sworn conda used hashes to guarantee a consistent set of packages. I know I have mixed up features between hashdist and conda in the past (and there have been some NixOS features mixed in there as well), so it wouldn't be the first time that has happened - the downside of mining different distribution systems for ideas is that sometimes I forget where I encountered particular features :) If conda doesn't offer such an internal consistency guarantee for published package sets, then I agree with the criticism that it's just an alternative to running a private PyPI index server hosting wheel files pre-built with particular options, and thus it becomes substantially less interesting to me :( Under that model, what conda is doing is *already covered* in the draft metadata 2.0 spec (as of the changes I posted about the other day), since that now includes an "integrator suffix" (to indicate when a downstream rebuilder has patched the software), as well as a "python.integrator" metadata extension to give details of the rebuild. The namespacing in the wheel case is handled by not allowing rebuilds to be published on PyPI - they have to be published on a separate index server, and thus can be controlled based on where you tell pip to look. So, I apologise for starting the thread based on what appears to be a fundamentally false premise, although I think it has still been useful despite that error on my part (as the user confusion is real, even though my specific proposal no longer seems as useful as I first thought). 
I believe helping the conda devs to get it to play nice with virtual environments is still a worthwhile exercise though (even if just by pointing out areas where it *doesn't* currently interoperate well, as we've been doing in the last day or so), and if the conda bootstrapping issue is fixed by publishing wheels (or vendoring dependencies), then "try conda if there's no wheel" may still be a reasonable fallback recommendation. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 3 December 2013 11:54, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 3 December 2013 21:22, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
AFAICT conda/binstar are alternatives for pip/PyPI that happen to host binaries for some packages that don't have binaries on PyPI. (conda also provides a different - incompatible - take on virtualenvs but that's not relevant to this proposal).
It sounds like I may have been confusing two presentations at the packaging mini-summit, as I would have sworn conda used hashes to guarantee a consistent set of packages. I know I have mixed up features between hashdist and conda in the past (and there have been some NixOS features mixed in there as well), so it wouldn't be the first time that has happened - the downside of mining different distribution systems for ideas is that sometimes I forget where I encountered particular features :)
I had the same confusion with hashdist at the start of this thread when I said that conda was targeted at HPC. So if we both make the same mistake I guess it's forgiveable :)
If conda doesn't offer such an internal consistency guarantee for published package sets, then I agree with the criticism that it's just an alternative to running a private PyPI index server hosting wheel files pre-built with particular options, and thus it becomes substantially less interesting to me :(
Perhaps Travis who is still CC'ed here could comment on this since it is apparent that no one here really understands what conda is and he apparently works for Continuum Analytics so should (hopefully) know a little more...
Under that model, what conda is doing is *already covered* in the draft metadata 2.0 spec (as of the changes I posted about the other day), since that now includes an "integrator suffix" (to indicate when a downstream rebuilder has patched the software), as well as a "python.integrator" metadata extension to give details of the rebuild. The namespacing in the wheel case is handled by not allowing rebuilds to be published on PyPI - they have to be published on a separate index server, and thus can be controlled based on where you tell pip to look.
Do you mean to say that PyPI can (should) only host a binary-compatible set of wheels and that other index servers should do the same? I still think that there needs to be some kind of compatibility tags either way.
So, I apologise for starting the thread based on what appears to be a fundamentally false premise, although I think it has still been useful despite that error on my part (as the user confusion is real, even though my specific proposal no longer seems as useful as I first thought).
I believe helping the conda devs to get it to play nice with virtual environments is still a worthwhile exercise though (even if just by pointing out areas where it *doesn't* currently interoperate well, as we've been doing in the last day or so), and if the conda bootstrapping issue is fixed by publishing wheels (or vendoring dependencies), then "try conda if there's no wheel" may still be a reasonable fallback recommendation.
Well for a start conda (at least according to my failed build) over-writes the virtualenv activate scripts with its own scripts that do something completely different and can't even be called with the same signature. So it looks to me as if there is no intention of virtualenv compatibility.

As for "try conda if there's no wheel": according to what I've read, that seems to be what people who currently use conda do.

I thought about another thing during the course of this thread. To what extent can Provides/Requires help out with the binary incompatibility problems? For example numpy really does provide multiple interfaces:
1) An importable Python module that can be used from Python code.
2) A C-API that can be used by compiled C-extensions.
3) BLAS/LAPACK libraries with a particular Fortran ABI, exposed to any other libraries in the same process.

Perhaps the solution is that a build of a numpy wheel should clarify explicitly what it Provides at each level e.g.:

Provides: numpy
Provides: numpy-capi-v1
Provides: numpy-openblas-g77

Then a built wheel for scipy can Require the same things. Christoph Gohlke could provide a numpy wheel with:

Provides: numpy
Provides: numpy-capi-v1
Provides: numpy-intelmkl

And his scipy wheel can require the same. This would mean that pip would understand the binary dependency problems during dependency resolution, and could reject an incompatible wheel at install time as well as being able to find a compatible wheel automatically if one exists on the server. Unlike the hash-based dependencies, we can see that it is possible to depend on the numpy C-API without necessarily depending on any particular BLAS/LAPACK library and Fortran compiler combination.

The confusing part would be that then a built wheel doesn't Provide the same thing as the corresponding sdist. How would anyone know what would be Provided by an sdist without first building it into a wheel?
Would there need to be a way for pip to tell the sdist what pip wants it to Provide when building it? Oscar
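Oscar's Provides/Requires idea above can be sketched as a toy resolver check. The interface names like "numpy-capi-v1" come from his example; the `compatible` helper and the data layout are hypothetical, not anything pip actually implements:

```python
# Toy model of install-time binary-compatibility checking: each installed
# distribution declares the interfaces it Provides, and a candidate wheel
# is accepted only if everything it Requires is provided.

installed = {
    "numpy": {"numpy", "numpy-capi-v1", "numpy-openblas-g77"},
}

def compatible(requires):
    """True if every required interface is provided by some installed
    distribution."""
    provided = set().union(*installed.values())
    return all(req in provided for req in requires)

# A scipy wheel built against the same numpy configuration installs fine...
scipy_reqs = {"numpy", "numpy-capi-v1", "numpy-openblas-g77"}

# ...while one built against an MKL-based numpy would be rejected.
mkl_scipy_reqs = {"numpy", "numpy-capi-v1", "numpy-intelmkl"}
```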
On 3 December 2013 22:49, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On 3 December 2013 11:54, Nick Coghlan <ncoghlan@gmail.com> wrote:
I believe helping the conda devs to get it to play nice with virtual environments is still a worthwhile exercise though (even if just by pointing out areas where it *doesn't* currently interoperate well, as we've been doing in the last day or so), and if the conda bootstrapping issue is fixed by publishing wheels (or vendoring dependencies), then "try conda if there's no wheel" may still be a reasonable fallback recommendation.
Well for a start conda (at least according to my failed build) over-writes the virtualenv activate scripts with its own scripts that do something completely different and can't even be called with the same signature. So it looks to me as if there is no intention of virtualenv compatibility.
Historically there hadn't been much work in that direction, but I think there's been some increasing awareness of the importance of compatibility with the standard tools recently (I'm not certain, but the acceptance of PEP 453 may have had some impact there). I also consider Travis a friend, and have bent his ear over some of the compatibility issues, as well as the fact that pip has to handle additional usage scenarios that just aren't relevant to most of the scientific community, but are critical for professional application developers and system integrators :) The recent addition of "conda init" (in order to reuse a venv or virtualenv environment) was a big step in the right direction, and there's an issue filed about activate getting clobbered: https://github.com/ContinuumIO/conda/issues/374 (before conda init, you couldn't really mix conda and virtualenv, so the fact they both had activate scripts didn't matter. Now it does, since it affects the usability of conda init)
As for "try conda if there's no wheel" according to what I've read that seems to be what people who currently use conda do.
I thought about another thing during the course of this thread. To what extent can Provides/Requires help out with the binary incompatibility problems? For example numpy really does provide multiple interfaces:
1) An importable Python module that can be used from Python code.
2) A C-API that can be used by compiled C-extensions.
3) BLAS/LAPACK libraries with a particular Fortran ABI to any other libraries in the same process.
Perhaps the solution is that a build of a numpy wheel should clarify explicitly what it Provides at each level e.g.:
Provides: numpy
Provides: numpy-capi-v1
Provides: numpy-openblas-g77
Then a built wheel for scipy can Require the same things. Christoph Gohlke could provide a numpy wheel with:
Provides: numpy
Provides: numpy-capi-v1
Provides: numpy-intelmkl
Hmm, I likely wouldn't build it into the core requirement system (that all operates at the distribution level), but the latest metadata updates split out a bunch of the optional stuff to extensions (see https://bitbucket.org/pypa/pypi-metadata-formats/src/default/standard-metada...).

What we're really after at this point is the ability to *detect* conflicts if somebody tries to install incompatible builds into the same virtual environment (e.g. you installed from a custom index server originally, but later you forget and install from PyPI). So perhaps we could have a "python.expects" extension, where we can assert certain things about the metadata of other distributions in the environment.

So, say that numpy were to define a custom extension where they can define the exported binary interfaces:

"extensions": {
  "numpy.compatibility": {
    "api_version": 1,
    "fortran_abi": "openblas-g77"
  }
}

And for the Gohlke rebuilds:

"extensions": {
  "numpy.compatibility": {
    "api_version": 1,
    "fortran_abi": "intelmkl"
  }
}

Then another component might have in its metadata:

"extensions": {
  "python.expects": {
    "numpy": {
      "extensions": {
        "numpy.compatibility": {
          "fortran_abi": "openblas-g77"
        }
      }
    }
  }
}

The above would be read as: "this distribution expects the numpy distribution in this environment to publish the 'numpy.compatibility' extension in its metadata, with the 'fortran_abi' field set to 'openblas-g77'". If you attempted to install that component into an environment with the intelmkl Fortran ABI declared, it would fail, since the expectation wouldn't match the reality.
And his scipy wheel can require the same. This would mean that pip would understand the binary dependency problem during dependency resolution: it could reject an incompatible wheel at install time, and it could automatically find a compatible wheel if one exists on the server. Unlike hash-based dependencies, this makes it possible to depend on the numpy C-API without necessarily depending on any particular BLAS/LAPACK library and Fortran compiler combination.
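The "find a compatible wheel automatically" step could be sketched as a filter over candidate builds, each declaring the tags it Provides. The tag names follow the examples above; the data structures and the `compatible_wheels` function are illustrative, not pip's actual API:

```python
def compatible_wheels(candidates, required_tags):
    """Return the candidate builds whose provided tags satisfy every
    requirement, e.g. the numpy C-API plus a specific Fortran ABI."""
    return [c for c in candidates
            if set(required_tags) <= set(c["provides"])]

# Hypothetical index entries for two numpy builds.
candidates = [
    {"file": "numpy-openblas.whl",
     "provides": ["numpy", "numpy-capi-v1", "numpy-openblas-g77"]},
    {"file": "numpy-mkl.whl",
     "provides": ["numpy", "numpy-capi-v1", "numpy-intelmkl"]},
]

# A wheel that only needs the C-API matches both builds; one that also
# pins the Fortran ABI matches only the openblas build.
print([c["file"] for c in compatible_wheels(candidates, ["numpy", "numpy-capi-v1"])])
print([c["file"] for c in compatible_wheels(candidates, ["numpy-capi-v1", "numpy-openblas-g77"])])
```

This illustrates the point in the paragraph above: depending only on numpy-capi-v1 leaves the BLAS/LAPACK choice open, while adding an ABI tag narrows the selection.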
I like the general idea of being able to detect conflicts through the published metadata, but would like to use the extension mechanism to avoid name conflicts.
The confusing part would be that then a built wheel doesn't Provide the same thing as the corresponding sdist. How would anyone know what would be Provided by an sdist without first building it into a wheel? Would there need to be a way for pip to tell the sdist what pip wants it to Provide when building it?
I think that's a separate (harder) problem, but one that the expectation approach potentially solves, since we'd just have to provide a list of expectations for a distribution to the build process (and individual distributions would have full control over defining what expectations will influence the build process, most likely through custom extensions). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 3 December 2013 13:53, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 3 December 2013 22:49, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Hmm, I likely wouldn't build it into the core requirement system (that all operates at the distribution level), but the latest metadata updates split out a bunch of the optional stuff to extensions (see https://bitbucket.org/pypa/pypi-metadata-formats/src/default/standard-metada...). What we're really after at this point is the ability to *detect* conflicts if somebody tries to install incompatible builds into the same virtual environment (e.g. you installed from custom index server originally, but later you forget and install from PyPI).
So perhaps we could have a "python.expects" extension, where we can assert certain things about the metadata of other distributions in the environment. So, say that numpy were to define a custom extension where they can define the exported binary interfaces:
"extensions": { "numpy.compatibility": { "api_version": 1, "fortran_abi": "openblas-g77" } } [snip]
I like the general idea of being able to detect conflicts through the published metadata, but would like to use the extension mechanism to avoid name conflicts.
Helping to prevent broken installs in this way would definitely be an improvement. It would be a real shame though if PyPI contained all the metadata needed to match up compatible binary wheels but pip only used it to show error messages rather than to actually locate the wheel that the user wants. Oscar
On 4 Dec 2013 01:31, "Oscar Benjamin" <oscar.j.benjamin@gmail.com> wrote:
On 3 December 2013 13:53, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 3 December 2013 22:49, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Hmm, I likely wouldn't build it into the core requirement system (that all operates at the distribution level), but the latest metadata updates split out a bunch of the optional stuff to extensions (see https://bitbucket.org/pypa/pypi-metadata-formats/src/default/standard-metada...).
What we're really after at this point is the ability to *detect* conflicts if somebody tries to install incompatible builds into the same virtual environment (e.g. you installed from custom index server originally, but later you forget and install from PyPI).
So perhaps we could have a "python.expects" extension, where we can assert certain things about the metadata of other distributions in the environment. So, say that numpy were to define a custom extension where they can define the exported binary interfaces:
"extensions": { "numpy.compatibility": { "api_version": 1, "fortran_abi": "openblas-g77" } } [snip]
I like the general idea of being able to detect conflicts through the published metadata, but would like to use the extension mechanism to avoid name conflicts.
Helping to prevent broken installs in this way would definitely be an improvement. It would be a real shame though if PyPI contained all the metadata needed to match up compatible binary wheels but pip only used it to show error messages rather than to actually locate the wheel that the user wants.
One of the side benefits of the TUF end-to-end security proposal is that it allows for more reliable local metadata caching. That means that even if a mechanism like this was initially only used for conflict detection, it could eventually also be used to choose between multiple possible resolutions of a dependency. Tangentially related, I'm considering adding an additional recommended URL field to the "python.integrator" extension, which would be a pointer to a custom index server operated by the integrator. Cheers, Nick.
Oscar
If conda doesn't offer such an internal consistency guarantee for published package sets, then I agree with the criticism that it's just an alternative to running a private PyPI index server hosting wheel files pre-built with particular options, and thus it becomes substantially less interesting to me :(
Well, except that the anaconda index covers non-Python projects like "qt", which a private wheel index wouldn't cover (at least with the normal intended use of wheels).
On 4 Dec 2013 05:54, "Marcus Smith" <qwcode@gmail.com> wrote:
If conda doesn't offer such an internal consistency guarantee for published package sets, then I agree with the criticism that it's just an alternative to running a private PyPI index server hosting wheel files pre-built with particular options, and thus it becomes substantially less interesting to me :(
well, except that the anaconda index covers non-python projects like "qt", which a private wheel index wouldn't cover (at least with the normal intended use of wheels)

Ah, true - there's still the non-trivial matter of getting hold of the external dependencies *themselves*.

Anyway, this thread has at least satisfied me that we don't need to rush anything at this point - we can see how the conda folks go handling the interoperability issues, come up with an overview of the situation for creating and publishing binary extensions, keep working on getting the Python 3.4 + pip 1.5 combination out the door, and then decide later exactly how we think conda fits into the overall picture, as well as what influence the problems it solves for the scientific stack should have on the metadata 2.0 design. Cheers, Nick.
participants (14)
- Chris Barker
- Chris Barker - NOAA Federal
- Daniel Holth
- David Cournapeau
- Donald Stufft
- Marcus Smith
- Nick Coghlan
- Oscar Benjamin
- Pachi
- Paul Moore
- Ralf Gommers
- Thomas Heller
- Tres Seaver
- Vinay Sajip