[Distutils] Proposed language for how build environments work in the new build system interface

Nathaniel Smith njs at pobox.com
Mon Nov 9 00:20:56 EST 2015

Hi all,

Following the strategy of trying to break out the different
controversial parts of the new build system interface, here's some
proposed text defining the environment that a build frontend like pip
provides to a project-specific build backend.

Robert's PEP currently disclaims all of this as out-of-scope, but I
think it's good to get something down, since in practice we'll have to
figure something out before any implementations can exist. And I think
the text below pretty much hits the right points.

What might be controversial about this nonetheless is that I'm not
sure that pip *can* reasonably implement all the requirements as
written without adding a dependency on virtualenv (at least for older
pythons -- obviously this is no big deal for new pythons since venv is
now part of the stdlib). I think the requirements are correct, so...
Donald, what do you think?



The build environment

One of the responsibilities of a build frontend is to set up the
environment in which the build backend will run.

We do not require that any particular "virtual environment" mechanism
be used; a build frontend might use virtualenv, or venv, or no special
mechanism at all. But whatever mechanism is used MUST meet the
following criteria:

- All requirements specified by the project's build-requirements must
  be available for import from Python.

- This must remain true even for new Python subprocesses spawned by
  the build environment, e.g. code like::

    import sys, subprocess
    subprocess.check_call([sys.executable, ...])

  must spawn a Python process which has access to all the project's
  build-requirements. This is necessary e.g. for build backends that
  want to run legacy ``setup.py`` scripts in a subprocess.

  [TBD: the exact wording here will probably need some tweaking
  depending on whether we end up using an entrypoint-like mechanism for
  specifying build backend hooks (in which case we can assume that hooks
  automatically have access to sys.executable), or a subprocess-based
  mechanism (in which case we'll need some other way to communicate the
  path to the python interpreter to the build backend, e.g. a PYTHON=
  envvar). But the basic requirement is pretty much the same either
  way.]

- All command-line scripts provided by the build-required packages
  must be present in the build environment's PATH. For example, if a
  project declares a build-requirement on `flit
  <https://flit.readthedocs.org/en/latest/>`_, then the following must
  work as a mechanism for running the flit command-line tool::

    import subprocess
    subprocess.check_call(["flit", ...])
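
As a rough illustration of the three criteria above, a frontend (or its
test suite) could sanity-check an environment along the following lines.
This is only a sketch: the name ``check_build_environment`` and its
arguments are made up for this example and are not part of any proposed
interface, and it assumes the top-level import names and script names
provided by the build-requirements are already known.

```python
import importlib.util
import shutil
import subprocess
import sys


def check_build_environment(modules, scripts):
    """Return a list of human-readable problems (empty if none found).

    Hypothetical helper: `modules` are top-level import names and
    `scripts` are command names the build-requirements should provide.
    """
    problems = []
    for name in modules:
        # Criterion 1: importable from the current Python process.
        if importlib.util.find_spec(name) is None:
            problems.append("cannot import %r in this process" % name)
        # Criterion 2: also importable from a freshly spawned interpreter.
        result = subprocess.run(
            [sys.executable, "-c", "import " + name],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            problems.append("cannot import %r in a subprocess" % name)
    for script in scripts:
        # Criterion 3: command-line scripts are reachable through PATH.
        if shutil.which(script) is None:
            problems.append("script %r not found on PATH" % script)
    return problems
```

For the flit example, ``check_build_environment(["flit"], ["flit"])``
would be expected to return an empty list in a compliant environment.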

A build backend MUST be prepared to function in any environment which
meets the above criteria. In particular, it MUST NOT assume that it
has access to any packages except those that are present in the
stdlib, or that are explicitly declared as build-requirements.

Recommendations for build frontends (non-normative)

A build frontend MAY use any mechanism for setting up a build
environment that meets the above criteria. For example, simply
installing all build-requirements into the global environment would be
sufficient to build any compliant package -- but this would be
sub-optimal for a number of reasons. This section contains
non-normative advice to frontend implementors.

A build frontend SHOULD, by default, create an isolated environment
for each build, containing only the standard library and any
explicitly requested build-dependencies. This has two benefits:

- It allows for a single installation run to build multiple packages
  that have contradictory build-requirements. E.g. if package1
  build-requires pbr==1.8.1, and package2 build-requires pbr==1.7.2,
  then these cannot both be installed simultaneously into the global
  environment -- which is a problem when the user requests ``pip install
  package1 package2``. Or if the user already has pbr==1.8.1 installed
  in their global environment, and a package build-requires pbr==1.7.2,
  then downgrading the user's version would be rather rude.

- It acts as a kind of public health measure to maximize the number of
  packages that actually do declare accurate build-dependencies. We can
  write all the strongly worded admonitions to package authors we want,
  but if build frontends don't enforce isolation by default, then we'll
  inevitably end up with lots of packages on PyPI that build fine on the
  original author's machine and nowhere else, which is a headache that
  no-one needs.
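
On Pythons that ship the stdlib ``venv`` module, per-build isolation
could look roughly like the sketch below. The helper names
(``scripts_dir``, ``create_isolated_env``, ``subprocess_environ``) are
invented for illustration; installing the build-requirements is left
out, and older Pythons without ``venv`` would need something else (e.g.
virtualenv).

```python
import os
import venv


def scripts_dir(env_dir):
    """Directory where a venv places its interpreter and scripts."""
    return os.path.join(env_dir, "Scripts" if os.name == "nt" else "bin")


def create_isolated_env(env_dir):
    """Create a fresh environment that cannot see global site-packages.

    Returns the path to the new environment's Python interpreter.
    """
    builder = venv.EnvBuilder(system_site_packages=False, with_pip=True)
    builder.create(env_dir)
    exe = "python.exe" if os.name == "nt" else "python"
    return os.path.join(scripts_dir(env_dir), exe)


def subprocess_environ(env_dir, base=None):
    """Copy of the environment with the build env's scripts first on PATH."""
    environ = dict(os.environ if base is None else base)
    old_path = environ.get("PATH", "")
    parts = [scripts_dir(env_dir)] + ([old_path] if old_path else [])
    environ["PATH"] = os.pathsep.join(parts)
    return environ
```

A frontend would then install the build-requirements using the
interpreter returned by ``create_isolated_env`` and run the backend with
``env=subprocess_environ(env_dir)``, so that both ``sys.executable`` and
PATH lookups resolve inside the isolated environment. Using one
directory per build is what lets package1 and package2 get different
pbr versions in the same installation run.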

However, there will also be situations where build-requirements are
problematic in various ways. For example, a package author might
accidentally leave off some crucial requirement despite our best
efforts; or, a package might declare a build-requirement on ``foo >=
1.0`` which worked great when 1.0 was the latest version, but now 1.1
is out and it has a showstopper bug; or, the user might decide to
build a package against numpy==1.7 -- overriding the package's
preferred numpy==1.8 -- to guarantee that the resulting build will be
compatible at the C ABI level with an older version of numpy (even if
this means the resulting build is unsupported upstream). Therefore,
build frontends SHOULD provide some mechanism for users to override
the above defaults. For example, a build frontend could have a
``--build-with-system-site-packages`` option that causes the
``--system-site-packages`` option to be passed to
virtualenv-or-equivalent when creating build environments, or a
``--build-requirements-override=my-requirements.txt`` option that
overrides the project's normal build-requirements.
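
To make the two example options concrete, here is one way a frontend
might wire them up with ``argparse`` and the stdlib ``venv`` module. The
option names are the ones used as examples above; everything else
(``make_parser``, ``make_env_builder``, the ``frontend`` program name)
is hypothetical, and this sketch does not read or apply the override
file.

```python
import argparse
import venv


def make_parser():
    parser = argparse.ArgumentParser(prog="frontend")
    parser.add_argument(
        "--build-with-system-site-packages",
        action="store_true",
        help="let build environments see globally installed packages",
    )
    parser.add_argument(
        "--build-requirements-override",
        metavar="FILE",
        default=None,
        help="requirements file to use instead of the project's own "
             "build-requirements",
    )
    return parser


def make_env_builder(options):
    # Isolation stays the default; the flag only relaxes it on request.
    return venv.EnvBuilder(
        system_site_packages=options.build_with_system_site_packages,
        with_pip=True,
    )
```

With no options given, the builder stays isolated; passing
``--build-with-system-site-packages`` is the only way to turn isolation
off, which keeps the hygiene default for authors while leaving the
escape hatch to end-users.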

The general principle here is that we want to enforce hygiene on
package *authors*, while still allowing *end-users* to open up the
hood and apply duct tape when necessary.

Nathaniel J. Smith -- http://vorpus.org
