[Distutils] Update to my skeletal PEP for a new build system interface

Nathaniel Smith njs at pobox.com
Mon Nov 9 00:20:10 EST 2015


Hi all,

Here's a quick update to my draft PEP for a new build system
interface, last seen here:
  https://mail.python.org/pipermail/distutils-sig/2015-October/027360.html

There isn't terribly much here, and Robert and I should really figure
out how to reconcile what we have, but since I was rearranging some
stuff anyway and prepping possible new sections, I figured I'd at
least post this. I addressed all the previous comments so hopefully it
is boring and non-controversial :-).

Changes:

- It wasn't clear that the sdist metadata stuff was really helping, so
I took it out for now. So it doesn't get lost, I split it out as a
standalone deferred-status PEP to the pypa repository:
  https://github.com/pypa/interoperability-peps/pull/57

- Rewrote stuff to deal with Paul's comments

- Added new terminology: "build frontend" for something like pip, and
"build backend" for the project specific hooks that pip calls. Seems
helpful.

-n

----

PEP: ??
Title: A build-system independent format for source trees
Version: $Revision$
Last-Modified: $Date$
Author: Nathaniel J. Smith <njs at pobox.com>
Status: Draft
Type: Standards-Track
Content-Type: text/x-rst
Created: 30-Sep-2015
Post-History: 1 Oct 2015, 25 Oct 2015
Discussions-To: <distutils-sig at python.org>

Abstract
========

While ``distutils`` / ``setuptools`` have taken us a long way, they
suffer from three serious problems: (a) they're missing important
features like usable build-time dependency declaration,
autoconfiguration, and even basic ergonomic niceties like `DRY
<https://en.wikipedia.org/wiki/Don%27t_repeat_yourself>`_-compliant
version number management, and (b) extending them is difficult, so
while there do exist various solutions to the above problems, they're
often quirky, fragile, and expensive to maintain, and yet (c) it's
very difficult to use anything else, because distutils/setuptools
provide the standard interface for installing packages expected by
both users and installation tools like ``pip``.

Previous efforts (e.g. distutils2 or setuptools itself) have attempted
to solve problems (a) and/or (b). This proposal aims to solve (c).

The goal of this PEP is to get distutils-sig out of the business of being
a gatekeeper for Python build systems. If you want to use distutils,
great; if you want to use something else, then that should be easy to
do using standardized methods. The difficulty of interfacing with
distutils means that there aren't many such systems right now, but to
give a sense of what we're thinking about, see `flit
<https://github.com/takluyver/flit>`_ or `bento
<https://cournape.github.io/Bento/>`_. Fortunately, wheels have now
solved many of the hard problems here -- e.g. it's no longer necessary
that a build system also know about every possible installation
configuration -- so pretty much all we really need from a build system
is that it have some way to spit out standard-compliant wheels and
sdists.

We therefore propose a new, relatively minimal interface for
installation tools like ``pip`` to interact with package source trees
and source distributions.


Terminology and goals
=====================

A *source tree* is something like a VCS checkout. We need a standard
interface for installing from this format, to support usages like
``pip install some-directory/``.

A *source distribution* is a static snapshot representing a particular
release of some source code, like ``lxml-3.4.4.zip``. Source
distributions serve many purposes: they form an archival record of
releases, they provide a stupid-simple de facto standard for tools
that want to ingest and process large corpora of code, possibly
written in many languages (e.g. code search), they act as the input to
downstream packaging systems like Debian/Fedora/Conda/..., and so
forth. In the Python ecosystem they additionally have a particularly
important role to play, because packaging tools like ``pip`` are able
to use source distributions to fulfill binary dependencies, e.g. if
there is a distribution ``foo.whl`` which declares a dependency on
``bar``, then we need to support the case where ``pip install bar`` or
``pip install foo`` automatically locates the sdist for ``bar``,
downloads it, builds it, and installs the resulting package.

Source distributions are also known as *sdists* for short.

A *build frontend* is a tool that users might run that takes arbitrary
source trees or source distributions and builds wheels from them. The
actual building is done by each source tree's *build backend*. In a
command like ``pip wheel some-directory/``, pip is acting as a build
frontend.

An *integration frontend* is a tool that users might run that takes a
set of package requirements (e.g. a requirements.txt file) and
attempts to update a working environment to satisfy those
requirements. This may require locating, building, and installing a
combination of wheels and sdists. In a command like ``pip install
lxml==2.4.0``, pip is acting as an integration frontend.


Source trees
============

We retroactively declare the legacy source tree format involving
``setup.py`` to be "version 0". We don't try to specify it further;
its de facto specification is encoded in the source code and
documentation of ``distutils``, ``setuptools``, ``pip``, and other
tools.

A "version 1" (or greater) source tree is any directory which contains
a file named ``pypackage.cfg``, which will -- in some manner whose
details are TBD -- describe the package's build dependencies and how
to invoke the project-specific build backend. This mechanism:

- Will allow for both static and dynamic specification of build dependencies

- Will have some degree of isolation of different builds from each
other, so that it will be possible for a single run of pip to install
one package that build-depends on ``foo == 1.1`` and another package
that build-depends on ``foo == 1.2``.

- Will leave the actual installation of the package in the hands of a
specialized installation tool like pip (i.e. individual package build
systems will not need to know about things like --user versus --global
or make decisions about when and how to modify .pth files)
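
As a rough illustration of the version split described above, a
frontend could tell the two formats apart by checking which marker file
is present. This is only a sketch: the function name
``detect_source_tree_version`` is hypothetical, and since the contents
of ``pypackage.cfg`` are still TBD, it does not try to distinguish
version 1 from any later version.

```python
import os


def detect_source_tree_version(path):
    """Return 1 for a "version 1 (or greater)" tree, 0 for a legacy tree.

    Hypothetical helper, not part of pip or any existing tool. It only
    looks at which marker file exists; it does not read pypackage.cfg,
    because that file's format has not been decided yet.
    """
    # The new-style marker takes precedence if both files are present.
    if os.path.isfile(os.path.join(path, "pypackage.cfg")):
        return 1
    if os.path.isfile(os.path.join(path, "setup.py")):
        return 0
    raise ValueError("not a recognized source tree: %r" % (path,))
```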

[TBD: the exact set of operations to be supported and their detailed semantics]

[TBD: should builds be performed in a fully isolated environment, or
should they get access to packages that are already installed in the
target install environment? The former simplifies a number of things,
but Robert was skeptical it would be possible.]

[TBD: the form of the communication channel between an installation
tool like ``pip`` and the build system, over which these operations
are requested]

[TBD: the syntactic details of the configuration file format itself.
We can change the name too if we want, I just think it's useful to
have a single name to refer to it for now, and this is the last and
least interesting thing to figure out.]


Source distributions
====================

For now, we continue with the legacy sdist format which is mostly
undefined, but basically comes down to: a file named
{NAME}-{VERSION}.{EXT}, which unpacks into a buildable source tree.
Traditionally these have always contained "version 0" source trees; we
now allow them to also contain version 1+ source trees.

Integration frontends require that an sdist named
{NAME}-{VERSION}.{EXT} will generate a wheel named
{NAME}-{VERSION}-{COMPAT-INFO}.whl.
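
The naming requirement above can be sketched as a simple filename
check. This is a simplification under stated assumptions: the helper
names are hypothetical, the list of archive extensions is illustrative
rather than exhaustive, and real wheel filenames apply additional
escaping rules to the name component that are ignored here.

```python
# Illustrative list of sdist archive extensions (an assumption).
SDIST_EXTENSIONS = (".tar.gz", ".tar.bz2", ".zip")


def sdist_stem(sdist_filename):
    """Strip the archive extension, leaving the stem shared with the wheel."""
    for ext in SDIST_EXTENSIONS:
        if sdist_filename.endswith(ext):
            return sdist_filename[: -len(ext)]
    raise ValueError("unrecognized sdist extension: %r" % (sdist_filename,))


def wheel_matches_sdist(sdist_filename, wheel_filename):
    """Check that the wheel name is the sdist stem plus "-{COMPAT-INFO}.whl"."""
    prefix = sdist_stem(sdist_filename) + "-"
    return (
        wheel_filename.endswith(".whl")
        and wheel_filename.startswith(prefix)
        # Require a non-empty {COMPAT-INFO} part between prefix and ".whl".
        and len(wheel_filename) > len(prefix) + len(".whl")
    )
```

For example, ``wheel_matches_sdist("lxml-3.4.4.zip",
"lxml-3.4.4-cp27-none-linux_x86_64.whl")`` would be accepted, while a
wheel built with a different version string would be rejected.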

[TBD: whether we want to adopt a new sdist format along with this --
my read of the room is that it's sounding like people are leaning
towards deferring that for a separate round of standardization, but
we'll see what we think once some of the important details above have
been hammered out]


Evolutionary notes
==================

A goal here is to make it as simple as possible to convert old-style
sdists to new-style sdists. (E.g., this is one motivation for
supporting dynamic build requirements.) The ideal would be that there
would be a single static pypackage.cfg that could be dropped into any
"version 0" VCS checkout to convert it to the new shiny. This is
probably not 100% possible, but we can get close, and it's important
to keep track of how close we are... hence this section.

A rough plan would be: Create a build system package
(``setuptools_pypackage`` or whatever) that knows how to speak
whatever hook language we come up with, and converts those hooks into
calls to ``setup.py``. This will probably require some sort of hooking or
monkeypatching to setuptools to provide a way to extract the
``setup_requires=`` argument when needed, and to provide a new version
of the sdist command that generates the new-style format. This all
seems doable and sufficient for a large proportion of packages (though
obviously we'll want to prototype such a system before we finalize
anything here). (Alternatively, these changes could be made to
setuptools itself rather than going into a separate package.)
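
To make the rough plan a little more concrete, here is a minimal sketch
of the translation step such a shim might perform. Everything on the
hook side is an assumption, since the hook language is TBD: the hook
names ``build_wheel`` and ``build_sdist`` and the function
``setup_py_command`` are invented for illustration. Only the ``sdist``
and ``bdist_wheel`` commands and their ``--dist-dir`` option are
existing ``setup.py`` features.

```python
import sys

# Hypothetical hook names (not defined by this PEP) mapped onto
# existing setup.py commands.
HOOK_TO_SETUP_PY_COMMAND = {
    "build_wheel": "bdist_wheel",
    "build_sdist": "sdist",
}


def setup_py_command(hook, output_dir):
    """Translate a (hypothetical) hook request into a setup.py argv list."""
    try:
        command = HOOK_TO_SETUP_PY_COMMAND[hook]
    except KeyError:
        raise ValueError("unsupported hook: %r" % (hook,))
    # Run setup.py with the same interpreter that is running the shim.
    return [sys.executable, "setup.py", command, "--dist-dir", output_dir]
```

A shim could then pass the resulting list to ``subprocess.check_call``
with the source tree as the working directory. Extracting
``setup_requires=`` is not covered by this sketch.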

But there remain two obstacles that mean we probably won't be able to
automatically upgrade packages to the new format:

1) There currently exist packages which insist on particular packages
being available in their environment before setup.py is executed. This
means that if we decide to execute build scripts in an isolated
virtualenv-like environment, then projects will need to check whether
they do this, and if so then when upgrading to the new system they
will have to start explicitly declaring these dependencies (either via
``setup_requires=`` or via static declaration in ``pypackage.cfg``).

2) There currently exist packages which do not declare consistent
metadata (e.g. ``egg_info`` and ``bdist_wheel`` might get different
``install_requires=``). When upgrading to the new system, projects
will have to evaluate whether this applies to them, and if so they
will need to stop doing that.


Copyright
=========

This document has been placed in the public domain.

-- 
Nathaniel J. Smith -- http://vorpus.org

