[Distutils] Towards a simple and standard sdist format that isn't intertwined with distutils

Daniel Holth dholth at gmail.com
Fri Oct 2 16:12:27 CEST 2015

One way to do sdist 2.0 would be to have the package-1.0.dist-info
directory in there (most sdists contain setuptools metadata) and to have a
flag static-metadata=1 in setup.cfg asserting that setup.py [if present]
does not alter the list of dependencies.

In the old MEBS design the package could suggest a build system, but pip
would invoke a list of build plugins to inspect the directory and return
True if they were able to build the package. This would allow for ignoring
the package's suggested build system. Instead of defining a command-line
interface for setup.py, MEBS would define a set of methods on the build
plugin.

I thought Robert Collins had a working setup-requires implementation
already? I have a worse but backwards compatible one too at

On Fri, Oct 2, 2015 at 9:42 AM Marcus Smith <qwcode at gmail.com> wrote:

> Can you clarify the relationship to PEP426 metadata?
> There's no standard for metadata in here other than what's required to run
> a build hook.
> Does that imply you would have each build tool enforce their own
> convention for where metadata is found?
> On Thu, Oct 1, 2015 at 9:53 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> Hi all,
>> We realized that actually as far as we could tell, it wouldn't be that
>> hard at this point to clean up how sdists work so that it would be
>> possible to migrate away from distutils. So we wrote up a little draft
>> proposal.
>> The main question is, does this approach seem sound?
>> -n
>> ---
>> PEP: ??
>> Title: Standard interface for interacting with source trees
>>        and source distributions
>> Version: $Revision$
>> Last-Modified: $Date$
>> Author: Nathaniel J. Smith <njs at pobox.com>
>>         Thomas Kluyver <takowl at gmail.com>
>> Status: Draft
>> Type: Standards-Track
>> Content-Type: text/x-rst
>> Created: 30-Sep-2015
>> Post-History:
>> Discussions-To: <distutils-sig at python.org>
>> Abstract
>> ========
>> Distutils delenda est.
>> Extended abstract
>> =================
>> While ``distutils`` / ``setuptools`` have taken us a long way, they
>> suffer from three serious problems: (a) they're missing important
>> features like autoconfiguration and usable build-time dependency
>> declaration, (b) extending them is quirky, complicated, and fragile,
>> (c) you are forced to use them anyway, because they provide the
>> standard interface for installing python packages expected by both
>> users and installation tools like ``pip``.
>> Previous efforts (e.g. distutils2 or setuptools itself) have attempted
>> to solve problems (a) and/or (b). We propose to solve (c).
>> The goal of this PEP is to get distutils-sig out of the business of being
>> a gatekeeper for Python build systems. If you want to use distutils,
>> great; if you want to use something else, then the more the merrier.
>> The difficulty of interfacing with distutils means that there aren't
>> many such systems right now, but to give a sense of what we're
>> thinking about see `flit <https://github.com/takluyver/flit>`_ or
>> `bento
>> <https://cournape.github.io/Bento/>`_. Fortunately, wheels have now
>> solved many of the hard problems here -- e.g. it's no longer necessary
>> that a build system also know about every possible installation
>> configuration -- so pretty much all we really need from a build system
>> is that it have some way to spit out standard-compliant wheels.
>> We therefore propose a new, relatively minimal interface for
>> installation tools like ``pip`` to interact with package source trees
>> and source distributions.
>> Synopsis and rationale
>> ======================
>> To limit the scope of our design, we adopt several principles.
>> First, we distinguish between a *source tree* (e.g., a VCS checkout)
>> and a *source distribution* (e.g., an official snapshot release like
>> ``lxml-3.4.4.zip``).
>> There isn't a whole lot that *source trees* can be assumed to have in
>> common. About all you know is that they can -- via some more or less
>> Rube-Goldbergian process -- produce one or more binary distributions.
>> In particular, you *cannot* tell via simple static inspection:
>> - What version number will be attached to the resulting packages (e.g.
>> it might be determined programmatically by consulting VCS metadata --
>> I have here a build of numpy version "1.11.0.dev0+4a9ad17")
>> - What build- or run-time dependencies are required (e.g. these may
>> depend on arbitrarily complex configuration settings that are
>> determined via a mix of manual settings and auto-probing)
>> - Or even how many distinct binary distributions will be produced
>> (e.g. a source distribution may always produce wheel A, but only
>> produce wheel B when built on Unix-like systems).
>> Therefore, when dealing with source trees, our goal is just to provide
>> a standard UX for the core operations that are commonly performed on
>> other people's packages; anything fancier and more developer-centric
>> we leave at the discretion of individual package developers. So our
>> source trees just provide some simple hooks to let a tool like
>> ``pip``:
>> - query for build dependencies
>> - run a build, producing wheels as output
>> - set up the current source tree so that it can be placed on
>> ``sys.path`` in "develop mode"
>> and that's it. We teach users that the standard way to install a
>> package from a VCS checkout is now ``pip install .`` instead of
>> ``python setup.py install``. (This is already a good idea anyway --
>> e.g., pip can do reliable uninstall / upgrades.)
>> Next, we note that pretty much all the operations that you might want
>> to perform on a *source distribution* are also operations that you
>> might want to perform on a source tree, and via the same UX. The only
>> thing you do with source distributions that you don't do with source
>> trees is, well, distribute them. There are all kinds of metadata you
>> could imagine including in a source distribution, but each piece of
>> metadata puts an increased burden on source distribution generation
>> tools, and most operations will still have to work without this
>> metadata. So we only include extra metadata in source distributions if
>> it helps solve specific problems that are unique to distribution. If
>> you want wheel-style metadata, get a wheel and look at it -- they're
>> great and getting better.
>> Therefore, our source distributions are basically just source trees +
>> a mechanism for signing.
>> Finally: we explicitly do *not* have any concept of "depending on a
>> source distribution". As in other systems like Debian, dependencies
>> are always phrased in terms of binary distributions (wheels), and when
>> a user runs something like ``pip install <package>``, then the
>> long-run plan is that <package> and all its transitive dependencies
>> should be available as wheels in a package index. But this is not yet
>> realistic, so as a transitional / backwards-compatibility measure, we
>> provide a simple mechanism for ``pip install <package>`` to handle
>> cases where <package> is provided only as a source distribution.
>> Source trees
>> ============
>> We retroactively declare the legacy source tree format involving
>> ``setup.py`` to be "version 0". We don't try to specify it further;
>> its de facto specification is encoded in the source code of
>> ``distutils``, ``setuptools``, ``pip``, and other tools.
>> A version 1-or-greater format source tree can be identified by the
>> presence of a file ``_pypackage/_pypackage.cfg``.
>> If both ``_pypackage/_pypackage.cfg`` and ``setup.py`` are present,
>> then we have a version 1+ source tree, i.e., ``setup.py`` is ignored.
>> This is necessary because we anticipate that version 1+ source trees
>> may want to contain a ``setup.py`` file for backwards compatibility,
>> e.g.::
>>     #!/usr/bin/env python
>>     import sys
>>     print("Don't call setup.py directly!")
>>     print("Use 'pip install .' instead!")
>>     print("(You might have to upgrade pip first.)")
>>     sys.exit(1)
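The detection rule above is a single file-existence check; here is a minimal sketch (the function name ``source_tree_version`` is ours for illustration, not part of the proposal):

```python
# Minimal sketch of the version-detection rule: a tree is version 1+
# iff _pypackage/_pypackage.cfg exists; any setup.py is then ignored.
from pathlib import Path

def source_tree_version(tree):
    if (Path(tree) / "_pypackage" / "_pypackage.cfg").is_file():
        return 1   # version 1+ tree; setup.py, if present, is ignored
    return 0       # legacy ("version 0") setup.py-based tree
```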
>> In the current version of the specification, the one file
>> ``_pypackage/_pypackage.cfg`` is where pretty much all the action is
>> (though see below). The motivation for putting it into a subdirectory
>> is that:
>> - the way of all standards is that cruft accumulates over time, so
>> this way we pre-emptively have a place to put it,
>> - real-world projects often accumulate build system cruft as well, so
>> we might as well provide one obvious place to put it too.
>> Of course this then creates the possibility of collisions between
>> standard files and user files, and trying to teach arbitrary users not
>> to scatter files around willy-nilly never works, so we adopt the
>> convention that names starting with an underscore are reserved for
>> official use, and non-underscored names are available for
>> idiosyncratic use by individual projects.
>> The alternative would be to simply place the main configuration file
>> at the top-level, create the subdirectory only when specifically
>> needed (most trees won't need it), and let users worry about finding
>> their own place for their cruft. Not sure which is the best approach.
>> Plus we can have a nice bikeshed about the names in general (FIXME).
>> _pypackage.cfg
>> --------------
>> The ``_pypackage.cfg`` file contains various settings. Another good
>> bike-shed topic is which file format to use for storing these (FIXME),
>> but for purposes of this draft I'll write examples using `toml
>> <https://github.com/toml-lang/toml>`_, because you'll instantly be
>> able to understand the semantics, it has similar expressivity to JSON
>> while being more human-friendly (e.g., it supports comments and
>> multi-line strings), it's better-specified than ConfigParser, and it's
>> much simpler than YAML. Rust's package manager uses toml for similar
>> purposes.
>> Here's an example ``_pypackage/_pypackage.cfg``::
>>     # Version of the "pypackage format" that this file uses.
>>     # Optional. If not present then 1 is assumed.
>>     # All version changes indicate incompatible changes; backwards
>>     # compatible changes are indicated by just having extra stuff in
>>     # the file.
>>     version = 1
>>     [build]
>>     # An inline requirements file. Optional.
>>     # (FIXME: I guess this means we need a spec for requirements files?)
>>     requirements = """
>>         mybuildtool >= 2.1
>>         special_windows_tool ; sys_platform == "win32"
>>     """
>>     # The path to an out-of-line requirements file. Optional.
>>     requirements-file = "build-requirements.txt"
>>     # A hook that will be called to query build requirements. Optional.
>>     requirements-dynamic = "mybuildtool:get_requirements"
>>     # A hook that will be called to build wheels. Required.
>>     build-wheels = "mybuildtool:do_build"
>>     # A hook that will be called to do an in-place build (see below).
>>     # Optional.
>>     build-in-place = "mybuildtool:do_inplace_build"
>>     # The "x" namespace is reserved for third-party extensions.
>>     # To use x.foo you should own the name "foo" on pypi.
>>     [x.mybuildtool]
>>     spam = ["spam", "spam", "spam"]
>> All paths are relative to the ``_pypackage/`` directory (so e.g. the
>> build.requirements-file value above refers to a file named
>> ``_pypackage/build-requirements.txt``).
>> A *hook* is a Python object that is looked up using the same rules as
>> traditional setuptools entry_points: a dotted module name, followed by
>> a colon, followed by a dotted name that is looked up within that
>> module. *Running a hook* means: first, find or create a python
>> interpreter which is executing in the current venv, whose working
>> directory is set to the ``_pypackage/`` directory, and which has the
>> ``_pypackage/`` directory on ``sys.path``. Then, inside this
>> interpreter, look up the hook object, and call it, with arguments as
>> specified below.
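The lookup rule is the same one setuptools entry_points use, and fits in a few lines; ``resolve_hook`` is a hypothetical helper name, not something any tool defines today:

```python
# Sketch of the hook lookup rule: a dotted module name, a colon, then a
# dotted attribute path resolved within that module.
import importlib

def resolve_hook(spec):
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    for attr in attr_path.split("."):
        obj = getattr(obj, attr)
    return obj

# e.g. resolve_hook("os.path:join") returns the os.path.join function
```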
>> A build command like ``pip wheel <source tree>`` performs the following
>> steps:
>> 1) Validate the ``_pypackage.cfg`` version number.
>> 2) Create an empty virtualenv / venv, that matches the environment
>> that the installer is targeting (e.g. if you want wheels for CPython
>> 3.4 on 64-bit windows, then you make a CPython 3.4 64-bit windows
>> venv).
>> 3) If the build.requirements key is present, then in this venv run the
>> equivalent of ``pip install -r <a file containing its value>``, using
>> whatever index settings are currently in effect.
>> 4) If the build.requirements-file key is present, then in this venv
>> run the equivalent of ``pip install -r <the named file>``, using
>> whatever index settings are currently in effect.
>> 5) If the build.requirements-dynamic key is present, then in this venv
>> run the hook with no arguments, capture its stdout, and pipe it into
>> ``pip install -r -``, using whatever index settings are currently in
>> effect. If the hook raises an exception, then abort the build with an
>> error.
>>    Note: because these steps are performed in sequence, the
>> build.requirements-dynamic hook is allowed to use packages that are
>> listed in build.requirements or build.requirements-file.
>> 6) In this venv, run the build.build-wheels hook. This should be a
>> Python function which takes one argument.
>>    This argument is an arbitrary dictionary intended to contain
>> user-specified configuration, specified via some install-tool-specific
>> mechanism. The intention is that tools like ``pip`` should provide
>> some way for users to specify key/value settings that will be passed
>> in here, analogous to the legacy ``--install-option`` and
>> ``--global-option`` arguments.
>>    To make it easier for packages to transition from version 0 to
>> version 1 sdists, we suggest that ``pip`` and other tools that have
>> such existing option-setting interfaces SHOULD map them to entries in
>> this dictionary -- e.g.::
>>        pip --global-option=a --install-option=b --install-option=c
>>    could produce a dict like::
>>        {"--global-option": ["a"], "--install-option": ["b", "c"]}
>>    The hook's return value is a list of pathnames relative to the
>> scratch directory. Each entry names a wheel file created by this
>> build.
>>    Errors are signaled by raising an exception.
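The suggested option mapping could be sketched like this; ``collect_build_options`` is a made-up name, since pip defines no such function today:

```python
# Sketch of mapping legacy --global-option / --install-option flags into
# the user-configuration dict passed to the build-wheels hook.
def collect_build_options(argv):
    config = {}
    for arg in argv:
        key, sep, value = arg.partition("=")
        if sep and key in ("--global-option", "--install-option"):
            config.setdefault(key, []).append(value)
    return config

# collect_build_options(["--global-option=a", "--install-option=b",
#                        "--install-option=c"])
# -> {"--global-option": ["a"], "--install-option": ["b", "c"]}
```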
>> When performing an in-place build (e.g. for ``pip install -e .``),
>> then the same steps are followed, except that instead of the
>> build.build-wheels hook, we call the build.build-in-place hook, and
>> instead of returning a list of wheel files, it returns the name of a
>> directory that should be placed onto ``sys.path`` (usually this will
>> be the source tree itself, but may not be, e.g. if a build system
>> wants to enforce a rule where the source is always kept pristine then
>> it could symlink the .py files into a build directory, place the
>> extension modules and dist-info there, and return that). This
>> directory must contain importable versions of the code in the source
>> tree, along with appropriate .dist-info directories.
>> (FIXME: in-place builds are useful but intrinsically kinda broken --
>> e.g. extensions / source / metadata can all easily get out of sync --
>> so while I think this paragraph provides a reasonable hack that
>> preserves current functionality, maybe we should defer specifying them
>> until after we've thought through the issues more?)
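For a pure-Python project the in-place hook can be nearly trivial, since the source tree itself is already importable. A hypothetical sketch (package name and version are made up; per the spec above, hooks run with ``_pypackage/`` as the working directory):

```python
# Hypothetical build.build-in-place hook: write a dist-info directory
# next to the code and return the tree itself for sys.path.
from pathlib import Path

def do_inplace_build(config):
    tree = Path.cwd().parent            # hooks run inside _pypackage/
    dist_info = tree / "mypkg-1.0.dist-info"
    dist_info.mkdir(exist_ok=True)
    (dist_info / "METADATA").write_text(
        "Metadata-Version: 2.1\nName: mypkg\nVersion: 1.0\n")
    return str(tree)                    # directory to place on sys.path
```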
>> When working with source trees, build tools like ``pip`` are
>> encouraged to cache and re-use virtualenvs for performance.
>> Other contents of _pypackage/
>> -----------------------------
>> _RECORD, _RECORD.jws, _RECORD.p7s: see below.
>> _x/<pypi name>/: reserved for use by tools (e.g.
>> _x/mybuildtool/build/, _x/pip/venv-cache/cp34-none-linux_x86_64/)
>> Source distributions
>> ====================
>> A *source distribution* is a file in a well-known archive format such
>> as zip or tar.gz, which contains a single directory, and this
>> directory is a source tree (in the sense defined in the previous
>> section).
>> The ``_pypackage/`` directory in a source distribution SHOULD also
>> contain a _RECORD file, as defined in PEP 427, and MAY also contain
>> _RECORD.jws and/or _RECORD.p7s signature files.
>> For official releases, source distributions SHOULD be named as
>> ``<package>-<version>.<ext>``, and the directory they contain SHOULD
>> be named ``<package>-<version>``, and building this source tree SHOULD
>> produce a wheel named ``<package>-<version>-<compatibility tag>.whl``
>> (though it may produce other wheels as well).
>> (FIXME: maybe we should add that if you want your sdist on PyPI then
>> you MUST include a proper _RECORD file and use the proper naming
>> convention?)
>> Integration tools like ``pip`` SHOULD take advantage of this
>> convention by applying the following heuristic: when seeking a package
>> <package>, if no appropriate wheel can be found, but an sdist named
>> <package>-<version>.<ext> is found, then:
>> 1) build the sdist
>> 2) add the resulting wheels to the package search space
>> 3) retry the original operation
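The heuristic amounts to a retry loop. A toy runnable sketch, where ``Index`` and ``build_sdist`` are stand-ins rather than pip internals (the "index" is just two sets of filenames, and "building" fabricates a wheel name):

```python
# Toy sketch of the transitional fallback: no wheel found -> build the
# matching sdist, add its wheels to the search space, retry.
class Index:
    def __init__(self, wheels=(), sdists=()):
        self.wheels, self.sdists = set(wheels), set(sdists)

    def find(self, pool, name):
        return next((f for f in pool if f.startswith(name + "-")), None)

def build_sdist(sdist):
    # Pretend "foo-1.0.zip" builds a single matching wheel.
    return [sdist.rsplit(".", 1)[0] + "-py3-none-any.whl"]

def acquire(index, name):
    wheel = index.find(index.wheels, name)
    if wheel is None:
        sdist = index.find(index.sdists, name)
        if sdist is not None:
            index.wheels.update(build_sdist(sdist))   # 1) build, 2) add
            wheel = index.find(index.wheels, name)    # 3) retry
    return wheel
```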
>> This handles a variety of simple and complex cases -- for example, if
>> we need a package 'foo', and we find foo-1.0.zip which builds foo.whl
>> and bar.whl, and foo.whl depends on bar.whl, then everything will work
>> out. There remain other cases that are not handled, e.g. if we start
>> out searching for bar.whl we will never discover foo-1.0.zip. We take
>> the perspective that this is nonetheless sufficient for a transitional
>> heuristic, and anyone who runs into this problem should just upload
>> wheels already. If this turns out to be inadequate in practice, then
>> it will be addressed by future extensions.
>> Examples
>> ========
>> **Example 1:** While we assume that installation tools will have to
>> continue supporting version 0 sdists for the indefinite future, it's a
>> useful check to make sure that our new format can continue to support
>> packages using distutils / setuptools as their build system. We assume
>> that a future version of ``pip`` will take its existing knowledge of
>> distutils internals and expose them as the appropriate hooks, and then
>> existing distutils / setuptools packages can be ported forward by
>> using the following ``_pypackage/_pypackage.cfg``::
>>     [build]
>>     requirements = """
>>       pip >= whatever
>>       wheel
>>     """
>>     # Applies monkeypatches, then does 'setup.py dist_info' and
>>     # extracts the setup_requires
>>     requirements-dynamic = "pip.pypackage_hooks:setup_requirements"
>>     # Applies monkeypatches, then does 'setup.py wheel'
>>     build-wheels = "pip.pypackage_hooks:build_wheels"
>>     # Applies monkeypatches, then does:
>>     #    setup.py dist_info && setup.py build_ext -i
>>     build-in-place = "pip.pypackage_hooks:build_in_place"
>> This is also useful for any other installation tools that may want to
>> support version 0 sdists without having to implement bug-for-bug
>> compatibility with pip -- if no ``_pypackage/_pypackage.cfg`` is
>> present, they can use this as a default.
>> **Example 2:** For packages using numpy.distutils. This is identical
>> to the distutils / setuptools example above, except that numpy is
>> moved into the list of static build requirements. Right now, most
>> projects using numpy.distutils don't bother trying to declare this
>> dependency, and instead simply error out if numpy is not already
>> installed. This is because currently the only way to declare a build
>> dependency is via the ``setup_requires`` argument to the ``setup``
>> function, and in this case the ``setup`` function is
>> ``numpy.distutils.setup``, which... obviously doesn't work very well.
>> Drop this ``_pypackage.cfg`` into an existing project like this and it
>> will become robustly pip-installable with no further changes::
>>     [build]
>>     requirements = """
>>       numpy
>>       pip >= whatever
>>       wheel
>>     """
>>     requirements-dynamic = "pip.pypackage_hooks:setup_requirements"
>>     build-wheels = "pip.pypackage_hooks:build_wheels"
>>     build-in-place = "pip.pypackage_hooks:build_in_place"
>> **Example 3:** `flit <https://github.com/takluyver/flit>`_ is a tool
>> designed to make distributing simple packages simple, but it currently
>> has no support for sdists, and for convenience includes its own
>> installation code that's redundant with that in pip. These 4 lines of
>> boilerplate make any flit-using source tree pip-installable, and let
>> flit get out of the package installation business::
>>     [build]
>>     requirements = "flit"
>>     build-wheels = "flit.pypackage_hooks:build_wheels"
>>     build-in-place = "flit.pypackage_hooks:build_in_place"
>> FAQ
>> ===
>> **Why is it version 1 instead of version 2?** Because the legacy sdist
>> format is barely a format at all, and to `remind us to keep things
>> simple <
>> https://en.wikipedia.org/wiki/The_Mythical_Man-Month#The_second-system_effect
>> >`_.
>> **What about cross-compilation?** Standardizing an interface for
>> cross-compilation seems premature given how complicated the
>> configuration required can be, the lack of an existing de facto
>> standard, and the authors of this PEP's inexperience with
>> cross-compilation. This would be a great target for future extensions,
>> though. In the meantime, there's no requirement that
>> ``_pypackage/_pypackage.cfg`` contain the *only* entry points to a
>> project's build system -- packages that want to support
>> cross-compilation can still do so; they'll just need to include a
>> README explaining how to do it.
>> **PEP 426 says that the new sdist format will support automatically
>> creating policy-compliant .deb/.rpm packages. What happened to that?**
>> Step 1: enhance the wheel format as necessary so that a wheel can be
>> automatically converted into a policy-compliant .deb/.rpm package (see
>> PEP 491). Step 2: make it possible to automatically turn sdists into
>> wheels (this PEP). Step 3: we're done.
>> **What about automatically running tests?** Arguably this is another
>> thing that should be pushed off to wheel metadata instead of sdist
>> metadata: it's good practice to include tests inside your built
>> distribution so that end-users can test their install (and see above
>> re: our focus here being on stuff that end-users want to do, not
>> dedicated package developers), there are lots of packages that have to
>> be built before they can be tested anyway (e.g. because of binary
>> extensions), and in any case it's good practice to test against an
>> installed version in order to make sure your install code works
>> properly. But even if we do want this in sdist, then it's hardly
>> urgent (e.g. there is no ``pip test`` that people will miss), so we
>> defer that for a future extension to avoid blocking the core
>> functionality.
>> --
>> Nathaniel J. Smith -- http://vorpus.org
>> _______________________________________________
>> Distutils-SIG maillist  -  Distutils-SIG at python.org
>> https://mail.python.org/mailman/listinfo/distutils-sig