[Distutils] Towards a simple and standard sdist format that isn't intertwined with distutils

Donald Stufft donald at stufft.io
Sat Oct 3 01:04:38 CEST 2015

On October 2, 2015 at 12:54:03 AM, Nathaniel Smith (njs at pobox.com) wrote:
> Distutils delenda est.

I think that you should drop (from this PEP) the handling of VCS/arbitrary
directories and focus solely on creating a format for source distributions. A
source distribution can (and should) be fairly strict and well defined exactly
where all of the files go, what files exist and don't exist, and things of that
nature (more on this later).

In addition, the metadata files should be optimized for machines to read and
parse them first, for humans to read them second, and humans to write them not
at all. Given the Python standard library, your metadata inside of the source
distribution should (probably) be JSON. This is another reason why this should
focus on the source distribution as well, because the file you put into VCS
needs to be able to be written by humans.
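
To illustrate what "machine first" could look like, here is a minimal sketch
using only the standard library json module. The field names ("name",
"version") and the helper names are assumptions for illustration, not part of
any spec.

```python
import json


def dump_metadata(fields):
    # Sorted keys and fixed indentation make the output reproducible,
    # which matters more than hand-editability for a generated file.
    return json.dumps(fields, sort_keys=True, indent=2) + "\n"


def load_metadata(text):
    # Hypothetical reader: the top-level value is assumed to be an object.
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("metadata must be a JSON object")
    return data
```

A tool would write this file when it creates the source distribution; nobody
would be expected to edit it by hand.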

Metadata 2.0 should probably get finished before or as part of a new sdist
format happening. I fell off towards the end of that and it appears that it
got a lot more complex since I last looked at it. It probably needs more

The filename should be strictly defined similarly to what Wheels have, probably
something like {name}-{version}.{ext}, and like wheel it should mandate that
any - characters inside of any of the fields should be escaped to a _ so that
we can unambiguously parse it. It should not support arbitrary filenames
because such files are almost never actually sdists. In another email you mentioned
something like the tarballs that github produces, but those are not source
distributions, they are vcs exports and shouldn't be covered by this PEP.
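
A rough sketch of the naming rule described above, assuming the
{name}-{version}.{ext} shape and a placeholder extension of "swhl". The
function names are made up for this example.

```python
def escape_field(value):
    # Replace "-" with "_" so that "-" only ever appears as the separator.
    return value.replace("-", "_")


def build_filename(name, version, ext="swhl"):
    # "swhl" is a placeholder extension, not a decided one.
    return "{}-{}.{}".format(escape_field(name), escape_field(version), ext)


def parse_filename(filename, ext="swhl"):
    suffix = "." + ext
    if not filename.endswith(suffix):
        raise ValueError("unexpected extension: %r" % filename)
    stem = filename[:-len(suffix)]
    parts = stem.split("-")
    # After escaping, exactly one "-" can remain: the field separator.
    if len(parts) != 2 or not all(parts):
        raise ValueError("expected {name}-{version}: %r" % filename)
    return parts[0], parts[1]
```

Note that the escaping is lossy: parsing gives back the escaped name, so
"my-package" and "my_package" end up with the same filename.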

I don't believe that Python should develop anything like the Debian ability to
have a single source "package" create multiple binary packages. The metadata of
the Wheel *must* strictly match the metadata of the sdist (except for things
that are Wheel specific). This includes things like name, version, etc. Trying
to go down this path I think will make things a lot more complicated since we
have a segmented archive where people have to claim particular names, otherwise
how do you prevent me from registering the name "foobar" on PyPI and saying it
produces the "Django" wheel?
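
As a sketch of the kind of check an installer could run after a build, assuming
both sets of metadata have already been loaded into dicts. The list of shared
fields here is an assumption and is deliberately short.

```python
# Fields assumed to be required to match; a real list would be longer.
SHARED_FIELDS = ("name", "version")


def mismatched_fields(sdist_meta, wheel_meta, shared=SHARED_FIELDS):
    # Return the shared fields whose values differ between the two dicts.
    return [f for f in shared if sdist_meta.get(f) != wheel_meta.get(f)]
```

With this, a "foobar" sdist that produced a "Django" wheel would report
["name"] and the installer could refuse the result.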

Since I think this should only deal with source distributions, then the primary
thing we need is an operation that will take an unpacked source distribution
that is currently sitting on the filesystem and turn it into a wheel located
in a specific location.

The layout for a source distribution should be specified, I think something like:

    ├── meta
    │   ├── DESCRIPTION.rst
    │   ├── FORMAT-VERSION
    │   ├── LICENSE.txt
    │   └── METADATA.json
    └── src
        ├── my-cool-build-tool.cfg
        └── mypackage
            └── __init__.py

I don't particularly care about the exact names, but this layout gives us two
top level directories (and only two), one is a place where all of the source
distribution metadata goes, and one is a src directory where all of the files
for the project should go, including any relevant configuration for the build
tool in use by the project. Having two directories like this eliminates the
need to worry about naming collisions between the metadata files and the
project itself.
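
A small sketch of how a tool might enforce the "two top level directories, and
only two" rule, given a list of archive member paths that use "/" separators.
The directory names "meta" and "src" come from the example layout above and
are not meant to be final.

```python
def top_level_entries(paths):
    # The first path component of every member, e.g. "meta" or "src".
    return {p.split("/", 1)[0] for p in paths if p}


def check_layout(paths, expected=("meta", "src")):
    # True only if the top level holds exactly the expected names.
    return top_level_entries(paths) == set(expected)
```

For example, an archive that also carries a top-level setup.py would fail
this check.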

We should probably give this a new name instead of "sdist" and give it a
dedicated extension. Perhaps we should call them "source wheels" and have the
extension be something like .swhl or .src.whl. This means we don't need to
worry about making the same artifact compatible with both the legacy toolchain
and a toolchain that supports "source wheels".

We should also probably specify a particular container format to be used for
a .whl/.src.whl. It probably makes sense to simply use zip since that is what
wheels use and it supports different compression algorithms internally. We
probably want to at least suggest limiting compression algorithms used to
Deflate and None, if not mandate that one of those two is used.
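
Checking the compression restriction is cheap with the standard library
zipfile module. This is only a sketch; the function name and the idea of
returning the offending member names are my own choices.

```python
import zipfile

# "None" in zip terms is ZIP_STORED (no compression).
ALLOWED = frozenset({zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED})


def disallowed_members(archive, allowed=ALLOWED):
    # archive may be a path or a seekable file-like object.
    with zipfile.ZipFile(archive) as zf:
        return [
            info.filename
            for info in zf.infolist()
            if info.compress_type not in allowed
        ]
```

An empty list means every member uses one of the permitted methods.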

We should include absolutely as much metadata as part of the static metadata
inside the sdist as we can. I don't think there is any case to be made that
things like name, version, summary, description, classifiers, license,
keywords, contact information (author/maintainers), project URLs, etc. are
Wheel specific. I think there are other things which are arguably able to be
specified in the sdist, but I'd need to fiddle with it to be sure. Basically
any metadata that isn't included as static information will not be able to be
displayed on PyPI.

The metadata should directly include the specifiers inside of it and shouldn't
propagate the meme that pip's requirements.txt format is anything but a way
to recreate a specific environment with pip.

Build requirements cannot be dynamic.

We don't need a "build in place" hook, you don't build source distributions in
place, you build wheels with them. Another PEP that handles a VCS/non sdist
directory can add things like building in place.

I don't think there's ever going to be a world where pip depends on virtualenv
or pyvenv. The PEP shouldn't prescribe how the tool installs the build deps or
executes the build hook, though I think it should mandate that it is called
with a compatible Python to the Wheel that is desired to be produced. Cross
compiling can be handled later.

Possibly we might want to make the hooks call an executable instead of
doing something that involves importing the hook and calling it. This would
make it easier to prevent the build tool from monkeypatching the installation
tool *and* make it easier for downstream redistributors to use it.
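
A sketch of what the executable-based approach might look like from the
installer's side, using subprocess from the standard library. The calling
convention (source directory and wheel output directory as the last two
arguments) is purely an assumption for illustration.

```python
import subprocess
import sys


def run_build_hook(command, source_dir, wheel_dir):
    # command is a list such as [sys.executable, "-m", "some_build_tool"];
    # the tool name is hypothetical. The hook runs in its own process, so
    # it cannot monkeypatch the tool that invoked it.
    result = subprocess.run(list(command) + [source_dir, wheel_dir])
    return result.returncode
```

Because the interface is just a command line and an exit code, a downstream
redistributor could call the same hook without importing anything.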

If you're interested, I'm happy to directly collaborate on this PEP if it's in
a github repository somewhere or something. There's an interoperability repo
you can use or your own or whatever. Or you can tell me to go pound sand too
and I'll just comment on posts to the ML.

Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
