On May 31, 2017, at 6:48 PM, Paul Moore <p.f.moore@gmail.com> wrote:

On 31 May 2017 at 22:13, Thomas Kluyver <thomas@kluyver.me.uk> wrote:

But if we have a hook for building something called an sdist, we need to
define what an sdist is.

OK, so can we do that?

At the moment, we have a de facto definition of a sdist - it's
something with a setup.py, some metadata defined by the metadata PEPs
(but implemented pretty loosely, so people don't actually rely on it)
and the files needed to build the project. Plus other stuff like
LICENSE that's important, but defined by the project saying it should
be included. Consumers of sdists are allowed to unpack them and run
any of the setup.py commands on them. They can in theory inspect the
metadata, but in practice don't. Call that the legacy sdist format.

What do consumers of the sdist format want to do? I don't actually
know, but my guess is that they just want to be able to install the
sdist. We presumably don't want to preserve the setup.py interface so
they need to allow for a new interface. What's wrong with "pip install
<file>"? They also want to publish the sdist to PyPI - so they need to
name it according to the current convention. Anything else?

Call this the post-PEP 517 sdist. It's still not fully
standardised, it's an underspecified de facto standard, but "something
that follows current naming conventions and can be installed via pip
install filename" could be something that will do for now, until we
want to fully standardise sdist 2.0, with static metadata and all that
stuff. And as an additional benefit, all legacy sdists already conform
to this "spec".

I 100% agree that the current vagueness around what a sdist is, and
what tools can expect to do with them, is horribly unsatisfactory. But
to make any progress we have to discard the "exposes a setup.py
interface" rule. That's all we need for now. Longer term, we need a
formal spec. But *for now*, can we manage by replacing the setup.py
interface with an "installable by pip" interface? Or does anyone have
an alternative "good enough for now" definition of a sdist we can
agree on?

If we can do this, we can move forward. Otherwise, I fear this
discussion is going to stall with another "try to solve all the
problems at once" deadlock.


I think I’m -0 on spelling it out as “it’s whatever pip can install” rather than just codifying what the de facto rules for that are, something like:

A sdist is a .tar.gz or a .zip file with a directory structure like (along with whatever additional files the project needs in the sdist):

└── {name}-{version}
    ├── PKG-INFO
    └── setup.py OR pyproject.toml
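As a rough illustration of that layout (not part of any spec), here is a sketch of a check against a .tar.gz sdist. The function name check_sdist_layout is made up for this example, and treating "files outside the {name}-{version} directory" as a problem is my own assumption; the text above only shows the single top-level directory.

```python
import tarfile
from pathlib import PurePosixPath


def check_sdist_layout(archive_path, name, version):
    """Return a list of layout problems in a .tar.gz sdist (empty if none).

    Hypothetical helper: name and version are used exactly as given to
    build the expected top-level directory name.
    """
    root = f"{name}-{version}"
    with tarfile.open(archive_path, "r:gz") as tf:
        members = [PurePosixPath(m.name) for m in tf.getmembers()]

    problems = []
    # Assumption: everything lives under the single {name}-{version} directory.
    if any(p.parts[0] != root for p in members if p.parts):
        problems.append(f"files found outside {root}/")

    # Names of entries directly inside the top-level directory.
    top_level = {
        p.parts[1] for p in members if len(p.parts) == 2 and p.parts[0] == root
    }
    if "PKG-INFO" not in top_level:
        problems.append("missing PKG-INFO")
    if not top_level & {"setup.py", "pyproject.toml"}:
        problems.append("missing setup.py or pyproject.toml")
    return problems
```

An empty list means the archive has the minimal shape described above; it says nothing about whether the project will actually build.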

If a sdist contains a pyproject.toml file that contains a build-system.build-backend key, then it is a PEP 517 style sdist and MUST be processed using the API as defined in PEP 517. Otherwise it is a legacy distutils/setuptools style sdist and MUST be processed by calling setup.py. PEP 517 sdists MAY contain a setup.py for compatibility with tooling that does not yet understand PEP 517.

PKG-INFO should loosely be a PEP 345 style METADATA file, as amended by the errata located at https://packaging.python.org/specifications/#package-distribution-metadata.
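Since PEP 345 metadata is an RFC 822 style block of "Key: value" headers, a tool can read PKG-INFO with the standard library email parser. A minimal sketch, where read_pkg_info and the choice of which fields to pull out are purely illustrative:

```python
from email.parser import Parser


def read_pkg_info(text):
    """Extract a few fields from the text of a PKG-INFO file."""
    msg = Parser().parsestr(text)
    return {
        "name": msg["Name"],
        "version": msg["Version"],
        # Multi-use fields appear as repeated headers.
        "requires_dist": msg.get_all("Requires-Dist") or [],
    }
```

Given the "loosely" above, a real consumer would want to tolerate missing or odd fields rather than trust them.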

A sdist MUST follow the {name}-{version}.{ext} naming scheme, where {ext} MUST be either .tar.gz or .zip matching the respective container/compression format being used. Both {name} and {version} MUST have any - characters escaped to a _ to match the escaping done by Wheel. Thus a sdist for a project named foo-bar with version 1.0-2 which is using a .tar.gz container for the sdist would produce a file named foo_bar-1.0_2.tar.gz.
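In code, the naming rule could look something like the following. This is just a sketch of the escaping described here; sdist_filename is a hypothetical helper, and it deliberately does nothing beyond replacing - with _.

```python
def sdist_filename(name, version, ext=".tar.gz"):
    """Build an sdist filename with '-' escaped to '_' in name and version."""
    if ext not in (".tar.gz", ".zip"):
        raise ValueError(f"unsupported sdist extension: {ext!r}")

    def escape(part):
        # Only the dash is escaped, leaving exactly one '-' as the separator.
        return part.replace("-", "_")

    return f"{escape(name)}-{escape(version)}{ext}"


print(sdist_filename("foo-bar", "1.0-2"))  # foo_bar-1.0_2.tar.gz
```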

I think this should cover the case of actually making the project pip installable (assuming of course the setup.py or build backend doesn’t do something silly like always sys.exit(1) instead of producing the expected outcome) as well as allow twine to upload a sdist produced by this (since it reads the PKG-INFO file). The delta from the de facto standard today is basically swapping the setup.py for pyproject.toml and the escaping of the filename [1]. This should not require much more of the backends than producing a wheel does, since the PKG-INFO file is essentially just the METADATA file from within a wheel (although if people are dynamically generating dependencies or something they may want to omit them rather than give misleading information in PKG-INFO).

A future PEP can solve the problem of a sdist 2.0 that has a nicer interface than that or better metadata or whatever. This just represents a fairly minimal evolution of what currently exists today to support the changes needed for PEP 517.

[1] We don’t _need_ to do this, but currently you can’t tell whether foo-1-1.tar.gz is (foo-1, 1) or (foo, 1-1). Mandating escaped names would solve that problem going forward, using this heuristic: if there is more than one dash character in the filename, then it was not escaped and we have to fall back to the somewhat error-prone, context-sensitive parsing of the filename. Certainly we could say that it’s out of scope for PEP 517 and leave it at that, but since it’s such a minor change I felt it wouldn’t be a big deal to add it here.
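The heuristic in [1] might be sketched like this. parse_sdist_filename is a hypothetical name, and returning None to mean "fall back to the legacy context-sensitive parsing" is just one way to signal it; I have also chosen to treat a filename with no dash at all the same way, which the footnote does not cover.

```python
def parse_sdist_filename(filename):
    """Split an escaped sdist filename into (name, version).

    Returns None when the split is not unambiguous, i.e. the caller has
    to fall back to legacy, context-sensitive parsing.
    """
    for ext in (".tar.gz", ".zip"):
        if filename.endswith(ext):
            stem = filename[: -len(ext)]
            break
    else:
        raise ValueError(f"not an sdist filename: {filename!r}")

    # Exactly one dash means the name and version were escaped.
    if stem.count("-") != 1:
        return None
    name, version = stem.split("-")
    # The escaped forms are returned as-is: '_' cannot be reliably
    # mapped back to '-' since names may legitimately contain '_'.
    return name, version


print(parse_sdist_filename("foo_bar-1.0_2.tar.gz"))  # ('foo_bar', '1.0_2')
print(parse_sdist_filename("foo-1-1.tar.gz"))        # None
```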

Donald Stufft