[Distutils] Provisionally accepting PEP 517's declarative build system interface

C Anthony Risinger c at anthonyrisinger.com
Thu Jun 1 14:12:12 EDT 2017


On Thu, Jun 1, 2017 at 5:34 AM, Donald Stufft <donald at stufft.io> wrote:

>
> On Jun 1, 2017, at 3:44 AM, Paul Moore <p.f.moore at gmail.com> wrote:
>
> On 1 June 2017 at 01:08, Donald Stufft <donald at stufft.io> wrote:
>
> A sdist is a .tar.gz or a .zip file with a directory structure like (along
> with whatever additional files the project needs in the sdist):
>
> [...]
>
> I'm confused. Isn't this basically what PEP 517 says already? You've
> added some details and clarification, but that could just as easily be
> done in a separate document/PEP. The details aren't needed for PEP 517
> itself.
>
>
> Yes, it’s basically what PEP 517 says already just more specific and
> detailed. I don’t know what more people want from “defining what an sdist
> is”, because that’s basically all an sdist is. I’ve always been of the
> opinion that PEP 517 is already defining (and then modifying) what an sdist
> is and I don’t know what more people would want.
>
> PEP 517 needs to do it because PEP 517 wants to change the definition of
> what a sdist is, and you can’t really change the definition without in fact
> defining the new thing. I mean we could make a new PEP that just defines
> sdist (minus the pyproject.toml part) then make PEP 517 extend that PEP and
> add the pyproject.toml… but that seems kind of silly to me? Splitting it
> out into its own PEP gains us nothing and to me, feels like extra process
> for process’s sake.
>

PEP 518's pyproject.toml only specifies a single table, `build-system`,
that matters. Can we just add a blurb to PEP 517 that says something to the
effect of "If the following sub table exists, its location key can be used
to pre-populate the metadata_directory of `get_wheel_metadata`
automatically":

[build-system.metadata]
directory = "some_dist_info_directory/"

(pulled from the spec in 517 about what get_wheel_metadata is supposed to
do)
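
A minimal sketch of how a frontend might honor that sub table, assuming
pyproject.toml has already been parsed into a dict. The function name
`seed_metadata_directory` and the DIST-INFO target are hypothetical; the
[build-system.metadata] table is only the proposal above, not something PEP
517 or PEP 518 define.

```python
import shutil
from pathlib import Path


def seed_metadata_directory(pyproject, source_tree, metadata_directory):
    """Pre-populate metadata_directory/DIST-INFO from the static location, if any.

    `pyproject` is the already-parsed pyproject.toml content as a dict.
    Returns (path_to_DIST-INFO, whether_it_was_seeded_from_static_files).
    Assumes metadata_directory does not yet contain a DIST-INFO directory.
    """
    target = Path(metadata_directory) / "DIST-INFO"
    table = pyproject.get("build-system", {}).get("metadata", {})
    static_dir = table.get("directory")
    if static_dir is not None and (Path(source_tree) / static_dir).is_dir():
        # YES branch: copy the static metadata so the backend can update it.
        shutil.copytree(Path(source_tree) / static_dir, target)
        return target, True
    # NO branch: start from an empty directory the backend must fill in.
    target.mkdir(parents=True)
    return target, False
```

The backend would then receive the same metadata_directory and only needs
to backfill whatever the static files left out.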

Then we could default that directory to something obvious, like the
aforementioned ./DIST-INFO or ./.dist-info, or whatever, because isn't such
a directory expected to contain enough information to create a wheel
anyway? Like {package-name} and {version} via METADATA? And typically
included in sdists already? If it has a SOURCE-RECORD file [new], then pip
and friends can use that to know what files are needed for the build, and
can use pyproject.toml (if it exists) for creating and/or updating it for
later sdist generation. In the simple case, every normal file in a wheel is
also in an sdist, verbatim, with no additional artifacts of any kind (pure
python) and only additional metadata. The build doesn't care if things like
LICENCE are in the tree. If there is no static SOURCE-RECORD, pip and
friends fall back to a wholesale copy of the input source. If the build
backend's `get_wheel_metadata` (if defined) finds that the
`metadata.directory` seeded from the static location referenced in
pyproject.toml is incomplete, it can update or backfill missing information
within the METADATA file, and create the WHEEL file (or save that for
`build_wheel`).
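
The SOURCE-RECORD handling described above could look roughly like this.
This is a sketch under a big assumption: that SOURCE-RECORD reuses the
path,hash,size CSV layout of a wheel's RECORD file, with hashes written as
sha256= plus unpadded urlsafe base64. The names `record_hash` and
`files_for_build` are hypothetical.

```python
import base64
import csv
import hashlib
from pathlib import Path


def record_hash(data):
    """Return a RECORD-style hash string: sha256=<urlsafe base64, no padding>."""
    digest = hashlib.sha256(data).digest()
    return "sha256=" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def files_for_build(source_tree, dist_info):
    """List the relative paths a frontend should copy for the build.

    With a SOURCE-RECORD, only the listed files are used and any recorded
    hash is checked. Without one, fall back to every file in the tree.
    """
    source_tree = Path(source_tree)
    record = Path(dist_info) / "SOURCE-RECORD"
    if not record.is_file():
        # Wholesale copy: every regular file under the source tree.
        return sorted(
            p.relative_to(source_tree).as_posix()
            for p in source_tree.rglob("*") if p.is_file()
        )
    wanted = []
    with record.open(newline="") as fh:
        for row in csv.reader(fh):
            if not row:
                continue
            path = row[0]
            expected = row[1] if len(row) > 1 else ""
            # An empty hash column means "listed, but not verified".
            if expected and record_hash((source_tree / path).read_bytes()) != expected:
                raise ValueError(f"hash mismatch for {path}")
            wanted.append(path)
    return sorted(wanted)
```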

In the end, the build frontend logic would look something like:

(also seems like `get_wheel_metadata` should maybe return the final
.dist-info directory it decided on, or just settle on DIST-INFO and enough
of this name-version.dist-info nonsense already... should possibly be a
required build api function with the understanding `build_wheel` might
update it)

* Is build-system.metadata.directory defined?
    YES: copy to {metadata_directory}/DIST-INFO
    NO: mkdir {metadata_directory}/DIST-INFO

* Does {metadata_directory}/DIST-INFO/SOURCE-RECORD exist?
    YES: use that to isolate/prune/copy source tree for initial build, if
         desired, and also confirm hashes, if any
    NO: do nothing

(we have something that might look like an sdist, but possibly incomplete
[eg. still no METADATA])

* Is build-backend.MODULE.get_build_requires defined?
    YES: make sure those things exist then
    NO: do nothing

* Is build-backend.MODULE.get_wheel_metadata defined?
    YES: call it like PEP 517 says, DIST-INFO is ready for updating
    NO: do nothing

(we have something that might look like an sdist, but possibly incomplete
[eg. still no METADATA])

* Is build-backend.MODULE.build_wheel defined?
    YES: call it like PEP 517 says, replace RECORD with the final record
         from build?
    NO: do nothing

* Is {metadata_directory}/DIST-INFO/* valid and the resultant whl as well?
    YES: YAY! \o/
    NO: BLOW UUUUUP

* Does {metadata_directory}/DIST-INFO/SOURCE-RECORD exist [must reference
  pyproject.toml too!]?
    YES: use that to prune files when creating a proper sdist AFTER the build
    NO: sdist is original source tree + {metadata_directory}/DIST-INFO -
        RECORD(?)

(we have enough information to produce a complete sdist that could be used
to generate a valid wheel again)
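
Roughly, the hook steps from the list above might be sketched as follows.
The hook call signatures are an assumption loosely based on the PEP 517
draft and may not match the final text; `run_frontend` and the `install`
callback are hypothetical, and the validity check is reduced to "does
METADATA exist".

```python
from pathlib import Path


def run_frontend(backend, metadata_directory, wheel_directory,
                 config_settings=None, install=print):
    """Walk the optional backend hooks in the order listed above.

    `backend` is any object; each hook is called only if it is defined.
    `install` is a hypothetical callback that provisions build requirements.
    Returns (whatever build_wheel returned or None, list of hooks called).
    """
    dist_info = Path(metadata_directory) / "DIST-INFO"
    dist_info.mkdir(parents=True, exist_ok=True)
    steps = []

    hook = getattr(backend, "get_build_requires", None)
    if hook is not None:
        install(hook(config_settings))
        steps.append("get_build_requires")

    hook = getattr(backend, "get_wheel_metadata", None)
    if hook is not None:
        # DIST-INFO is ready for the backend to update or backfill.
        hook(str(metadata_directory), config_settings)
        steps.append("get_wheel_metadata")

    wheel = None
    hook = getattr(backend, "build_wheel", None)
    if hook is not None:
        wheel = hook(str(wheel_directory), config_settings, str(metadata_directory))
        steps.append("build_wheel")

    # "BLOW UUUUUP" if the result is not valid; only METADATA is checked here.
    if not (dist_info / "METADATA").is_file():
        raise RuntimeError("DIST-INFO/METADATA missing after build")
    return wheel, steps
```

Seeding DIST-INFO from a static directory and the SOURCE-RECORD pruning
would happen before and after this, respectively.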

Because the build itself can output additional source files that may be
desirable to include in an sdist later, I honestly don't think you can pass
through a "proper" sdist before a wheel. I think you can do that 99% of the
time, but some builds using Cython and friends could actually have a custom
initial build that generates standard .h/.c/.py, and even outputs an
alternative pyproject.toml that *no longer needs* a custom build backend.
Or just straight deletes it from SOURCE-RECORD once the custom build is
done, because some artifacts are enough to rebuild a wheel next time. It
seems to me the only possibly correct order is:

1. VCS checkout
2. partial sdist, but still likely an sdist, no promises!
3. wheel
4. proper sdist from generated SOURCE-RECORD, or updated static
SOURCE-RECORD, or just original source tree + DIST-INFO
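
Step 4 could be sketched like this, run only after the wheel build has
finished. Assumptions: SOURCE-RECORD is a CSV whose first column is a path
relative to the source tree, DIST-INFO lives outside the source tree, and
the archive uses a {name}-{version}/ top-level directory. `write_sdist` is
a hypothetical name.

```python
import tarfile
from pathlib import Path


def write_sdist(source_tree, dist_info, out_dir, name, version):
    """Write {name}-{version}.tar.gz from the tree plus DIST-INFO.

    If DIST-INFO/SOURCE-RECORD exists, only the files it lists are packed;
    otherwise the whole source tree is packed. DIST-INFO itself is always
    included, minus RECORD.
    """
    source_tree, dist_info = Path(source_tree), Path(dist_info)
    record = dist_info / "SOURCE-RECORD"
    if record.is_file():
        rel_paths = [line.split(",")[0]
                     for line in record.read_text().splitlines() if line.strip()]
    else:
        rel_paths = [p.relative_to(source_tree).as_posix()
                     for p in source_tree.rglob("*") if p.is_file()]
    root = f"{name}-{version}"
    archive = Path(out_dir) / f"{root}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for rel in sorted(rel_paths):
            tar.add(source_tree / rel, arcname=f"{root}/{rel}")
        for meta in sorted(dist_info.iterdir()):
            if meta.is_file() and meta.name != "RECORD":
                tar.add(meta, arcname=f"{root}/DIST-INFO/{meta.name}")
    return archive
```

Because the file list is read after the build, anything the backend
generated into the tree and added to SOURCE-RECORD ends up in the archive.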

I don't see a way to get a 100% valid sdist without first building the
project and effectively asking the build backend (via its SOURCE-RECORD, if
any) "Well golly, you did a build! What, *from both the source tree and
build artifacts*, is important for wrapping up into a redistributable?"

Maybe I'm overlooking something yuge (I've tried to follow this discussion,
and have sort of checked out of python lately, but I'm fairly well-versed
in packaging lore and code), but in general I think we really are making
sdists way way... way scarier than need be. They're pretty much whatever
the build tells you is important for redistribution, at the end, with as
much static metadata as possible, to the point of possibly obviating their
need for pyproject.toml in the first place... maybe this aspect is what is
hanging everyone up? A redistributable source does not need to be as
flexible as the original VCS input. An sdist is pinned to a specific
version of a project, whereas VCS represents all possible versions (albeit
only one is checked out), and sdists *are not* wheels! The same
expectations need not apply. Two sdists of the same version might not be
identical; one might request the custom build backend via pyproject.toml,
and the other might have already done some of the steps for whatever
reason. Authors must decide which is more appropriate for sharing.

This ended up longer than I meant, but hopefully it's not all noise.

Thanks,

-- 

C Anthony