[Distutils] A possible refactor/streamlining of PEP 517

Nick Coghlan ncoghlan at gmail.com
Mon Jul 3 10:03:53 EDT 2017


On 1 July 2017 at 20:53, Nathaniel Smith <njs at pobox.com> wrote:
> Hi all,
>
> I just attempted an experimental refactor/streamlining of PEP 517, to
> match what I think it should look like :-). I haven't submitted it as
> a PR to the PEPs repository yet since I don't know if others will
> agree with the changes, but I've pasted the full text below, or you
> can see the text online at:
>
>     https://github.com/njsmith/peps/blob/517-refactor-streamline/pep-0517.txt
>
> and the diff at:
>
>     https://github.com/python/peps/compare/master...njsmith:517-refactor-streamline?expand=1
>
> Briefly, the changes are:
>
> - Rearrange text into (hopefully) better signposted sections with
>   better organization

This is definitely needed, so thanks for that.

> - Clarify a number of details that have come up in discussion (e.g.,
>   be more explicit that the hooks are run with the process working
>   directory set to the source tree, and why)

This is again very helpful (we also need to be clearer on which sets
of dependencies must be installed prior to running the various hooks)

> - Drop prepare_wheel_metadata and prepare_wheel_build_files (for now);
>   add detailed rationale for why we might want to add them back later.

I want prepare_wheel_metadata there as a straightforward way for
backends to expose the equivalent of "setup.py egg_info". Losing that
would be a significant regression relative to the status quo, since
pip relies on it during the dependency resolution process (and a
backtracking resolver will be even more dependent on being able to do
that efficiently).

For the build preparation hook, Thomas already rejected promising that
build_sdist will always work in flit when given only an unpacked
sdist, while Donald and Paul rejected relying on full tree copies as a
fallback for build_sdist failing.

Since I'm not interested in rehashing that debate, the build
preparation hook also stays.

However, I'd be fine with renaming these hooks using the
"prepare_X_for_Y" scheme you suggest below:

- prepare_metadata_for_build_wheel
- prepare_input_for_build_wheel

That makes it clearer where they fit into the overall wheel building
process, as well as which particular step they implement within that
process.

> - Add an "extensions" hook namespace to allow prototyping of future
>   extensions.

No, on grounds of YAGNI. You can't reject the two hooks we've actually
identified as needed (aka the matrix multiplication operators of this
situation), and then turn around and argue in favour of an overly
general scheme that supports arbitrary extension hooks (which would be
somewhat akin to having taken the PEP 225 approach to resolving the
syntactic ambiguity between elementwise multiplication and matrix
multiplication).

We're not fumbling around based on a complete absence of information
here - we're formalising a backend build interface based on more than
a decade of experience with the successes and challenges of the
existing system based on distutils, setuptools and PyPI.

Besides, if frontends want to invoke arbitrary interfaces that are
specific to particular backends, we already have ways for them to do
that: Python's regular import system, and plugin management approaches
like pkg_resources entry points.

> - Rename get_build_*_requires -> get_requires_for_build_* to make the
>   naming parallelism more obvious

This sounds like a good change to me.

> - Add the option to declare an operation unsupported by returning
>   NotImplemented

No. Supported operations for a backend must be identifiable without
installing any dependencies, and without running any backend code.

To tighten that requirement up even further: if the backend's
capabilities can't be accurately determined using
"inspect.getattr_static", then the backend is not compliant with the
spec. The build frontend/backend API is not a space where we want
people to try to be clever - we want them to be completely dull and
boring and use the most mundane code they can possibly come up with.

However, it's fine for operations to fail with an exception if there
are external dependencies that haven't been satisfied (e.g. requiring
a C/C++/FORTRAN/Rust/Go/etc compiler for extension modules when there
isn't one available, or requiring some other non-Python dependency
that needs to be set up prior to requesting the sdist or wheel build).
Backends are also free to raise NotImplementedError when it seems
appropriate to do so (e.g. when the project doesn't support the
current CPU architecture) - frontends will handle the same way they
will any other backend exception.

> - Instead of defining a default value for get_requires_for_build_*,
>   make it mandatory for get_require_for_build_* and build_* to appear
>   together; this seems simpler now that we have multiple high-level
>   operations defined in the same PEP, and also simplifies the
>   definition of the NotImplemented semantics.

I don't see the value in requiring four lines of boiler plate in every
backend definition that doesn't support dynamic dependencies:

    def get_requires_for_build_sdist(*args, **kwargs):
        return ()

    def get_requires_for_build_wheel(*args, **kwargs):
        return ()

That's just pointless noise when we can instead say "Leave those hooks
out of the backend implementation entirely if all build dependencies
will be declared in pyproject.toml (and if you really want your
backend work to support dynamic build dependencies, please make sure
you have a really good rationale for doing so that isn't already
covered by pyproject.toml and the environment marker system defined in
PEP 508)"

> - Update title to better match the scope we ended up with

I'd prefer to retain the current title, as it describes the desired
*outcome* of the change, whereas the proposed revision only describes
one of the technical pre-requiisites of achieving that outcome (i.e.
eliminating the historical dependency on setup.py as the build
system's executable interface).

> - Add a TODO to decide how to handle backends that don't want to have
>   multiple hooks called from the same process, including some
>   discussion of the options.

I don't think that's a TODO: I'm happy with the option of restricting
frontends to "one hook invocation per subprocess call".

It only becomes an open question in this revised draft by virtue of
making the get_requires_* hooks mandatory, and I have a different
resolution for that: keep those hooks optional, so that only backends
that genuinely support dynamic build time dependencies will define
them (others should either just get users to list any additional
static build dependencies in pyproject.toml, or else list any
always-needed static dependencies in the backend's own
install_requires).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


More information about the Distutils-SIG mailing list