On 1 July 2017 at 20:53, Nathaniel Smith <njs@pobox.com> wrote:

Hi all,
I just attempted an experimental refactor/streamlining of PEP 517, to
match what I think it should look like :-). I haven't submitted it as
a PR to the PEPs repository yet since I don't know if others will
agree with the changes, but I've pasted the full text below, or you
can see the text online at:
https://github.com/njsmith/peps/blob/517-refactor-streamline/pep-0517.txt
and the diff at:
https://github.com/python/peps/compare/master...njsmith:517-refactor-streamline?expand=1
Briefly, the changes are:
- Rearrange text into (hopefully) better signposted sections with
better organization
This is definitely needed, so thanks for that.

- Clarify a number of details that have come up in discussion (e.g.,
be more explicit that the hooks are run with the process working
directory set to the source tree, and why)
This is again very helpful (we also need to be clearer on which sets of dependencies must be installed prior to running the various hooks).

- Drop prepare_wheel_metadata and prepare_wheel_build_files (for now);
add detailed rationale for why we might want to add them back later.
I want prepare_wheel_metadata there as a straightforward way for backends to expose the equivalent of "setup.py egg_info". Losing that would be a significant regression relative to the status quo, since pip relies on it during the dependency resolution process (and a backtracking resolver will be even more dependent on being able to do that efficiently).
To speak to a point Nathaniel made about being able to cache built wheels, we’re also planning on caching the result of ``setup.py egg-info`` in the GSoC student’s project. The exact semantics of this remain to be nailed down (for instance, it’s silly to cache it after we’ve built the wheel, since we can just use that; but if we called it and then later decided not to build the wheel, caching is beneficial). I don’t see any reason why we wouldn’t also cache the output of this new hook.
Adding it on later is a bit tricky, because then we can’t pass the existing metadata into the build_wheel hook, so backends can’t ensure that they generate the same output. The frontend could still of course validate that after the fact (which pip likely will), but that removes the chance for the project to make that case work rather than having it be a failure. OTOH I don’t know how likely it is that a project would actually do something different based on the metadata directory that was passed in, other than failing if it would produce something different.
At the end of the day I think we can add this either now or later. I have a slight preference for now, and I think either way we’ll end up adding it, but I don’t feel strongly about the timing.
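For concreteness, here is a hypothetical backend-side sketch of what a prepare_wheel_metadata-style hook could look like (the project name, version, and METADATA contents are invented for illustration; returning the basename of the .dist-info directory follows the draft hook semantics):

```python
import os

# Hypothetical backend sketch: produce wheel metadata (the rough
# equivalent of "setup.py egg_info") without doing a full build.
def prepare_metadata_for_build_wheel(metadata_directory, config_settings=None):
    # A real backend would derive name/version from its own config.
    distinfo = os.path.join(metadata_directory, "example-1.0.dist-info")
    os.makedirs(distinfo)
    with open(os.path.join(distinfo, "METADATA"), "w") as f:
        f.write("Metadata-Version: 2.1\nName: example\nVersion: 1.0\n")
    # Return the basename of the .dist-info directory that was created,
    # so the frontend knows where to find (and potentially cache) it.
    return "example-1.0.dist-info"
```

A frontend could cache the returned directory and later compare it against the metadata in the built wheel.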
For the build preparation hook, Thomas already rejected promising that
build_sdist will always work in flit when given only an unpacked
sdist, while Donald and Paul rejected relying on full tree copies as a
fallback for build_sdist failing.
Since I'm not interested in rehashing that debate, the build
preparation hook also stays.
From my end, I’m happy either with the prepare hook or with the semantics Nathaniel has spec’d. I think the primary difference is that if someone takes a .tar.gz, unpacks it, and then runs ``pip install that/dir``, then with Nathaniel’s semantics we would fail if the build_sdist hook didn’t build a sdist (e.g. in the case of how I understand flit’s use case).
IOW we have a few possibilities of when we might call this API:
1) We have a sdist already and pip downloads it, unpacks it, and builds a wheel from it.
- Works on both proposals for setuptools and flit (since for both proposals we would just unpack to a temporary location and jump straight to wheel building).
2) We have a VCS directory or “original development source” or whatever you want to call the thing you have before a sdist that typically gets into a sdist.
- Works on both proposals for setuptools and flit (since both can go from a VCS to a sdist).
- Thomas might have said he’d be unhappy if this case goes through a real sdist… I forget the specifics of that discussion now.
- If build_sdist failed in Nathaniel’s proposal, pip would probably fail and surface that error to the user (for cases where e.g. you need git installed or something).
3) We have a directory on disk that represents an unpacked sdist (with or without modifications).
- Works on the original proposal IF the build tool implements the prepare hook or the build_sdist hook can function as the identity function.
- Fails on Nathaniel’s proposal IF the build tool doesn’t implement the build_sdist hook as the identity function (e.g. flit).
So we basically only care about this for case (3), and even then only for projects that can’t or won’t detect that they’re in an unpacked sdist they’ve already created and simply archive up the entire directory without any additional logic (or maybe with additional logic to munge the version, for instance, if they detect file modifications).
From my POV, I don’t really care if (3) fails or not for individual build tools. I don’t think (3) is a very popular path (and in both cases, we’re not covering (3) 100% of the time, we just provide a different mechanism to make it work), and from pip’s POV it’s easy to surface an error from flit that says something like “Cannot build an sdist from this, try from a VCS clone”. We can’t make this work in 100% of situations either way.
So I don’t really care whether we add it now or use the semantics Nathaniel proposed.
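The frontend behaviour described for case (3) - call build_sdist and surface the backend's error rather than silently copying the tree - might be sketched like this (the helper name is hypothetical; the error text is borrowed from the example above):

```python
# Hypothetical frontend sketch: try to build via an sdist, and surface
# the backend's failure (e.g. flit without VCS metadata) to the user
# instead of falling back to copying the whole source tree.
def build_via_sdist(backend, sdist_directory):
    build_sdist = getattr(backend, "build_sdist", None)
    if build_sdist is None:
        raise RuntimeError("backend does not provide a build_sdist hook")
    try:
        # On success, the hook returns the basename of the built sdist.
        return build_sdist(sdist_directory)
    except Exception as exc:
        # pip-style error surfacing rather than a silent tree-copy fallback.
        raise RuntimeError(
            "Cannot build an sdist from this source tree; "
            "try again from a VCS clone"
        ) from exc
```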
However, I'd be fine with renaming these hooks using the
"prepare_X_for_Y" scheme you suggest below:
- prepare_metadata_for_build_wheel
- prepare_input_for_build_wheel
That makes it clearer where they fit into the overall wheel building
process, as well as which particular step they implement within that
process.
Paint the bike shed whatever color you want.
- Add an "extensions" hook namespace to allow prototyping of future
extensions.
No, on grounds of YAGNI. You can't reject the two hooks we've actually identified as needed (aka the matrix multiplication operators of this situation), and then turn around and argue in favour of an overly general scheme that supports arbitrary extension hooks (which would be somewhat akin to having taken the PEP 225 approach to resolving the syntactic ambiguity between elementwise multiplication and matrix multiplication).

We're not fumbling around based on a complete absence of information here - we're formalising a backend build interface based on more than a decade of experience with the successes and challenges of the existing system based on distutils, setuptools and PyPI.

Besides, if frontends want to invoke arbitrary interfaces that are specific to particular backends, we already have ways for them to do that: Python's regular import system, and plugin management approaches like pkg_resources entry points.
I don’t care if we have extensions or not, in either form of the proposal.
- Rename get_build_*_requires -> get_requires_for_build_* to make the
naming parallelism more obvious
This sounds like a good change to me.

- Add the option to declare an operation unsupported by returning
NotImplemented
No. Supported operations for a backend must be identifiable without installing any dependencies, and without running any backend code.

To tighten that requirement up even further: if the backend's capabilities can't be accurately determined using "inspect.getattr_static", then the backend is not compliant with the spec. The build frontend/backend API is not a space where we want people to try to be clever - we want them to be completely dull and boring and use the most mundane code they can possibly come up with.

However, it's fine for operations to fail with an exception if there are external dependencies that haven't been satisfied (e.g. requiring a C/C++/FORTRAN/Rust/Go/etc compiler for extension modules when there isn't one available, or requiring some other non-Python dependency that needs to be set up prior to requesting the sdist or wheel build).

Backends are also free to raise NotImplementedError when it seems appropriate to do so (e.g. when the project doesn't support the current CPU architecture) - frontends will handle that the same way they will any other backend exception.
Agreed, I don’t like the NotImplemented thing.
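The getattr_static requirement could look like this from the frontend side (the helper name is illustrative; the point is that no backend code runs during the capability probe):

```python
import inspect

# Hypothetical frontend helper: statically check whether a backend
# defines a given hook. inspect.getattr_static deliberately bypasses
# __getattr__, properties, and other dynamic attribute machinery, so
# no backend code executes while probing for capabilities.
def backend_supports(backend, hook_name):
    try:
        inspect.getattr_static(backend, hook_name)
    except AttributeError:
        return False
    return True
```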
- Instead of defining a default value for get_requires_for_build_*,
make it mandatory for get_require_for_build_* and build_* to appear
together; this seems simpler now that we have multiple high-level
operations defined in the same PEP, and also simplifies the
definition of the NotImplemented semantics.
I don't see the value in requiring four lines of boilerplate in every backend definition that doesn't support dynamic dependencies:

    def get_requires_for_build_sdist(*args, **kwargs):
        return ()

    def get_requires_for_build_wheel(*args, **kwargs):
        return ()

That's just pointless noise when we can instead say "Leave those hooks out of the backend implementation entirely if all build dependencies will be declared in pyproject.toml (and if you really want your backend to support dynamic build dependencies, please make sure you have a really good rationale for doing so that isn't already covered by pyproject.toml and the environment marker system defined in PEP 508)".
Agreed.
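As a sketch of how a frontend could treat those hooks as optional (the helper name and the empty-list default are illustrative, not from the PEP text above):

```python
# Hypothetical frontend helper: a missing get_requires_for_build_wheel
# hook simply means "no dynamic build requirements beyond what is
# already declared statically in pyproject.toml".
def get_build_wheel_requires(backend, config_settings=None):
    hook = getattr(backend, "get_requires_for_build_wheel", None)
    if hook is None:
        return []  # only the static pyproject.toml requirements apply
    return list(hook(config_settings))
```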
- Update title to better match the scope we ended up with
I'd prefer to retain the current title, as it describes the desired *outcome* of the change, whereas the proposed revision only describes one of the technical pre-requisites of achieving that outcome (i.e. eliminating the historical dependency on setup.py as the build system's executable interface).

- Add a TODO to decide how to handle backends that don't want to have
multiple hooks called from the same process, including some
discussion of the options.
I don't think that's a TODO: I'm happy with the option of restricting frontends to "one hook invocation per subprocess call".

It only becomes an open question in this revised draft by virtue of making the get_requires_* hooks mandatory, and I have a different resolution for that: keep those hooks optional, so that only backends that genuinely support dynamic build time dependencies will define them (others should either just get users to list any additional static build dependencies in pyproject.toml, or else list any always-needed static dependencies in the backend's own install_requires).
Agreed.
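A minimal sketch of the "one hook invocation per subprocess call" pattern (the helper and its JSON-over-stdout convention are hypothetical, purely to illustrate the restriction):

```python
import json
import subprocess
import sys

# Hypothetical frontend helper: import the backend and run exactly one
# hook in a fresh subprocess, returning its JSON-serialisable result
# over stdout. Each hook call gets its own interpreter, so backends
# never see two hooks invoked from the same process.
def call_hook_in_subprocess(backend_module, hook_name, args):
    runner = (
        "import importlib, json, sys\n"
        "backend = importlib.import_module(sys.argv[1])\n"
        "result = getattr(backend, sys.argv[2])(*json.loads(sys.argv[3]))\n"
        "print(json.dumps(result))\n"
    )
    proc = subprocess.run(
        [sys.executable, "-c", runner,
         backend_module, hook_name, json.dumps(args)],
        check=True, capture_output=True, text=True,
    )
    return json.loads(proc.stdout)
```

Here any callable on an importable module stands in for a backend hook, which is enough to demonstrate the per-subprocess isolation.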