[Distutils] A possible refactor/streamlining of PEP 517

Donald Stufft donald at stufft.io
Sat Jul 15 14:33:01 EDT 2017


> On Jul 15, 2017, at 6:54 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> 
> One particularly frustrating aspect of this discussion is that the
> worst offender for "wheel and sdist are inconsistent" is the way that
> setuptools requires developers to specify build and sdist contents
> separately (setup.py vs MANIFEST.in). That duplication is an obvious
> source of potential inconsistencies, and precisely why we get most of
> the reports we see. Ideally, new backends would not design in such
> inconsistency[1], which means it's easy to see such inconsistencies as
> "that should never happen" or "I don't understand the problem". But we
> will have to deal with the possibility of such backends, and the
> setuptools model isn't *that* unusual (setuptools didn't invent the
> file MANIFEST.in, it just reused the name for its own purpose).
> 
> [1] I don't know enough about flit to be sure, but if the developer
> forgets to check in a new source file, would it be possible for that
> source file be in the wheel but not in the sdist?


I think all of the build tools we’ve looked at so far have this problem to some degree. Flit appears to be the least likely of the bunch to be affected by it, because it tries really hard to yell at you when you have files that aren’t in source control, but as Thomas has indicated that can obviously fail when the VCS is not available for some reason. We’re all well aware of how distutils/setuptools has issues in this arena, and enscons has it too, since it builds two separate lists: the list of files to add to the sdist, and the list of files that get installed.

That is really the fundamental error case here. Whenever you have two different lists of files, one for the sdist and one for the install, you risk the two lists diverging, which gives inconsistent results depending on exactly how they differ.
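
To make that concrete, here’s a purely illustrative setuptools example of the sort of divergence I mean (the project name and files are made up):

    # setup.py -- the list setuptools consults when building the wheel
    from setuptools import setup, find_packages

    setup(
        name="example",
        version="1.0",
        packages=find_packages(),
        # no package_data / include_package_data here, so nothing under
        # example/templates/ ever makes it into the wheel...
    )

    # ...while MANIFEST.in (the *separate* list used for the sdist) says:
    #
    #     recursive-include example/templates *.html
    #
    # so the templates ship in the sdist but silently disappear on install.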

One thing I’d maybe push back on is the idea that a hook can’t fail; that is obviously not attainable. All of these hooks can fail for any number of reasons; the real question is whether a failure is fatal to the entire build process or not.

If the wheel building hook fails, that is obviously a fatal error and a front end has to halt execution at that point because there’s nothing left for us to do (this is actually a distinct change from today, because today if wheel building fails we fall back to trying to do a direct install).

The place we seem to be getting held up on is trying to make a failure to build an sdist non-fatal, so that execution can continue in the case that the sdist failed (or would have failed, depending on the order of operations). The primary driver for sdist errors that wouldn’t also translate into a wheel failure seems to be a missing external tool that can’t be installed via pip as a build requirement. Thinking through all of the tooling that currently exists, as well as any other tooling I can imagine, the main tools that fit into that category are VCS tools (which I think is why they regularly get used as the example of a case where this can fail).
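
To make the two policies concrete, here is a rough sketch of the front-end side (not real pip code; it assumes PEP 517-style build_sdist/build_wheel hooks on a `backend` object, and the unpack_and_chdir helper is made up):

    import tempfile

    def build_a_wheel(backend, wheel_dir, sdist_failure_is_fatal=True):
        with tempfile.TemporaryDirectory() as tmp:
            try:
                sdist_name = backend.build_sdist(tmp)
            except Exception:
                if sdist_failure_is_fatal:
                    raise              # strict: a failed sdist stops everything
                sdist_name = None      # lenient: note the failure and carry on
            if sdist_name is not None:
                unpack_and_chdir(tmp, sdist_name)  # hypothetical helper
            # a failure from here on is always fatal -- nothing left to try
            return backend.build_wheel(wheel_dir)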

I wonder if it would be more useful to simply recommend that, instead of shelling out to random VCS binaries, these projects depend on (or bundle) libraries that interact with the repository directly. For instance, if your project supports git, you can use dulwich or pygit2, and then “building inside of a Docker container without `git` installed” still works.
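
For example, something along these lines (an untested sketch using dulwich; the helper name is mine) would keep working in a container with no `git` binary on $PATH:

    from dulwich.repo import Repo

    def vcs_tracked_files(project_root):
        # Read the git index directly with dulwich instead of shelling
        # out to `git ls-files`; paths come back as bytes relative to
        # the repository root.
        repo = Repo(project_root)
        return sorted(path.decode("utf-8") for path in repo.open_index())

A flit-style backend could then still warn about files on disk that the VCS doesn’t know about, even when /usr/bin/git doesn’t exist.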

This obviously isn’t a 100% solution, since there are going to be some tools people want to use that simply can’t be installed as a Python package. However, I don’t personally think it’s unreasonable for there to be a fatal error when you haven’t satisfied some constraint the package places on the build environment. That might trigger feature requests asking tools to relax their constraints, but assuming those constraints exist for good reason, it seems easy enough to close such issues with a link to an FAQ explaining why they exist.

All of that being said, I don’t personally have a problem with the interface as it currently exists at https://www.python.org/dev/peps/pep-0517/ (assuming that’s the most up to date draft?). The inclusion of a build directory is fine with me, though the fact that Nathaniel is concerned is somewhat concerning to me, given he has far more experience with random build tools than I do. I have some *other* comments about other parts of the PEP, but I’m going to hold off on them until we get the interface, which is the meat of the spec, nailed down and decided.

One thing that struck me while reading this spec is that one of the important things to do is to somewhat divorce our thinking from what specifically pip or tox or whatever will or won’t do with it as the *only* path, and instead make sure it’s flexible enough to implement all of the paths we’re still going to support. While I had been a proponent of making VCS -> sdist -> wheel -> install the only path, it appears I’m in a minority on that (since a lot of the effort has gone into deciding how best to support *not* going through an sdist). If we’re going to support other ways, then I think flexibility is the right approach (as different tools will likely impose different constraints on how they process build directories).

One benefit of that is that we can evolve the actual tooling faster than we can evolve specs (or at least, that seems to be the case!); any spec we create we’re stuck with for a decade or more once it’s been implemented, while the tooling itself lives for far fewer years. That means the tooling can start out fairly strict or hardline, and then we wait and see how the ecosystem reacts. We’re all making guesses about how likely one failure mode or another is with a new crop of tools designed in this decade, and I don’t think we can say for sure which cases are going to be more or less common. This is all a long-winded way of saying that, on the implementation side, it may make sense for pip to be strictly VCS -> sdist -> wheel -> install at first and see what issues that causes for people; if barely anyone has problems, great, we’re done. If a number of folks run into issues that could be solved by whatever mechanism exists for going VCS -> wheel -> install, then we can add an option to support that (eventually making it the default, and then removing the option) [2]. As long as the backend API is there, we can make those decisions more “on the fly”.

All of which is to say I don’t particularly care whether the VCS -> wheel -> install path is specified as *always* doing in-place builds, or whether we add a build directory to choose between out-of-place and in-place builds. Having a robust mechanism in place for that means we can adjust how things *typically* work without going back to the PEP process and throwing everything away.
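
For concreteness, the kind of flexibility I mean is roughly this shape of hook (illustrative only; the real names and signature are whatever the PEP ends up specifying):

    def build_wheel(wheel_directory, config_settings=None, build_directory=None):
        if build_directory is None:
            # in-place build: intermediate artifacts live in the source tree
            pass
        else:
            # out-of-place build: intermediate artifacts go to build_directory
            pass
        # either way the finished .whl lands in wheel_directory
        return "example-1.0-py3-none-any.whl"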

Hopefully that all makes sense and is a useful dump of thoughts.

[1] One note: I noticed there are still instances of prepare_wheel_metadata in the text.

[2] For example, we’ve recently done this with --upgrade in order to better support projects like NumPy. The way pip works isn’t set in stone, and as we get more experience with new things we can adjust it.

—
Donald Stufft



