On Jun 2, 2017, at 10:41 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:

On 2 June 2017 at 23:42, Thomas Kluyver <thomas@kluyver.me.uk> wrote:
As was suggested at some point, I have added a build_sdist hook to my
PR, with the following details:

- A brief definition of the minimal requirements of an sdist.
 - I have limited the definition to gzipped tarballs. Zip files also
 work as sdists, but we're moving towards standardising on tarballs, so
 I think it's simplest to require that of PEP-517 compliant tools.

For the sdist case, I'd prefer to leave the actual archive creation in
the hands of the frontend as far as the plugin API is concerned. That
lets us completely duck the fact that the sdist naming scheme and
exact archive format aren't formally defined anywhere, and for pip's
local build use case, we want the unpacked tree anyway.

In a lot of ways, it's closer in spirit to the wheel metadata
generation hook than it is to the wheel building hook.

I’d prefer to leave the actual creation of the archive up to the frontend in both the sdist and the wheel case. That lets pip, in some cases, skip a compression -> decompression round trip and use the unpacked tree directly. In other cases it won’t allow that, but there it doesn’t add any time or complexity; it just shifts the time spent compressing from the backend to the frontend.

- The build_sdist hook must be defined, but may not always work (e.g. it
may depend on a VCS)

I was going to object to this aspect, but I realised there's a clear
marker file that frontends can use to determine if they're working
with an already exported sdist tree: PKG-INFO

That means the invocation protocol for the additional hook can be:

- if PKG-INFO is present, then just copy the full contents of the
directory without invoking the backend's sdist export hook
- if PKG-INFO is *not* present, then invoke the backend's sdist export
hook to do a filtered export that at least omits any VCS bookkeeping
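
A minimal sketch of that invocation protocol from the frontend's side. The
function name, the export_hook callable and its (source_dir, target_dir)
signature are all assumptions made for illustration; none of them is
defined by the proposal.

```python
import shutil
from pathlib import Path


def prepare_sdist_tree(source_dir, target_dir, export_hook):
    """Populate target_dir (which must not exist yet) with an sdist tree.

    export_hook is a hypothetical stand-in for the backend's sdist export
    hook; its name and signature are assumptions, not part of the proposal.
    Returns "copied" or "exported" to report which path was taken.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    if (source_dir / "PKG-INFO").is_file():
        # Marker file present: this is already an exported sdist tree,
        # so copy the full contents without invoking the backend.
        shutil.copytree(source_dir, target_dir)
        return "copied"
    # No marker file (e.g. a VCS checkout): let the backend do a filtered
    # export that at least omits VCS bookkeeping.
    target_dir.mkdir(parents=True)
    export_hook(source_dir, target_dir)
    return "exported"
```

Because the backend is expected to write PKG-INFO as part of its export,
running the same function again on the result would take the plain-copy
branch.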

- The prepare_build_files hook is optional, and in its absence,
frontends can use build_sdist and extract the files to create a build tree.
- Backends (like flit) where building an sdist has extra requirements
should define prepare_build_files.

Having two hooks still leaves us open to "VCS -> sdist -> build tree
-> wheel" and "VCS -> build tree -> wheel" giving different answers,
and that's specifically the loophole we're aiming to close by
including this in PEP 517 rather than leaving it until later.

Instead, the flow that I think makes sense is "VCS -> sdist tree [->
sdist tree -> sdist tree -> ...] -> wheel", and the above model where
the export filtering is only used when PKG-INFO doesn't exist yet will
give us that.

So my preference is that everything goes through the sdist step, as I think that is most likely to provide consistent builds everywhere, both from a VCS checkout and from an sdist that was released to PyPI. That being said, I am somewhat sympathetic to the idea that generating an sdist might be slow for reasons unrelated to actually building a wheel (for example, documentation might get “compiled” from some kind of source format to a man page, HTML docs, etc.), so I am not against having an optional hook whose job is just to do the copying needed. The requirements would be:

* The build_sdist hook is mandatory, but may fail (as any of these commands may fail tbh) if some invariant required by the build backend isn’t satisfied.
* The copy_the_files hook is optional. If it exists, it SHOULD produce a tree in which calling the build_wheel hook produces a wheel equivalent to the one that would have been built had the build_sdist hook been called instead.
* If the copy_the_files hook is not defined, then the build frontend is free to just directly call the build_sdist hook instead.
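
As a rough sketch of how a frontend could apply these requirements: the
backend is modelled here as any object exposing the hooks as attributes,
copy_the_files is only the placeholder name used above, and the
(source_dir, build_dir) signatures are assumptions rather than anything the
PEP specifies.

```python
def prepare_build_tree(backend, source_dir, build_dir):
    """Prepare a build tree, preferring the optional copy hook if defined.

    Returns the name of the hook that was used. Any exception raised by a
    hook propagates, since either hook is allowed to fail.
    """
    copy_hook = getattr(backend, "copy_the_files", None)
    if copy_hook is not None:
        # Optional fast path: the backend copies just what a build needs.
        copy_hook(source_dir, build_dir)
        return "copy_the_files"
    # Fallback: build_sdist is mandatory, so it is always available.
    # (Unpacking its output into build_dir is left out of this sketch.)
    backend.build_sdist(source_dir, build_dir)
    return "build_sdist"
```

A backend that defines only build_sdist and build_wheel takes the fallback
branch, which is the path of least resistance.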

I think that represents a pretty reasonable trade-off: the path of least resistance for a build backend is to just define build_sdist and build_wheel and leave the two optional hooks omitted. I suspect that for a lot of pure Python packages (although Thomas has said not flit) those two hooks will be fast enough that they are all the backend needs to implement. However, in cases where they’re not, we provide both the copy_the_files and the wheel_metadata hooks to allow short-circuiting a possibly more complex build process and give end users a better UX. That kind of goes against my “good intentions don’t matter” statement from before, but I also think that practicality beats purity ;)

Donald Stufft