[Distutils] PEP 517: Open questions around artifact export directories

Donald Stufft donald at stufft.io
Sat Jun 10 12:59:13 EDT 2017


> On Jun 10, 2017, at 12:23 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> The fact this is also true for both "setup.py bdist_wheel" and for
> "enscons" would then be the strongest argument in favour of keeping
> the current "build_wheel" API: existing backends are already building
> wheels for publication directly, so it makes sense to standardise on
> what they're already doing, rather than requiring them to do extra
> work just to support PEP 517.
> 
> We also expect even "build for installation" use cases to want the
> wheel as an archive for caching purposes.


We have a few possible cases where the build-the-wheel backend is going to be called:

1) We’re creating a wheel ahead of time to publish somewhere.
2) We’re creating a wheel just in time that we’re going to cache.
3) We’re creating a wheel just in time that we’re not going to cache.

For (1), it mostly doesn’t matter whether this happens in the frontend or the backend. Doing it in the frontend allows a small performance benefit when someone is using a frontend that can be told to “build + upload this wheel”: the frontend needs to read some files (particularly the METADATA file) to process the upload, and if it already has the wheel unzipped it doesn’t need to round-trip the METADATA file through a zip + unzip cycle. However, this speedup is minor, and it only matters when a single command does the build + upload in one go.
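As a rough illustration of that round-trip, here is a minimal sketch of the two ways a frontend might read METADATA after a build; the wheel path and the dist-info directory name are hypothetical:

    import zipfile
    from pathlib import Path

    def metadata_from_wheel(wheel_path, dist_info="example-1.0.dist-info"):
        # Round-trip through the archive: open the zip and decompress the
        # member before the bytes can be used.
        with zipfile.ZipFile(wheel_path) as whl:
            return whl.read(f"{dist_info}/METADATA").decode("utf-8")

    def metadata_from_tree(export_dir, dist_info="example-1.0.dist-info"):
        # With an unpacked tree the frontend reads the file directly,
        # with no compression round-trip.
        return (Path(export_dir) / dist_info / "METADATA").read_text("utf-8")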

For (2), as the wheel cache is currently implemented it doesn’t matter whether the frontend does it or not. However, I could see us in the future starting to cache unzipped wheels rather than zipped ones. That would speed things up on every subsequent install, since we would stop having to unzip the file over and over again. I don’t *know* that we’re going to do that; it would take some investigation to work out whether storing the contents decompressed (which takes up more space and more inodes) is worth the tradeoff of removing that processing on each install. Another possible option along this line is to keep storing zip files, but use ZIP_STORED instead of ZIP_DEFLATED, so that we still keep individual .whl files, just ones whose members aren’t compressed, reducing the time spent on (de)compression.
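A minimal sketch of that ZIP_STORED idea, assuming a plain re-pack of an already-built wheel into the cache (the file names here are made up):

    import zipfile

    def repack_uncompressed(src_wheel, dst_wheel):
        # Rewrite every member with ZIP_STORED so that later installs can
        # skip the decompression step entirely.
        with zipfile.ZipFile(src_wheel) as src, \
             zipfile.ZipFile(dst_wheel, "w",
                             compression=zipfile.ZIP_STORED) as dst:
            for info in src.infolist():
                dst.writestr(info.filename, src.read(info))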

For (3), it allows us to skip work entirely: instead of having the backend zip up the build output and then immediately unzip it again in the frontend, we can simply not zip it up to start with. I haven’t explicitly tested this with bdist_wheel, but with the sdist command, removing the step that actually creates the .tar.gz eliminates the bulk of the processing time of ``python setup.py sdist``, and I would guess the same is true for pure Python wheels.
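For anyone who wants to check that guess against their own project, here is a rough way to time just the archiving step in isolation; the tree layout and the output name are hypothetical:

    import time
    import zipfile
    from pathlib import Path

    def time_archive_step(tree, out="example-1.0-py3-none-any.whl"):
        # Zip up an already-built tree, roughly what the final step of a
        # wheel build does, and report how long the archiving alone took.
        start = time.perf_counter()
        with zipfile.ZipFile(out, "w",
                             compression=zipfile.ZIP_DEFLATED) as whl:
            for path in sorted(Path(tree).rglob("*")):
                if path.is_file():
                    whl.write(path, path.relative_to(tree))
        return time.perf_counter() - start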

Ultimately, allowing the frontend to do this gives us flexibility. The backend doesn’t know why we’re building the wheel, so it can’t make decisions that optimize the workflow; it can only do the same thing every time. If we allow the frontend to manage the archiving, then the frontend, which does know why we’re building things, can alter what it does with the results depending on what it expects to be done with them.
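To make that concrete, here is a hypothetical frontend-side sketch under an “export an unpacked tree” model; the purpose flag and helper names are illustrative, not anything specified by PEP 517:

    import zipfile
    from pathlib import Path

    def pack_wheel(export_dir, out_path):
        # Archive the exported tree into a .whl only when one is needed.
        with zipfile.ZipFile(out_path, "w",
                             compression=zipfile.ZIP_DEFLATED) as whl:
            for path in sorted(Path(export_dir).rglob("*")):
                if path.is_file():
                    whl.write(path, path.relative_to(export_dir))
        return out_path

    def handle_build(purpose, export_dir,
                     out_path="dist/example-1.0-py3-none-any.whl"):
        if purpose in ("publish", "cache"):
            return pack_wheel(export_dir, out_path)
        # Plain install: consume the tree directly and skip the
        # zip + unzip round-trip.
        return export_dir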

Keeping it in the backend doesn’t really buy us much of anything, except that a handful of backend authors don’t have to make relatively minor adjustments to their code bases. In a vacuum I can’t see any compelling reason for the backend to do the archiving at all; the only reason I think we’re talking about it is that this is just how the backends work now. But again, changing that is almost certainly going to be trivial to do.


—
Donald Stufft


