[Distutils] PEP 517: Open questions around artifact export directories
Donald Stufft
donald at stufft.io
Thu Jun 15 11:12:12 EDT 2017
> On Jun 15, 2017, at 10:10 AM, C Anthony Risinger <c at anthonyrisinger.com> wrote:
>
> On Tue, Jun 13, 2017 at 8:53 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 13 June 2017 at 19:44, Thomas Kluyver <thomas at kluyver.me.uk> wrote:
> > On Tue, Jun 13, 2017, at 02:27 AM, Nick Coghlan wrote:
>
> > I've updated the PR to specify zip archives for build_wheel and .tar.gz
> > archives for build_sdist.
>
> +1
>
> I've added one suggestion, which is to explicitly require PAX_FORMAT
> for the sdist tarballs produced this way (that's a POSIX format
> standardised in 2001 and supported by both 2.7 and 3.x that
> specifically requires that the paths be encoded as UTF-8). While the
> standard library does still default to GNU_FORMAT in general, the
> stated rationale for doing so (it being more widely supported than
> PAX_FORMAT) was last updated more than 10 years ago, and I don't think
> it applies here.
>
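For reference, the PAX_FORMAT suggestion quoted above can be sketched with the standard library's tarfile module. This is a minimal illustration, not what the PEP or any build backend actually does: the helper name build_sdist_tarball and the file name are hypothetical, and the archive is built in memory only to keep the example self-contained.

```python
import io
import tarfile


def build_sdist_tarball(files):
    """Return the bytes of a gzip-compressed tarball written in PAX format.

    `files` maps archive paths to byte contents (hypothetical helper, for
    illustration only).
    """
    buf = io.BytesIO()
    # format=tarfile.PAX_FORMAT selects the POSIX.1-2001 pax format, which
    # stores non-ASCII path names as UTF-8 in extended headers.
    with tarfile.open(fileobj=buf, mode="w:gz", format=tarfile.PAX_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# A made-up sdist layout with a non-ASCII file name ("donn\u00e9es" is "données").
archive = build_sdist_tarball({"example-1.0/donn\u00e9es.txt": b"hello\n"})

# Reading the archive back recovers the same path.
with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
    names = tar.getnames()
```

Passing the format explicitly means the result does not depend on whichever default the running Python's tarfile happens to use.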
> I'm not trying to open a bikeshedding opportunity here -- and I tried to ignore it, honest! -- but why are tarballs preferable to zipfiles for sdists?
>
> I looked around the 517 threads to see if it had been covered already, and all I found was that zipfiles have additional PKG-INFO expectations in existing implementations, and other honorable mentions of their features over tarballs.
>
> I've never understood the anti-affinity towards zip because the format itself seems superior in many ways, such as the ability to easily append or replace-via-append (which might actually help perf when being used as an interchange format, with a repack/prune at the end), compress individual files, and the brilliance of placing the central directory/manifest at the end, allowing it to be appended to binaries, etc. and allowing rapid indexing of files. Tarballs are a black box.
>
> Just seems a little odd/arbitrary to me that wheel is zip, python supports zip importing, sdists are often zip, and Windows is zip-central, but we'd decide to codify tar.gz. It doesn't affect me personally because I'm Linux all the way down and barely remember how to use Windows, but with all the existing zip usage, and technical superiority(?), if we are going to pick something, why not that? At that point Python is all-zip and no-tar.
>
> It's not a strong opinion really, but since the PEP does attempt to limit what's currently possible, can we add some verbiage as to why tar.gz is preferred? Or consider it with more scrutiny?
>
Basically it’s the least disruptive option: the vast bulk of sdists already use ``.tar.gz``, multiple downstream redistributors need to do extra work to consume a ``.zip`` rather than a ``.tar.gz``, and the technical benefits of zip for wheel don’t really matter much in the context of an sdist. Zip isn’t a flat-out technical win either; for instance, a ``.tar.gz`` can compress smaller than a ``.zip`` because its compression acts over the entire set of files rather than on a per-file basis.
But mostly it’s just that most sdists are ``.tar.gz``, and most Pythons, except older ones on Windows, default to producing ``.tar.gz``.
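
To illustrate the compression point, here is a rough sketch comparing the two formats on a set of many small, similar files. The file names and contents are made up for the example, and this is not a benchmark: real sdists vary, and the gap depends on how much redundancy exists across files.

```python
import io
import tarfile
import zipfile


def make_files(count=50):
    """Build hypothetical, highly similar source files (illustrative data only)."""
    body = b"def helper(value):\n    return value * 2\n" * 20
    return {
        "pkg/module_%d.py" % i: b"# module %d\n" % i + body
        for i in range(count)
    }


def targz_size(files):
    """Size in bytes of a .tar.gz holding `files`; gzip sees one continuous stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return len(buf.getvalue())


def zip_size(files):
    """Size in bytes of a .zip holding `files`; each member is deflated on its own."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return len(buf.getvalue())


files = make_files()
tar_bytes = targz_size(files)
zip_bytes = zip_size(files)
print("tar.gz:", tar_bytes, "bytes; zip:", zip_bytes, "bytes")
```

On input like this, the tarball comes out smaller because the compressor can reuse what it saw in earlier files, while zip restarts for every member and also stores per-file headers plus a central directory.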
—
Donald Stufft