[Distutils] PEP 527 - Removing Un(der)used file types/extensions on PyPI

Donald Stufft donald at stufft.io
Tue Aug 23 19:59:23 EDT 2016


> On Aug 23, 2016, at 7:54 PM, Nathaniel Smith <njs at pobox.com> wrote:
> 
> On Aug 23, 2016 12:57 PM, "Donald Stufft" <donald at stufft.io> wrote:
> >
> [...]
> > However, PyPI does need
> > to do work when a file is uploaded to PyPI. For instance, it needs
> > to verify that the file being uploaded is valid, it needs to ensure
> > that it’s for the project it claims to be for, etc. To do this, PyPI
> > has to know things about the file format itself, and what it can
> > expect from it. One bug that has cropped up time and time again
> > is people accidentally uploading a package whose internal metadata
> > says version “1.0”, when they registered it with PyPI as
> > version “1.0a1” or something like that, which causes a lot
> > of the tooling to do subtly weird and broken things. PyPI should be
> > double checking the internal metadata of these files, but it can’t
> > do that unless it can expect that metadata to exist in those files
> > and it has to implement it for each file type (and then, that has to
> > be maintained).
> 
> Am I understanding correctly that PyPI needs to start peeking inside sdists but hasn't started doing that yet? If that's correct, then I just want to double check that the cost of implementing this upcoming feature has been factored into the .zip-vs-.tar.gz discussion, because code for peeking inside .tar.gz files is presumably harder to write and more expensive to run than code for peeking inside .zip files. (But maybe only negligibly harder; I haven't tried writing such code myself, and uploads are relatively rare compared to downloads.) I guess the worst case would be if it turns out PyPI needs to look at multiple files inside each sdist, where .tar.gz access becomes quadratic unless you're very clever.
> 
> -n
> 
Yes, though I’m not really worried about the time it takes. Uploads happen something like 700 times a day, so being a touch slower isn’t the worst thing in the world, particularly if it means that our disk space or bandwidth needs are lower.
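
As a rough illustration of the kind of check being discussed, here is a minimal sketch that reads the version from the PKG-INFO file inside an sdist and compares it with the version claimed at upload time. This is not PyPI's actual code. The function names (internal_version, version_matches) are hypothetical, the sketch assumes the usual sdist layout of a single top-level directory containing PKG-INFO, and it compares version strings literally rather than normalizing them first.

```python
import io
import tarfile
import zipfile
from email.parser import HeaderParser


def _pkg_info_from_tar(data: bytes) -> str:
    # A .tar.gz has no central index, so walk the members in order.
    # One sequential pass avoids re-reading the stream per lookup.
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tf:
        for member in tf:
            parts = member.name.split("/")
            if member.isfile() and len(parts) == 2 and parts[1] == "PKG-INFO":
                return tf.extractfile(member).read().decode("utf-8")
    raise ValueError("no top-level PKG-INFO found")


def _pkg_info_from_zip(data: bytes) -> str:
    # A .zip has a central directory, so names can be listed cheaply.
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for name in zf.namelist():
            parts = name.split("/")
            if len(parts) == 2 and parts[1] == "PKG-INFO":
                return zf.read(name).decode("utf-8")
    raise ValueError("no top-level PKG-INFO found")


def internal_version(data: bytes, filename: str) -> str:
    """Return the Version field from the sdist's PKG-INFO (hypothetical helper)."""
    if filename.endswith(".tar.gz"):
        text = _pkg_info_from_tar(data)
    elif filename.endswith(".zip"):
        text = _pkg_info_from_zip(data)
    else:
        raise ValueError("unsupported sdist format: " + filename)
    # PKG-INFO uses RFC 822 style headers, which the email parser can read.
    return HeaderParser().parsestr(text)["Version"]


def version_matches(data: bytes, filename: str, claimed_version: str) -> bool:
    # Assumption: a literal string comparison; a real check would likely
    # normalize both versions before comparing.
    return internal_version(data, filename) == claimed_version
```

With something like this, an upload registered as “1.0a1” whose PKG-INFO says “1.0” would fail version_matches and could be rejected. The tar path reads the archive once from start to finish, which is the simple way to avoid the repeated-scan cost when only one file is needed.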

--
Donald Stufft


