
On Sun, Oct 11, 2015 at 11:00 PM, Robert Collins <robertc@robertcollins.net> wrote:
On 12 October 2015 at 18:36, Nathaniel Smith <njs@pobox.com> wrote: [...]
the sdist name instead of the wheel name, it can actually do it
but the sdist and the wheel have to have the same name - or do you mean the filename on disk, vs the distribution name?
I mean the distribution name - there's no way to guarantee that building foo-1.0.zip won't spit out bar-7.4.whl, where by "no way" I mean "it's literally undecidable". I mean, if someone actually did this it would be super weird and we would all shun them, but our code and specs still need to be prepared for the possibility. IIUC this is why PyPI can't trust PKG-INFO: 99.9% of the time the metadata in PKG-INFO matches what you will get when you run setup.py, but right now PyPI wants to know what setup.py will do, and there's no way to know if it will be the same as what PKG-INFO says, so it just doesn't trust PKG-INFO. OTOH if we redefine PyPI's goal as being, figure out what's in PKG-INFO (or whatever replaces it), and declare that it's okay (for PyPI's purposes) if that doesn't match what the build system will eventually do, then that's a viable way forward.
reliably in a totally static way, without having to run arbitrary code to validate this. OTOH pip will always have to be prepared to handle the possibility of mismatch between what it was expecting based on the sdist metadata and what it actually got after building it, so we might as well acknowledge that in our mental model.
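To make "be prepared for a mismatch" concrete, here is a rough sketch of the kind of check a tool could run after building. It is not pip's actual code: the function names and the WheelMismatchError class are made up for illustration, the filename parsing is deliberately simplistic, names are compared after normalization in the style of PEP 503, and versions are compared as plain strings (a real tool would want proper version parsing).

```python
import re


class WheelMismatchError(Exception):
    """Hypothetical error: the built wheel is not what the sdist promised."""


def normalize(name):
    # Name normalization in the style of PEP 503: runs of "-", "_" and "."
    # collapse to a single "-", and the result is lowercased.
    return re.sub(r"[-_.]+", "-", name).lower()


def split_sdist_filename(filename):
    # Simplistic assumption: "<name>-<version>" followed by .tar.gz or .zip.
    for ext in (".tar.gz", ".zip"):
        if filename.endswith(ext):
            name, _, version = filename[: -len(ext)].rpartition("-")
            if name and version:
                return name, version
    raise ValueError("unrecognized sdist filename: %r" % filename)


def split_wheel_filename(filename):
    # Wheel filenames start with "<name>-<version>-"; the remaining
    # dash-separated fields are tags that this sketch ignores.
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("malformed wheel filename: %r" % filename)
    return parts[0], parts[1]


def check_built_wheel(sdist_filename, wheel_filename):
    """Raise WheelMismatchError if the wheel's name or version differs."""
    exp_name, exp_version = split_sdist_filename(sdist_filename)
    got_name, got_version = split_wheel_filename(wheel_filename)
    if (normalize(exp_name), exp_version) != (normalize(got_name), got_version):
        raise WheelMismatchError(
            "expected %s-%s from %s, but the build produced %s"
            % (exp_name, exp_version, sdist_filename, wheel_filename)
        )
```

With this sketch, check_built_wheel("foo-1.0.zip", "foo-1.0-py2.py3-none-any.whl") passes silently, while the foo-1.0.zip to bar-7.4 case raises WheelMismatchError, which the caller can turn into an error message or whatever else seems sensible.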
One potential advantage of this approach is that we might be able to talk ourselves into trusting the existing PKG-INFO as providing static metadata about the sdist, and thus PyPI at least could start trusting it for things like the "description" field, and if we define a new
The challenge is the 40K broken packages up there on PyPI. Basically pip has a bugfix for any of:
- sdists built using distutils
- sdists built using random build systems that don't understand what an sdist is (e.g. automake)
- sdists built using versions of setuptools that had a bug in this area
There is no corrective mechanism for broken packages other than route-around-it-while-you-ask-the-author-to-upload-a-fix.
IIUC what PyPI wants to do with PKG-INFO is read out stuff like the description and trove classifiers fields. Are there really 40K sdists on PyPI that have PKG-INFO files and where those files contain incorrect descriptions and so forth? I mean, obviously someone would have to check :-) But it seems unlikely, since almost everyone uploads by running 'sdist upload' or twine or something similarly automated.
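For reference, reading those fields out of a PKG-INFO file needs no code execution at all, since the file uses RFC 822-style headers that the standard library can parse. The sketch below is only an illustration: the sample metadata and the read_static_metadata helper are invented, and whether the values deserve trust is exactly the open question in this thread.

```python
from email.parser import HeaderParser

# Invented sample in the PKG-INFO header format, for illustration only.
SAMPLE_PKG_INFO = """\
Metadata-Version: 1.1
Name: foo
Version: 1.0
Summary: An example package
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
"""


def read_static_metadata(pkg_info_text):
    """Parse PKG-INFO text statically, without running setup.py."""
    msg = HeaderParser().parsestr(pkg_info_text)
    return {
        "name": msg["Name"],
        "version": msg["Version"],
        "summary": msg["Summary"],
        # Classifier may appear several times, so collect every occurrence.
        "classifiers": msg.get_all("Classifier", []),
    }


if __name__ == "__main__":
    print(read_static_metadata(SAMPLE_PKG_INFO))
```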
So I think to tackle the 'please trust the metadata in the sdist' problem, one needs to have a graceful ramp-up of that trust with robust backoff mechanisms that don't involve 50% of PyPI users hating on that one old project in the corner everyone has a dep on but that is actually moribund and not doing uploads. I can imagine several such routes, including a crowdsourced blacklist - but it's going to be (like we're dealing with with the automatic wheel cache already) years of bug reports until things age out.
sdist format then it would be possible to generate its static metadata from current setup.py files (e.g. by modifying setuptools's sdist command). Contrast this with the other approach, where getting any kind of static source-of-truth would require rewriting almost all existing setup.py files.
We already generate static metadata from current setup.py files: setup.py egg_info does precisely that. There, bug fixed ;).
I'm pretty sure that merely making it so 'setup.py sdist' created a file that contained the output from egg_info would not solve the current problem. That's pretty much exactly what the existing PKG-INFO *is*, isn't it? Yet apparently no-one trusts it.
The challenge, of course, is that there are a few places where pip actually does need to know something about wheels based on examining an sdist -- in particular name and version and (controversially) dependencies. But this can/should be addressed explicitly, e.g. by writing down a special rule about the name and version fields.
I'm sorry, I don't follow.
E.g., we can document that if you have a sdist foo-1.0, then pip and similar tools will expect this to generate a foo-1.0 wheel (but be prepared to do something sensible if this doesn't happen, like give an error message or whatever). That's really all pip needs, right?
-n
--
Nathaniel J. Smith -- http://vorpus.org