Nick Coghlan <ncoghlan <at> gmail.com> writes:
I believe Paul's concern is with anything that suggests that arbitrary *third party* code can be run from wheel files, when the reality is that it is fairly easy to accidentally write code that assumes it is installed on the filesystem in a way that isn't easy for a quick scan of the files in the zip archive to detect (especially since the PEP 376 installation database PEP doesn't include any support for arbitrary metapath importers).
That is a valid concern, but no one is suggesting that arbitrary third party code can run from wheel files, just as zipimport makes no guarantees about zipped code working.
By contrast PEP 441 is a *distribution* utility - the creator of the application is expected to ensure that doing so actually works correctly before publishing their app that way, just as we would expect py2exe, py2app and cx-freeze users to do.
True, but there's no reason why wheels couldn't have some metadata indicating that this diligence has been exercised by the wheel creator.
With the "reference implementation" position that distlib is likely to occupy in a post-PEP-426/440/459 world, though, there's an additional legitimate concern about allowing end users to easily distinguish between "this API is fully supported by the PyPA as part of the reference implementation for metadata 2.0" and "this is an experimental packaging related API that may or may not be useful in general, and some members of the PyPA may still have grave reservations about it".
At the moment, distlib contains both kinds of API, and it confuses *us*, let alone anyone else that isn't closely following along on distutils-sig. As long as distlib is serving the dual role of providing both "the reference implementation for metadata 2.0" and "some experimental packaging related APIs", we're going to get concerns like this one arising. If there was a clear way to distinguish them (ideally with a separate project for either the reference implementation or the experimental stuff, but even a distinct namespace within the distlib project would help a great deal), I suspect there would be less concern.
These are social concerns perhaps more than technical concerns, and to me they lack specificity. Of course some of the APIs in distlib are new and untried-except-by-me, but the way to allay concerns is to focus on specifics, force out the details of the concerns and then see how best they can be addressed. This is not doable with "zipped-eggs-were-bad" rhetoric. Details generally help to identify what the real problem is. For example, Donald raised the spectre of security vulnerabilities with his mention of Mitre and CVEs, but there were no specifics beyond that. I found a discussion where someone had set PYTHON_EGG_CACHE to /tmp. I can certainly see the negative security implications of that, but the finger was pointed at the using applications rather than setuptools. Even though setuptools specifically added code as a remedy to warn when the env var pointed to a world-writeable directory, this was seen as trying to be helpful rather than patching a vulnerability. Of course, if I've misunderstood something in that discussion or missed some other security issue, then some pointers would help move the discussion along.
In the specific case of distlib.mount, if it's eventually combined with a metadata extension like "distlib.mount" which packages must export in order for the command to allow them to be automatically used that way, then I don't see anything wrong with it *in general* - it's a natural extension of the setuptools "zip_safe" flag, but with the ability to include additional details (like whether or not there are C extensions that need to be automatically extracted).
Are you talking just about adding wheels to sys.path, or do you mean the extension-extraction stuff? Note that distlib's Wheel.mount does a compatibility check and addition to sys.path, which I feel is not especially controversial and better than just adding to sys.path, which user code can now do, anyway. But nothing else happens, unless specific metadata is provided in the wheel to enable it. While it's not specifically a "distlib.mount" export, there is a facility to ask for extensions to be extracted, and in the absence of metadata asking for this, no extraction occurs.
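To make the "compatibility check, then add to sys.path" step concrete, here is a minimal stdlib-only sketch. It is not distlib's code: the names `parse_wheel_tags`, `mount_wheel` and the `supported_tags` argument are made up for illustration, and the check only looks at the tags encoded in the wheel filename.

```python
import os
import sys


def parse_wheel_tags(filename):
    """Return the set of (python, abi, platform) tags in a wheel filename."""
    base = os.path.basename(filename)
    if not base.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = base[:-4].split("-")
    # name-version[-build]-pytag-abitag-plattag
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: %r" % filename)
    pyvers, abis, plats = parts[-3:]
    # Compressed tag sets such as "py2.py3" expand to several tags.
    return {
        (py, abi, plat)
        for py in pyvers.split(".")
        for abi in abis.split(".")
        for plat in plats.split(".")
    }


def mount_wheel(wheel_path, supported_tags, append=True):
    """Hypothetical helper: refuse incompatible wheels, else add to sys.path."""
    if not parse_wheel_tags(wheel_path) & set(supported_tags):
        raise RuntimeError("wheel is not compatible with this Python")
    if wheel_path not in sys.path:
        if append:
            sys.path.append(wheel_path)
        else:
            sys.path.insert(0, wheel_path)
```

The caller supplies the tags it considers compatible, for example `mount_wheel(path, [("py3", "none", "any")])`. Nothing is extracted; importing from the mounted wheel relies on the standard zipimport machinery.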
goes further than the current EXTENSIONS approach - this proposal would be akin to *requiring* an empty EXTENSIONS file, and/or the setuptools zip_safe flag in order to allow mounting of even the pure Python wheel. Such a conservative approach is also the antithesis of the setuptools "attempt to guess": if the package publisher doesn't explicitly opt in to zip support, then distlib.mount would assume that it is *not* supported (but may provide an API for the caller to override that, like "assume_zip_safe=True" or "force=True").
I have no problem with adding wheel metadata to allow/disallow even adding to sys.path - it's effectively just like another step in the compatibility check. It would make most sense to place this in the WHEEL metadata, rather than pydist.json or similar, since it relates to the contents of a particular wheel rather than the distribution in general.
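As a rough sketch of what such an opt-in check could look like, the following reads the WHEEL file from the archive and only allows mounting when a flag is present. The key name `Mount-Safe` is purely hypothetical (no such key exists in the wheel spec today), as are the function names; the `force` argument mirrors the override suggested above.

```python
import zipfile
from email.parser import Parser

# Hypothetical key: not part of any current wheel specification.
MOUNT_KEY = "Mount-Safe"


def read_wheel_metadata(wheel_path):
    """Parse the RFC 822-style WHEEL file inside a wheel archive."""
    with zipfile.ZipFile(wheel_path) as zf:
        names = [n for n in zf.namelist() if n.endswith(".dist-info/WHEEL")]
        if len(names) != 1:
            raise ValueError("expected exactly one WHEEL file, found %d" % len(names))
        text = zf.read(names[0]).decode("utf-8")
    return Parser().parsestr(text)


def may_mount(wheel_path, force=False):
    """Conservative default: mounting is allowed only on explicit opt-in."""
    if force:
        return True  # the caller takes responsibility
    metadata = read_wheel_metadata(wheel_path)
    return metadata.get(MOUNT_KEY, "false").strip().lower() == "true"
```

A wheel whose WHEEL file lacks the flag is treated as not mountable, so the publisher has to opt in rather than the tool guessing.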
However, like Paul, I have some concerns about a still experimental API like that being in the metadata 2.0 reference implementation, since that will likely end up having to deal with stdlib-like levels of backwards compatibility requirements, and removing experimental APIs that we later decided we weren't happy with could prove problematic.
But we're talking about the Python 3.5 time-frame here, and 3.4 isn't even out yet. ISTM there is plenty of time to get these sorts of issues ironed out. While I tend to favour backward compatibility wherever possible, distlib is nowhere near 1.0, and so distlib users (a small number, from what I can see) could expect some API breakage if there's no sensible alternative.

Regards,

Vinay Sajip