[Distutils] Using Wheel with zipimport

Donald Stufft donald at stufft.io
Wed Jan 29 15:22:21 CET 2014

On Jan 29, 2014, at 8:59 AM, Vinay Sajip <vinay_sajip at yahoo.co.uk> wrote:

> Nick Coghlan <ncoghlan <at> gmail.com> writes:
>> I believe Paul's concern is with anything that suggests that arbitrary
>> *third party* code can be run from wheel files, when the reality is
>> that it is fairly easy to accidentally write code that assumes it is
>> installed on the filesystem in a way that isn't easy for a quick scan
>> of the files in the zip archive to detect (especially since the PEP
>> 376 installation database doesn't include any support for
>> arbitrary metapath importers).
> That is a valid concern, but no one is suggesting that arbitrary third
> party code can run from wheel files, just as zipimport makes no guarantees
> about zipped code working.
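
As a rough illustration of the filesystem assumption described above
(a sketch, not taken from any real project): the hypothetical package
demo_pkg below reads a data file in two ways. Building a path from
__file__ and calling open() fails once the package is imported from a
zip archive, while pkgutil.get_data() goes through the package's
loader and keeps working. The names demo_pkg, build_archive and
try_both are made up for this example.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Source of a tiny, hypothetical package with one data file next to it.
INIT_SRC = '''
import os
import pkgutil

def read_via_filesystem():
    # Assumes the package lives in a real directory on disk.
    path = os.path.join(os.path.dirname(__file__), "data.txt")
    with open(path) as f:
        return f.read()

def read_via_loader():
    # Asks the package's loader for the data, so it also works from a zip.
    return pkgutil.get_data(__name__, "data.txt").decode("utf-8")
'''


def build_archive(directory):
    """Write demo_pkg into a zip archive and return the archive path."""
    archive = os.path.join(directory, "demo.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("demo_pkg/__init__.py", INIT_SRC)
        zf.writestr("demo_pkg/data.txt", "hello")
    return archive


def try_both(archive):
    """Import demo_pkg from the archive and try both access styles."""
    sys.path.insert(0, archive)
    try:
        importlib.invalidate_caches()
        pkg = importlib.import_module("demo_pkg")
        try:
            fs_result = pkg.read_via_filesystem()
        except OSError as exc:
            # The exact exception type depends on the platform.
            fs_result = type(exc).__name__
        return fs_result, pkg.read_via_loader()
    finally:
        # Undo the import so repeated runs start from a clean state.
        sys.path.remove(archive)
        sys.modules.pop("demo_pkg", None)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(try_both(build_archive(tmp)))
```

Nothing in the file listing of the archive distinguishes the two
functions, which is why a quick scan of the zip contents cannot tell
whether the code is safe to run in place.
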
>> By contrast PEP 441 is a *distribution* utility - the creator of the
>> application is expected to ensure that doing so actually works
>> correctly before publishing their app that way, just as we would
>> expect py2exe, py2app and cx-freeze users to do.
> True, but there's no reason why wheels couldn't have some metadata
> indicating that this diligence has been exercised by the wheel creator.
>> With the "reference implementation" position that distlib is likely to
>> occupy in a post-PEP-426/440/459 world, though, there's an additional
>> legitimate concern about allowing end users to easily distinguish
>> between "this API is fully supported by the PyPA as part of the
>> reference implementation for metadata 2.0" and "this is an
>> experimental packaging related API that may or may not be useful in
>> general, and some members of the PyPA may still have grave
>> reservations about it".
>> At the moment, distlib contains both kinds of API, and it confuses
>> *us*, let alone anyone else that isn't closely following along on
>> distutils-sig. As long as distlib is serving the dual role of
>> providing both "the reference implementation for metadata 2.0" and
>> "some experimental packaging related APIs", we're going to get
>> concerns like this one arising. If there was a clear way to
>> distinguish them (ideally with a separate project for either the
>> reference implementation or the experimental stuff, but even a
>> distinct namespace within the distlib project would help a great
>> deal), I suspect there would be less concern.
> These are social concerns perhaps more than technical concerns, and to
> me they lack specificity. Of course some of the APIs in distlib are
> new and untried-except-by-me, but the way to allay concerns is to focus
> on specifics, force out the details of the concerns and then see how best
> they can be addressed. This is not doable with "zipped-eggs-were-bad"
> rhetoric. Details generally help to identify what the real problem is. For
> example, Donald raised the spectre of security vulnerabilities with his
> mention of Mitre and CVEs, but there were no specifics beyond that.
> I found a discussion where someone had set PYTHON_EGG_CACHE to /tmp. I can 
> certainly see the negative security implications of that, but the finger
> was pointed at the using applications rather than setuptools. Even
> though setuptools specifically added code as a remedy to warn when the
> env var pointed to a world-writeable directory, this was seen as trying to
> be helpful rather than patching a vulnerability. Of course, if I've
> misunderstood something in that discussion or missed some other
> security issue, then some pointers would help move the discussion along.

Mitre’s rules for CVEs are not entirely obvious to people who are not
familiar with them. Generally, if the feature *can* be used securely,
or there is no evidence that the author intended the code to be
secure, they will not issue a CVE. The issue is that the feature is a
very attractive footgun: it makes it easy for people to do the wrong
thing, with very bad results.
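
For reference, the kind of check mentioned in the quoted text (warning
when PYTHON_EGG_CACHE points to a world-writable directory) could look
roughly like the sketch below. This is an illustration only, not the
code setuptools actually uses; the function names are made up, and the
permission test assumes POSIX mode bits.

```python
import os
import stat


def is_world_writable(path):
    """Return True if 'other' users have write permission on path (POSIX)."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IWOTH)


def check_cache_dir(env_var="PYTHON_EGG_CACHE"):
    """Return a warning string if the directory named by env_var is
    world-writable, or None if the variable is unset or the directory
    looks safe."""
    path = os.environ.get(env_var)
    if not path or not os.path.isdir(path):
        return None
    if is_world_writable(path):
        return "%s points to a world-writable directory: %s" % (env_var, path)
    return None
```

A check like this only warns; it does not stop an application from
pointing the cache at an unsafe location.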

>> In the specific case of distlib.mount, if it's eventually combined
>> with a metadata extension like "distlib.mount" which packages must
>> export in order for the command to allow them to be automatically used
>> that way, then I don't see anything wrong with it *in general* - it's
>> a natural extension of the setuptools "zip_safe" flag, but with the
>> ability to include additional details (like whether or not there are C
>> extensions that need to be automatically extracted).
> Are you talking just about adding wheels to sys.path, or do you mean
> the extension-extraction stuff? Note that distlib's Wheel.mount does a
> compatibility check and addition to sys.path, which I feel is not
> especially controversial and better than just adding to sys.path,
> which user code can now do, anyway. But nothing else happens, unless
> specific metadata is provided in the wheel to enable it. While it's not
> specifically a "distlib.mount" export, there is a facility to ask for
> extensions to be extracted, and in the absence of metadata asking for this,
> no extraction occurs.
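
The "compatibility check, then add to sys.path" behaviour described
above can be sketched with the standard library alone. This is not
distlib's implementation: wheel_tags and mount are hypothetical
helpers, the set of supported tags is supplied by the caller, and the
only real convention relied on is the PEP 427 wheel filename layout
(name-version[-build]-python-abi-platform.whl).

```python
import os
import sys


def wheel_tags(filename):
    """Expand the tag triples encoded in a wheel filename (PEP 427 naming)."""
    base = os.path.basename(filename)
    if not base.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = base[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: %r" % filename)
    pyvers, abis, plats = parts[-3:]
    # Compressed tag sets such as "py2.py3" expand to every combination.
    return {
        (py, abi, plat)
        for py in pyvers.split(".")
        for abi in abis.split(".")
        for plat in plats.split(".")
    }


def mount(wheel_path, supported, append=False):
    """Hypothetical mount: refuse incompatible wheels, otherwise put the
    wheel on sys.path. Nothing is extracted."""
    if not wheel_tags(wheel_path) & set(supported):
        raise ValueError("wheel is not compatible with this Python")
    if wheel_path not in sys.path:
        if append:
            sys.path.append(wheel_path)
        else:
            sys.path.insert(0, wheel_path)
```

Compared with a bare sys.path.append(), the only extra step here is
the tag comparison, which rejects wheels built for a different
interpreter, ABI or platform before they become importable.
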
>> goes further than the current EXTENSIONS approach - this proposal
>> would be akin to *requiring* an empty EXTENSIONS file, and/or the
>> setuptools zip_safe flag in order to allow mounting of even the pure
>> Python wheel. Such a conservative approach is also the antithesis of
>> the setuptools "attempt to guess": if the package publisher doesn't
>> explicitly opt in to zip support, then distlib.mount would assume that
>> it is *not* supported (but may provide an API for the caller to
>> override that, like "assume_zip_safe=True" or "force=True").
> I have no problem with adding wheel metadata to allow/disallow even
> adding to sys.path - it's effectively just like another step in the
> compatibility check. It would make most sense to place this in the
> WHEEL metadata, rather than pydist.json or similar, since it relates to
> the contents of a particular wheel rather than the distribution in
> general.
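
A minimal sketch of such an opt-in, assuming a made-up "Mountable" key
in the WHEEL file (no such key is defined anywhere; the name is purely
illustrative). The WHEEL file lives in the wheel's .dist-info
directory and uses "Key: value" lines, so the standard email parser
can read it. Following the conservative approach quoted above, a
missing key is treated as "not supported" unless the caller passes
force=True.

```python
import zipfile
from email.parser import Parser

# Hypothetical metadata key; not part of any wheel specification.
HYPOTHETICAL_KEY = "Mountable"


def read_wheel_metadata(wheel):
    """Parse the WHEEL file of a wheel given as a path or file object."""
    with zipfile.ZipFile(wheel) as zf:
        names = [
            n for n in zf.namelist()
            if n.endswith(".dist-info/WHEEL") and n.count("/") == 1
        ]
        if len(names) != 1:
            raise ValueError("expected exactly one WHEEL file")
        text = zf.read(names[0]).decode("utf-8")
    return Parser().parsestr(text)


def may_mount(wheel, force=False):
    """Return True only if the wheel opts in, or the caller overrides."""
    if force:
        return True
    value = read_wheel_metadata(wheel).get(HYPOTHETICAL_KEY, "false")
    return value.strip().lower() == "true"
```
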
>> However, like Paul, I have some concerns about a still experimental
>> API like that being in the metadata 2.0 reference implementation,
>> since that will likely end up having to deal with stdlib-like levels
>> of backwards compatibility requirements, and removing experimental
>> APIs that we later decided we weren't happy with could prove
>> problematic.
> But we're talking about the Python 3.5 time-frame here, and 3.4 isn't even
> out yet. ISTM there is plenty of time to get these sorts of issues ironed
> out. While I tend to favour backward compatibility wherever possible,
> distlib is nowhere near 1.0, and so distlib users (a small number, from
> what I can see) could expect some API breakage if there's no sensible
> alternative.
> Regards,
> Vinay Sajip

Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
