[Distutils] Using Wheel with zipimport

Paul Moore p.f.moore at gmail.com
Wed Jan 29 12:46:51 CET 2014


On 29 January 2014 10:36, Vinay Sajip <vinay_sajip at yahoo.co.uk> wrote:
> Paul Moore <p.f.moore <at> gmail.com> writes:
>
>> 1. It does *not* just use the fact that wheels are importable. It goes
>> beyond that and *extracts* C extensions to make them importable, too.
>> That is a workaround for a known and accepted limitation of zipimport,
>> and as a workaround, it has issues. If C extensions in zipfiles could
>> work reliably, this should go into zipimport itself, and *not* into
>> 3rd party code. Then everyone would benefit.
>
> (a) The mount method does not do the extraction of C extensions
>     *unconditionally* - only when there is suitable indication in the
>     metadata.

Suitable indication of what, exactly? It's not possible (at an OS
level) to link to a DLL in a zipfile, so if there's a DLL in there and
it's used, it needs to be extracted. I don't really see much point
hairsplitting over whether that counts as "unconditional" or not.

> (b) What are these issues to which you refer? As I said in my other
>     response, I don't see the zipped-egg problem as the same, because that
>     sys.path manipulation is under the hood/not under the developer's
>     control.

I did point out that I knew I was being non-specific. But I know
people have reported problems in the past with attempts (and there
have been many) to do this. Getting cleanup right is one area. Picking
a suitable location to extract to is another.

But conceded, this is anecdotal. Feel free to ignore it if that's what
you want to do.

> (c) Why can't third-party code break new ground, which may or may not
>     prove fertile? If considered beneficial, it could always be added
>     to zipimport at a later date. That's how a lot of functionality has
>     entered the stdlib, after all.

This is *not* new ground. Gordon McMillan's importer did it years ago,
back when zipimport was first implemented. Eggs did it in
pkg_resources. I think py2exe explicitly decided *not* to do it
because of known issues with Gordon McMillan's implementation (feel
free to call this anecdotal again if you want - but you may be able to
find the details with some research).

If this was really something that had never been tried before, I'd be
happy with it as an "experimental" API. But it's neither marked as
experimental, nor is it new. Your implementation *may* have found some
new approach - but from your descriptions, I'm not sure it has, and
you seem not to have been aware of or researched the previous
attempts, so I'm guessing you're not aware of the prior art here.

>> 2. It makes what should be a rare use case, to be used only when the
>> code in the wheel has been carefully checked to work from a zipfile,
>> seem like a common and straightforward operation. (The "attractive
>> nuisance" argument). I believe that people using this API will
>> typically *not* check the code, and will blame the wheel format, or
>> distlib, when their application does not work as expected.
>
> Doesn't the same argument apply to zipimport? Any code which is zipimported
> shouldn't use __file__ manipulation to access package resources, for
> example. Are you saying that you also feel wheels should never be
> importable, since one can't guarantee whether any code in general will
> work correctly from a zip? And yet, we have PEP 441, which is intended to
> encourage use of zipimport. Do we say, "no point - there's bound to be
> people out there who'll use __file__ instead of pkgutil"?

No, I'm saying that hiding the subtleties may be doing people a
disservice. That's why I described it as an "attractive nuisance". I'm
fully aware that this is a judgement call, and one person's
convenience API is another's disaster waiting to happen.

Again, I'me fine with your opinion not matching mine.

>> 3. It is no easier than sys.path.insert(0, wheelname). All it adds
>> over that is compatibility checking and the ability to import C
>> extensions (see above on why I think that's a bad thing). As for
>> compatibility checking, I'd prefer a distlib.wheel.check_compatibility
>> API that people could call *before* manually inserting the wheel onto
>> sys.path. That's a better separation of concerns, in my view.
>
> There's no problem with providing a compatibility check API, but the
> other POV is that people might forget to call that API, so it seems to
> make sense to call it from the mount method too. And I don't see why you
> say "all it adds" - surely the check is important to prevent a certain
> class of "weird things" happening.

OK, I misspoke. All that mount adds over a compatibility API is a
sys.path.insert call (*if* you agree with my point that the C
extension stuff shouldn't be used - and I know you don't).

Breaking mount down:

1. Compatibility check - I support this and would like a separate API
2. sys.path modification - trivial for anyone who knows enough to be
using this API
3. C extension support - we'll have to agree to disagree here

>> I can't give specific examples of "failure modes" because I don't use
>> the wheel mount functions, nor do I typically add wheels to sys.path.
>> When I did (in virtualenv) I hit a number of issues, but all of these
>> were ones I fixed in user code (either in virtualenv itself, or in
>> pip, which was the wheel I was importing). So you could reasonably
>
> When I looked into running pip from a zip, it was clear that the issues
> were related to pip's use of __file__ and such, and nothing to do
> with shortcomings in the wheel format or zipimport.

You're agreeing with me here - I just *said* you can reasonably
dismiss these as not related to the mount API.

>> dismiss these as "not related to the mount API" to which all I can say
>> is that if I'd been able to do wheel.mount(wheelname) I would likely
>> have put less thought into whether what I was doing was a good idea -
>> and *that's* what I think is the bad aspect of the API.
>
> Again, this feels like saying "zipimport is bad because people use
> __file__ instead of pkgutil". Having any API available doesn't absolve
> people from the responsibility of thinking about what they're doing and
> what the implications of using that API are.

You're missing my point - zipimport is not bad because people use
__file__ - pkgutil is *good* because it gives people a means of
getting round some of the limitations of zipimport.

But it's all layers. Convenience APIs are *abstractions*. They make
things *convenience* - and one aspect of that convenience is being
able to avoid bothering about some of the details. My concern is that
the mount API makes it too easy for people not to think about whether
what they are mounting is mountable. If the API documentation clearly
stated what was mountable then this would not be an issue (but the
docs would be huge and scary). But the API wouldn't look as
attractive, either.

>> Ultimately, I'll just never use the distlib mount functionality, and
>> I'll recommend not using it when (if) people ask. But I'd rather it
>> were not there to prompt the question.
>
> You don't want it to be there, even if it might be useful to others,
> just because it isn't useful to you? It's not as if distlib is forcing
> that functionality on anyone.

No, no, no. Maybe you asked for facts and I gave opinions. If so I
apologise. But what I'm saying is that I feel that the API might
encourage people to make mistakes they would not otherwise have
considered, and that's a shame. If the API is there, personally I'll
just never use it. So I don't *care* as such.

Let's just agree to disagree. It's a pretty small point in the overall picture.

Cheers,
Paul.


More information about the Distutils-SIG mailing list