Changing the subject to clearly focus the discussion.

On 30 January 2014 11:57, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
> If you have other reasons for your -1, I'd like to hear them.

OK. Note that this is not, in my view, an issue with wheels, but rather about zipfiles on sys.path, and (deliberate) design limitations of the module loader and zipimport implementations.[1]

First of all, it is not possible to load a DLL into a process' memory [2, 3] unless it is stored as a file in the filesystem. So any attempt to import a C extension from a zipfile must, by necessity, involve extracting that DLL to the filesystem. That's where I see the problems. None are deal-breaking issues, but they consist of a number of niggling issues that cumulatively chip away at the reliability of the concept until the end result has enough corner cases and risks to make it unacceptable (depending on your tolerance for risks - there's a definite judgement call involved).

The issues I can see are: [4]

1. You need to choose a location to put the extracted file. On Windows in particular, there is no guaranteed-available filesystem location that can be used without risk. Some accounts have no home directory, some (locked down) users have no permissions anywhere but very specific places, even TEMP may not be usable if there's an aggressive housekeeping routine in place - but TEMP is probably the best choice of a bad lot.

2. There are race conditions to consider. If the extraction is not completely isolated per-process, what if 2 processes want to use different versions of the same DLL? How will these be distinguished? [5] So to avoid corner cases you have to assume only the one process uses a given extracted DLL.

3. Clean-up is an issue. How will the extracted files be removed? You can't unload the DLLs from Python, and you can't delete open files in Windows. So do you simply leave the files lying round? Or do you do some sort of atexit dance to run a separate process after the Python process terminates which will do the cleanup? What happens to that process when virus checkers hold the file open?
Leaving the files around is probably the most robust answer, but it's not exactly friendly.

As I've said elsewhere, these are fundamental issues with importing DLLs from zipfiles, and have no direct relationship to wheels. The only place where having a wheel rather than a general zipfile makes a difference is that a wheel *might* at some point contain metadata that allows the wheel to claim that it's "OK" to load its contents from a zipfile. But my points above are not something that the author of the C extension can address, so there's no way that I can see that an extension author can justifiably set that flag.

So: as wheels don't give any additional reliability over any other zipfile, I don't see this (loading C extensions) as a wheel-related feature. Ideally, if these problems can be solved, the solution should be included in the core zipimport module so that all users can benefit. If there are still issues to iron out and experience to be gained, a 3rd party "enhanced zip importer" module would be a reasonable test-bed for the solution. A 3rd party solution could also be appropriate if the caveats and/or limitations were generally acceptable, but sufficient to prohibit stdlib inclusion.

The wheel mount API could, if you wanted, look for the existence of that enhanced zipimport module and use it when appropriate, but baking the feature into wheel mount just limits your user base (and hence your audience for raising bug reports, etc) needlessly.

I hope this explains my reasoning in sufficient detail.

FINAL DISCLAIMER: I have no objection to this feature being provided per se, any more than I object to the existence of (say) Zope. Just because I'm not a member of the target audience doesn't mean that it's not a feature that some might benefit from.
All I'm trying to do here is offer my input as someone who was involved in the initial implementation of zipimport, and who has kept an interested eye on how it has been used in the 11 years since its introduction - and in particular how people have tried to overcome the limitations we felt we had to impose when designing it. Ultimately, I would be overjoyed if someone could find a solution to this issue (in much the same way as I'm delighted by what Brett has done with importlib).

Paul

Footnotes:

[1] Historical footnote - I was directly involved with the design of PEP 302 and the zipimport implementation, and we made a deliberate choice to only look at pure Python files, because the platform issues around C extensions were "too hard".

[2] I'm talking from a Windows perspective here. I do not have sufficient low-level knowledge of Unix to comment on that case. I suspect that the issues are similar but I defer to the platform experts.

[3] There is, I believe, code "out there" on the internet to map a DLL image into a process based purely in memory, but I think it's a fairly gross hack. I have a suspicion that someone - possibly Thomas Heller - experimented with it at one time, but never came up with a viable implementation. There's also TCL's tclkit technology, which *may* support binary extensions, and may be worth a look, but TCL has virtual filesystem support built in quite deep in the core, so how it works may not be applicable to Python.

[4] I'm suggesting answers to the questions I'm raising here. The answers *may* be wrong - I've never tried to design a robust solution to this issue - but I believe the questions are the important point here. Please don't focus on why my suggested approach is wrong - I know it is!

[5] To be fair, this is where the wheel metadata might help in distinguishing.
But consider development and testing, where repeated test runs would not typically have different versions, but the user might well want to test whether running from zip still works. So wheel metadata helps, but isn't a complete solution. And compile data is probably just as good, so let's keep assuming we are looking at a general zipimport facility.
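
For illustration only, here is a rough Python sketch of what issues 1-3 in the main text imply for an extraction step. It is not part of zipimport or of any wheel API. The function name extract_member_for_process and the "zipext-" directory prefix are made up for this example, and it assumes a reasonably recent Python 3. It extracts one archive member into a directory private to the calling process (issue 2), defaults to TEMP while letting the caller override the location (issue 1), and registers only a best-effort clean-up (issue 3).

```python
import atexit
import os
import shutil
import tempfile
import zipfile


def extract_member_for_process(zip_path, member, base_dir=None):
    """Extract one member of a zipfile into a directory private to this process.

    Hypothetical helper for illustration; returns the path of the extracted file.
    """
    # Issue 1: with base_dir=None, mkdtemp() falls back to the platform's
    # temporary directory (TEMP on Windows), which may not always be usable.
    # Issue 2: mkdtemp() creates a fresh, uniquely named directory on every
    # call, so two processes never share an extracted copy.
    target_dir = tempfile.mkdtemp(prefix="zipext-%d-" % os.getpid(), dir=base_dir)

    # Issue 3: best-effort clean-up only. ignore_errors=True because removal
    # can fail, e.g. on Windows while the DLL is still loaded or held open
    # by another program; in that case the files are simply left behind.
    atexit.register(shutil.rmtree, target_dir, ignore_errors=True)

    with zipfile.ZipFile(zip_path) as archive:
        # ZipFile.extract() returns the path of the file it created.
        return archive.extract(member, path=target_dir)
```

The returned path would then be handed to the ordinary file-based import machinery. Note that the sketch makes no attempt to solve the clean-up problem: if the atexit removal fails, the extracted files are left lying around.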