From: Just van Rossum [mailto:just@letterror.com]
Neal Norwitz wrote:
On Tue, Dec 03, 2002 at 11:48:29AM -0000, Moore, Paul wrote:
My view is that import.c is, indeed, very complex and difficult code to follow. I agree with Just's comments that a cleanup in import.c would be useful.
I agree with both of these statements. I have reviewed the patch. While it's quite involved, I don't think it's worth rejecting I would really like to see a cleanup, though
I'm working hard on a zipimport module that uses sys.import_hooks. It's going to be SO much cleaner than the existing zip import patch.
That's great news, as I've said before. Can I ask that you consider a few issues, though. I don't have particular use cases in mind, but I believe that PEP 273 had a number of implicit assumptions, which your proposal does not incorporate. I'd like to make the differences explicit, just so that people get a chance to comment on whether they care. 1. Your import hooks seem to act at a level "above" sys.path, in the sense that the normal sys.path search becomes just another hook. With that in mind, I don't see a way for a user to add a zipfile in the *middle* of sys.path. 2. The syntax needed to add a zipfile search is different from the familiar one for sys.path additions. This isn't an issue as such, but may make the feature a little less accessible. 3. By using a non-sys.path approach, you make it impossible for users to set PYTHONPATH to include zipfiles. Can I again ask you to review Gordon McMillan's import manager. I really feel that you're reinventing the wheel here, and Gordon's code is well tested, as it's part of his Installer package. Maybe Gordon's code (which is pure Python) isn't appropriate to use, but I'm sure the design is worth looking at.
Rejecting the patch would be worthwhile if the interface would need to change after incorporating this patch. I don't believe this is the case. Whether this patch is accepted or not wouldn't change future patches that Just wants to incorporate.
I personally reject the patch because it modifies import.c beyond repair. Once this stuff is in, there's NO way we can go back without a huge effort, meaning it won't happen.
I think you're exaggerating. It makes things worse, certainly, but I'd strongly dispute your claim of "beyond repair". I simply don't think the change is that major. Either it's already beyond repair, or the patched version is still fixable. And nobody is proposing "going back". A refactoring is "going forward". Whether it's a refactoring of code that *already* supports zipfiles or a refactoring which then has to be changed to add zipfile support shouldn't matter.
It adds a whole directory listing caching scheme that is a) totally unneeded and b) has (IMO) _nothing_ to do with zip importing per se. That ExternalNames/External names stuff? I don't understand why it is needed just to enable zip imports. It's a lot of stuff!
I've explained this a couple of times now. Please read the PEP for details, but Jim claims that zipfile imports need this if they are not to be too slow. A directory lookup in a zipfile is basically a linear search through the list of all files in the zipfile. For a large zipfile (such as the Python library) this is going to be *slow*. Please benchmark your implementation on a zipped version of the full Python library, replacing the directory version, and compare with Jim's benchmarks. While speed isn't the be-all and end-all of this patch, I'd hate to lose a significant speed gain... And the cache has a nice side-effect in that it speeds up general imports, particularly on networked drives.
I think that the zipfile patch is close to being acceptable. I don't have a problem if it gets rejected, but can I make a plea - if it does, can it be "Rejected because patch XXX does the job better" rather than "Rejected because this can be implemented better using feature YYY".
Agreed.
And that's ecxactly what I intend to say in a couple of days if not sooner.
Great! Will your code address all of the above points, and the issues raised on python-dev in Oct/Nov 2001 (see below for references)?
[ ... ]
If Just incorporates his import patch, perhaps the zipfile code could be refactored to use the more generic framework. I don't view these two patches as mutually exclusive.
I *totally* do.
If all of these issues are addressed, the resulting code could be clearer, even if intermediate versions are not.
A rewrite now is MUCH easier than cleaning it up later. It's going to be much less code. Wait and see.
Here are some references that anyone looking at this stuff should read. There was a *lot* of discussion on this patch in Oct/Nov 2001. Start of the PEP 273 thread: http://mail.python.org/pipermail/python-dev/2001-October/018158.html Thread on caching of lookups http://mail.python.org/pipermail/python-dev/2001-November/018346.html Thomas Heller proposing Gordon McMillan's code for the standard library: http://mail.python.org/pipermail/python-dev/2001-October/017750.html As I've said, I'm not against rejecting the current patch - but I'd be concerned if all of the work that went into it (ie, Jim's design and discussions on python-dev, *not* the specific code implementing it) was just lost. Paul.