[Patches] [ python-Patches-652586 ] New import hooks + Import from Zip files
noreply@sourceforge.net
noreply@sourceforge.net
Fri, 13 Dec 2002 13:20:08 -0800
Patches item #652586, was opened at 2002-12-12 10:34
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=652586&group_id=5470
Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Nobody/Anonymous (nobody)
Summary: New import hooks + Import from Zip files
Initial Comment:
This patch implements two things:
- a new set of import hooks, modelled after iu.py
- builtin support for imports from Zip archives (a
competing implementation for PEP 273)
The new set of hooks probably need a better document
explaining them (perhaps a PEP). My motivations have
been posted to python-dev.
Here's a brief description.
Three new objects are added to the sys module:
- path_hooks
- path_importer_cache
- meta_path
sys.path_hooks is a list of callable objects that take
a string as their only argument. A hook will be called
with a sys.path or pkg.__path__ item. It should return
an "importer" object (see below), or raise ImportError
or return None if it can't deal with the path item. By
default, sys.path_hooks only contains the zipimporter
type, if the zipimport module is available.
sys.path_importer_cache is a dict that caches the
results of sys.path_hooks to avoid repeated hook lookups.
sys.meta_path is a list of importer objects that are
invoked *before* the builtin import mechanism kicks in.
This allows overriding of builtin module and frozen
module import, but the main feature is that it allows
importer objects *without* a corresponding sys.path
item (just like builtin and frozen modules).
Importer objects must conform to the following protocol:
i.find_module(fullname) -> None or an importer object
i.load_module(fullname) -> the imported module (or
raise ImportError)
The 'fullname' is always the fully qualified module
name, ie. a dotted name for a submodule.
This patch adds one more feature: a sys.path item may
*itself* be an importer object. This is convenient for
experimentation, but using it may break third-party
code that assumes sys.path contains only strings.
----------------------------------------------------------------------
Comment By: Paul Moore (pmoore)
Date: 2002-12-13 21:20
Message:
Logged In: YES
user_id=113328
I'd say it's pretty much guaranteed *not* to be a good idea. I
am fairly certain that at least some Windows zip utilities fail
to have a separate zip entry for the directory itself.
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum)
Date: 2002-12-13 20:02
Message:
Logged In: YES
user_id=6380
Just: it looks like your latest version *requires* that the
entries for package directories are present. Is that really
a good idea?
----------------------------------------------------------------------
Comment By: James C. Ahlstrom (ahlstromjc)
Date: 2002-12-13 19:55
Message:
Logged In: YES
user_id=64929
The PEP 273 addition of the standard zip archive name
/usr/local/lib/python23.zip (or Windows name) to sys.path to
support import of the Python standard lib (including
site.py) from a zip archive is missing. That was the point
of my patch to site.py, getpath.c, getpathp.c, main.c,
pythonrun.c, and sysmodule.c. I believe those patches can
be used with this import.c if you want the feature.
----------------------------------------------------------------------
Comment By: Just van Rossum (jvr)
Date: 2002-12-13 19:29
Message:
Logged In: YES
user_id=92689
I've attached a new version that fixes a problem that Thomas
Heller noticed: package imports failed when the zip archive
didn't contain a separate entry for the package directory.
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum)
Date: 2002-12-13 17:52
Message:
Logged In: YES
user_id=6380
Thanks -- I'm reviewing this now (if not too many
distracting email arrives :-).
----------------------------------------------------------------------
Comment By: Just van Rossum (jvr)
Date: 2002-12-12 23:19
Message:
Logged In: YES
user_id=92689
Yet another new version:
- zipimport is a builtin module now; includes patches for
PC\config.c and PBbuild\pythoncore.dsp, contributed and
tested by Paul Moore.
- tweaked doc strings in zipimport.
I'll quit for today...
----------------------------------------------------------------------
Comment By: Just van Rossum (jvr)
Date: 2002-12-12 21:59
Message:
Logged In: YES
user_id=92689
I've uploaded a slightly updated version: the potential zlib
recursion has been fixed (I was in fact able to trigger it);
I've added a test for it, although it's tricky to do. Also
cleaned up a few exception messages in zipimport.
----------------------------------------------------------------------
Comment By: Just van Rossum (jvr)
Date: 2002-12-12 19:34
Message:
Logged In: YES
user_id=92689
I've uploaded a fresh version, with GC support added. Tested
to the extent that it doesn't crash ;-)
----------------------------------------------------------------------
Comment By: Just van Rossum (jvr)
Date: 2002-12-12 16:34
Message:
Logged In: YES
user_id=92689
I didn't know, thanks! I actually did find -N but it "didn't
work" ;-)
I've attached a new patch (slightly updated as well) as one
file. Also included is Lib/test/output/test_zipimport.
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-12 14:28
Message:
Logged In: YES
user_id=33168
Just, in case you didn't know, you can do a cvs diff -N to
include new/removed files in a patch. I think you need to
do a cvs add/remove before -N works though.
----------------------------------------------------------------------
Comment By: Just van Rossum (jvr)
Date: 2002-12-12 14:05
Message:
Logged In: YES
user_id=92689
I've attached test_zipimport.py, a simple test script, to be
placed in Lib/test/.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2002-12-12 13:16
Message:
Logged In: YES
user_id=21627
recursive imports: having foo.zip as the first thing in sys.path,
and having a compressed zlib.py in there was indeed the
case I was thinking of (without actually trying).
GC: It is absolutely necessary. If this is not done now,
somebody will spend days of research five years from now,
because somebody else thought that invoking .update on this
files attribute was a good idea. This could be reduced to C-
level documentation if the dictionary was not exposed to
Python.
builtin: I think it ought to be builtin. It's a small module, it
does not rely on additional libraries, it is not maintained
externally, and it reduces bootstrap dependencies to have it
builtin. OTOH, zlib can't be builtin, as it relies on an
additional library which may not always be present.
get_long: you are right; the 0xff does make the values
positive, again. I somehow thought the result might still be
negative.
----------------------------------------------------------------------
Comment By: Just van Rossum (jvr)
Date: 2002-12-12 12:39
Message:
Logged In: YES
user_id=92689
Hm, regarding gc again: zipimporter objects can only
*theoretically* be involved in cycles, and only if people
muck with the "files" attribute. As it is, the "files" dict
only contains strings (keys) and tuples (values) which
contain strings and ints. So is it really neccesary?
----------------------------------------------------------------------
Comment By: Just van Rossum (jvr)
Date: 2002-12-12 12:35
Message:
Logged In: YES
user_id=92689
Martin:
- Have you actually seen a recursive import of zlib? I don't
see how it's possible to cause a stackoverflow. Hm, maybe if
someone stuffs a zlib.py in a zip archive? I'll add a guard.
- Are you really saying it *should* be a builtin module, or
was that comment supposed to start with "if"? See also
Paul's remark about zlib: if zlib remains a shared lib,
having zipimport as a builtin only helps for non-compressed
archives.
- gc: I'll give this a go (never done it before!)
- get_long() -> it's called with a signed char *, and the
buf[i] & 0xff should take care of the rest, so I'm not sure
I understand. But trust your judgement and changed it in my
working copy.
Thank you very much for the quick reply!
----------------------------------------------------------------------
Comment By: Paul Moore (pmoore)
Date: 2002-12-12 11:23
Message:
Logged In: YES
user_id=113328
I can provide details on how to patch the Windows build
process - it's not hard. The question of whether to have
zipimport as builtin or dynamic should be resolved first (the
processes are different in the 2 cases).
It should be pointed out that if zipimport is builtin, it still relies
on a dynamic module (zlib) for importing from compressed
zips. I don't know whether this affects the decision, though...
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2002-12-12 11:15
Message:
Logged In: YES
user_id=21627
Some comments:
- zipimport should normally be built as a builtin module, there
should also be a patch for the Windows build procedure
- zipimporters need to support garbage collection, as they
can occur in cycles
- there should be a mechanism to prevent a stack overflow in
case of recursive import of zlib.
- get_long needs to accept an unsigned char*
----------------------------------------------------------------------
Comment By: Just van Rossum (jvr)
Date: 2002-12-12 10:37
Message:
Logged In: YES
user_id=92689
btw. the attached file contains a patch for various files as
well as a new file: zipimporter.c. Place the latter in the
Modules/ directory.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=652586&group_id=5470