[Python-Dev] New Import Hooks PEP, a first draft (and req. for PEP #)

Just van Rossum just@letterror.com
Sat, 21 Dec 2002 15:38:39 +0100


Paul Moore wrote:

> > Note that for a zip file, __file__ is something like
> >       C:/Python/lib/zarchive.zip/subdir/mymod.pyc
> > and nothing is going to make code using __file__ Just Work.
> 
> That's not true. Under your patch, sure that's what __file__
> is. Offhand, I'm not sure what Just's zipimporter code puts in there,

It actually does the same thing as Jim's patch regarding __file__. But
he's right in both cases: using __file__ will not "Just Work" ;-)

> and I certainly wouldn't guarantee it for an arbitrary implementation
> of zip imports. Frozen and builtin modules don't have usable __file__
> attributes, and for something like a hook loading modules from a
> database, there is no meaningful __file__ value.

Indeed.

[ ... ]
> That's what the get_data(name) method Just is proposing is supposed to
> address. The only difficulty with it is pinning down what the "name"
> argument means. At the lowest level, it's an arbitrary cookie which
> identifies a chunk of data. The trick is to avoid making that cookie
> *too* unrelated to how things work in the filesystem...

I just wrote this to Paul in private mail:

    The 'name' argument of i.get_data(name) should be seen as a
    'cookie', meaning the importer protocol doesn't prescribe any
    semantics for it. However, for importer objects that have some
    file system-like properties (for example zipimporter) it is
    recommended to use os.sep as a separator character to specify a
    (possibly virtual) directory hierarchy. For example, if the
    importer allows access to a module's source code via
    i.get_data(name), the 'name' argument should be constructed like
    this:

        name = mod.__name__.replace(".", os.sep) + ".py"

    Note that this is not the recommended way to retrieve source code;
    the (optional) method i.get_source(fullname) is more general, as
    it doesn't imply *any* file-system-like characteristics.
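
To make the recommendation quoted above concrete, here is a minimal sketch. DictImporter and source_cookie are hypothetical names invented for illustration; only the get_data(name) method itself comes from the proposal, and a real importer would fetch the data from its own backing store rather than a dict.

```python
import os
import types


class DictImporter:
    """Hypothetical importer that keeps its data in a dict keyed by cookie."""

    def __init__(self, files):
        self._files = dict(files)

    def get_data(self, name):
        # 'name' is an opaque cookie; this toy importer uses it as a dict key.
        try:
            return self._files[name]
        except KeyError:
            raise IOError("no data for %r" % (name,))


def source_cookie(mod):
    # Build the cookie as in the quoted mail: dots become os.sep, plus ".py".
    return mod.__name__.replace(".", os.sep) + ".py"


# A stand-in module object; only its __name__ matters here.
mod = types.ModuleType("pkg.sub.mymod")
i = DictImporter({source_cookie(mod): b"x = 1\n"})
data = i.get_data(source_cookie(mod))
```

The importer never interprets the cookie as a real path; the separator is only a naming convention shared between the caller and the importer.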

But in the light of Jack's remark regarding MacOS<X pathnames it might
be better to stick with '/' instead of os.sep. This is not a real file
system path, so it seems odd to enforce platform-specific path
semantics. From Jack's post:

> > Much better. I think I'd prefer the first, mostly because
> > os.path.join() might do more magic than needed.
> 
> But that magic would actually be needed for MacOS9 pathnames.
> os.path.join(*['foo', 'bar']) will correctly return ':foo:bar',
> whereas os.sep.join will return 'foo:bar', which is wrong.

"needed" and "wrong" are highly questionable in the context of
importer.get_data()...
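
A rough sketch of the difference, assuming the cookie is always built with '/'. data_cookie is a hypothetical helper, not part of the proposal, and posixpath/ntpath merely stand in for "whatever os.path happens to be on the host platform":

```python
import ntpath
import posixpath


def data_cookie(fullname, ext=".py"):
    # Always join with '/', whatever the host platform's separator is.
    return "/".join(fullname.split(".")) + ext


parts = ["foo", "bar"]
posix_joined = posixpath.join(*parts)  # join rules of a POSIX platform
nt_joined = ntpath.join(*parts)        # join rules of Windows
cookie = data_cookie("foo.bar")        # identical on every platform
```

The platform-specific joins disagree with each other, while the '/'-based cookie does not depend on where the code runs, which is all an importer needs from it.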

Just