[Python-Dev] Writing importers and path hooks

Paul Moore p.f.moore at gmail.com
Thu Mar 28 17:33:23 CET 2013


On 28 March 2013 16:08, Brett Cannon <brett at python.org> wrote:
> You only need SourceLoader since you are dealing with Python source. You
> don't need FileLoader since you are not reading from disk but an in-memory
> zipfile.
>
> You should be implementing get_data, get_filename, and path_stats for
> SourceLoader.

OK, cool. That helps a lot.
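
As a rough illustration of that advice, here is a minimal sketch of a
SourceLoader subclass that serves source from memory. The SOURCES dict,
the InMemoryLoader name and the fixed mtime of 0 are hypothetical and
only for illustration. The spec_from_loader/module_from_spec calls come
from later Python releases (3.4/3.5+), not the 3.3 API discussed in this
thread.

```python
import importlib.abc
import importlib.util
import os

# Hypothetical in-memory "archive": virtual filename -> source bytes.
SOURCES = {
    os.path.join("foo", "hello.py"): b"GREETING = 'hi'\n",
}


class InMemoryLoader(importlib.abc.SourceLoader):
    def __init__(self, fullname, filename):
        self.fullname = fullname
        self.filename = filename

    def get_filename(self, fullname):
        # The token that get_data() will later be asked for.
        return self.filename

    def get_data(self, path):
        # SourceLoader also probes for a bytecode file, so unknown
        # paths must raise OSError rather than KeyError.
        try:
            return SOURCES[path]
        except KeyError:
            raise OSError(f"no such virtual file: {path!r}") from None

    def path_stats(self, path):
        if path not in SOURCES:
            raise OSError(f"no such virtual file: {path!r}")
        # Assumption: nothing tracks modification times, so report 0.
        return {"mtime": 0, "size": len(SOURCES[path])}


loader = InMemoryLoader("hello", os.path.join("foo", "hello.py"))
spec = importlib.util.spec_from_loader("hello", loader)
module = importlib.util.module_from_spec(spec)
loader.exec_module(module)
print(module.GREETING)
```

Bytecode caching stays inert here because set_data is left at its
default no-op.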

The biggest gap here is that I don't think there is a good explanation
anywhere of the required semantics of get_filename - particularly
where we're not actually dealing with real filenames. My initial stab
at this would be:

A module name is a dot-separated list of parts.
A filename is an arbitrary token that can be used with get_data to get
the module content. However, the following rules should be followed:
- Filenames should be made up of parts separated by the OS path
  separator.
- For packages, the final part of the filename *must* be __init__.py
  if the standard package detection is being used.
- The initial part of the filename needs to match your path entry if
  submodule lookups are going to work sanely.

In practice, you need to implement filenames as if your finder is
managing a virtual filesystem mounted at your sys.path entry, with
module->filename semantics being the usual subdirectory layout. And
packages have a basename of __init__.py.
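
The mapping described above can be sketched as a small helper. The
function name virtual_filename and its arguments are hypothetical; it
only encodes the three rules listed earlier (OS separator, __init__.py
for packages, path entry as the prefix).

```python
import os


def virtual_filename(path_entry, fullname, is_package):
    """Map a dotted module name to a filename 'mounted' at path_entry."""
    parts = fullname.split(".")
    if is_package:
        # Packages get the conventional __init__.py basename.
        parts.append("__init__.py")
    else:
        parts[-1] += ".py"
    # Join with the OS path separator, rooted at the sys.path entry.
    return os.path.join(path_entry, *parts)


print(virtual_filename("foo", "pkg", True))
print(virtual_filename("foo", "pkg.sub.mod", False))
```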

I'd like to know how to implement packages without the artificial
__init__.py (something like a sqlite database can attach content and
an "is_package" flag to the same entry). But that's advanced usage,
and I can probably hack around until I work out how to do that now.
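
One possible approach, sketched under assumptions rather than verified
against 3.3: override is_package on the loader so that it consults a
stored flag instead of looking for __init__.py. The ROWS dict stands in
for a database table, and the "db:" filename tokens and RowLoader name
are hypothetical. ModuleSpec and module_from_spec are from later Python
releases (3.4/3.5+).

```python
import importlib.abc
import importlib.machinery
import importlib.util

# Hypothetical rows: module name -> (source bytes, is_package flag).
ROWS = {
    "shop": (b"KIND = 'package'\n", True),
    "shop.cart": (b"KIND = 'module'\n", False),
}
PREFIX = "db:"


class RowLoader(importlib.abc.SourceLoader):
    def get_filename(self, fullname):
        # An opaque token; no artificial __init__.py component.
        return PREFIX + fullname

    def get_data(self, path):
        try:
            return ROWS[path[len(PREFIX):]][0]
        except KeyError:
            raise OSError(f"no such row: {path!r}") from None

    def is_package(self, fullname):
        # Replace the default "basename is __init__" check with the flag.
        try:
            return ROWS[fullname][1]
        except KeyError:
            raise ImportError(f"unknown module: {fullname!r}") from None


def load(fullname):
    loader = RowLoader()
    # Build the spec by hand so the package decision comes from the flag.
    spec = importlib.machinery.ModuleSpec(
        fullname,
        loader,
        origin=loader.get_filename(fullname),
        is_package=loader.is_package(fullname),
    )
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module


pkg = load("shop")
mod = load("shop.cart")
print(pkg.KIND, hasattr(pkg, "__path__"))
print(mod.KIND, hasattr(mod, "__path__"))
```

The package's __path__ is left empty in this sketch; a real finder would
have to put something there that its path hook recognises for submodule
lookups.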

>> The documentation on what I
>> need to return from there is very sparse... In the end I worked out
>> that for a package, I need to return (MyLoader(modulename,
>> 'foo/__init__.py'), ['foo']) (here, "foo" is my dummy marker for my
>> example).
>
> The second argument should just be None: "An empty list can be used for
> portion to signify the loader is not part of a [namespace] package".
> Unfortunately a key word is missing in that sentence.
> http://bugs.python.org/issue17567

Ha. Yes, that makes a lot of difference :-) Did you mean None or [], by the way?

>> In essence, PathEntryFinder really has to implement some
>> form of virtual filesystem mount point, and preserve the standard
>> filesystem semantics of modules having a filename of .../__init__.py.
>
> Well, if your zip file decided to create itself with a different file
> extension then it wouldn't be required, but then other people's code might
> break if they don't respect module abstractions (i.e. looking at
> __package__/__name__ or __path__ to see if something is a package).

I'm not quite sure what you mean by this, but I take your point about
making sure to break people's expectations as little as possible...

>> So I managed to work out what was needed in the end, but it was a lot
>> harder than I'd expected. On reflection, getting the finder semantics
>> right (and in particular the path entry finder semantics) was the hard
>> bit.
>
> Yep, that bit has had the fewest API tweaks as most people don't muck with
> finders but with loaders.

Hmm. I'm not sure how you can ever write a loader without needing to
write an associated finder. The existing finders wouldn't return your
loader, surely?
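
To make the pairing concrete, here is a minimal sketch of a path hook
and path entry finder whose only job is to hand back a custom loader.
It uses find_spec from later Python releases (3.4+) in place of the
find_loader method discussed in this thread. The names SOURCES,
DictLoader, DictFinder, dict_path_hook and demo_pkg are hypothetical,
and "foo" is the dummy path entry from the example above.

```python
import importlib
import importlib.abc
import importlib.util
import os
import sys

MARKER = "foo"  # dummy sys.path entry

# Hypothetical in-memory store: virtual filename -> source bytes.
SOURCES = {
    os.path.join(MARKER, "demo_pkg", "__init__.py"): b"NAME = 'demo_pkg'\n",
    os.path.join(MARKER, "demo_pkg", "mod.py"): b"VALUE = 42\n",
}


class DictLoader(importlib.abc.SourceLoader):
    def __init__(self, filename):
        self.filename = filename

    def get_filename(self, fullname):
        return self.filename

    def get_data(self, path):
        try:
            return SOURCES[path]
        except KeyError:
            raise OSError(f"no such virtual file: {path!r}") from None


class DictFinder(importlib.abc.PathEntryFinder):
    def __init__(self, path_entry):
        self.path_entry = path_entry

    def find_spec(self, fullname, target=None):
        # Only the last name component is looked up under this entry.
        tail = fullname.rpartition(".")[2]
        candidates = (
            os.path.join(self.path_entry, tail, "__init__.py"),
            os.path.join(self.path_entry, tail + ".py"),
        )
        for filename in candidates:
            if filename in SOURCES:
                # Package detection falls back to the __init__.py basename.
                return importlib.util.spec_from_loader(
                    fullname, DictLoader(filename)
                )
        return None


def dict_path_hook(path_entry):
    # Claim the marker itself and anything "below" it (package __path__).
    if path_entry == MARKER or path_entry.startswith(MARKER + os.sep):
        return DictFinder(path_entry)
    raise ImportError(f"not handled by dict_path_hook: {path_entry!r}")


sys.path_hooks.insert(0, dict_path_hook)
sys.path_importer_cache.pop(MARKER, None)
sys.path.append(MARKER)

mod = importlib.import_module("demo_pkg.mod")
print(mod.VALUE, type(mod.__spec__.loader).__name__)
```

The hook is inserted at the front of sys.path_hooks so that a real
directory named "foo" in the current directory cannot be claimed by the
default file finder first.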

>> I'm now 100% sure that some cookbook examples would help a lot. I'll
>> see what I can do.
>
> I plan on writing a pure Python zip importer for Python 3.4 which should be
> fairly minimal and work out as a good example chunk of code.  And no one
> need bother writing it as I'm going to do it myself regardless to make sure
> I plug any missing holes in the API. If you really want something to try for
> fun go for a sqlite3-backed setup (don't see it going in the stdlib but it
> would be a project to have).

I'm pretty sure I'll write a zip importer first - it feels like one of
those essential but largely useless exercises that people have to
start with - a bit like scales on the piano :-) But I'd be interested
in trying a sqlite importer as well. I might well see how I go with
that.

Thanks for the help with this.
Paul

