[Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning"

P.J. Eby pje at telecommunity.com
Thu Jul 14 01:14:18 CEST 2011

At 04:27 PM 7/13/2011 -0600, Eric Snow wrote:
>This is cool stuff.  And you have presented it really well.  I have
>some (probably too much) feedback inline.

Not at all too much; I've gone ahead and taken care of the typos you 
mentioned.  Other comments follow:

>Should there be a way to indicate that you do not want a directory to
>be considered for a package (an opt-out)?  Currently I can move the
>__init__.py out of the way and it gets ignored by import.

Renaming the directory is the quick solution.  If you have a tool 
that's looking for anything that's a package, then it'll need an 
exclusion option, or you'll have to rename the directory to something 
the tool will skip.  (Ideally, tools should skip directories that 
aren't valid Python identifiers.)
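
A minimal sketch of that kind of filter, assuming a tool that scans a single directory for package candidates. The function name ``candidate_package_dirs`` is hypothetical; ``str.isidentifier()`` and ``keyword.iskeyword()`` are standard library.

```python
import keyword
import os


def candidate_package_dirs(root):
    """Return subdirectory names of `root` that could be imported as packages.

    Hypothetical helper: a directory renamed to something like
    "pkg.disabled" or "my-data" is skipped because its name is not a
    valid Python identifier.
    """
    names = []
    for entry in sorted(os.listdir(root)):
        if not os.path.isdir(os.path.join(root, entry)):
            continue  # plain files are not package directories
        # Keywords such as "class" pass isidentifier() but cannot be imported.
        if entry.isidentifier() and not keyword.iskeyword(entry):
            names.append(entry)
    return names
```

With this rule, renaming ``pkg`` to ``pkg.disabled`` is enough to take it out of the tool's view.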

>I am looking at this PEP from the perspective that it may be useful,
>and not terribly difficult, to factor in meta importers.  So if that
>viewpoint is invalid a good chunk of my remaining comments may be
>irrelevant.  Also, I have been knee deep in importlib in the last few
>weeks, which will be painfully obvious in my feedback.  I apologize in
>advance.  <wink>

If you can provide a *use case* for explicitly making meta importers 
part of the process, then great.

However, even if they are, the hook would probably take the form of 
a *different* API for meta importers: one that's called with a parent 
path as well as a module name, and that returns a list of strings 
rather than an individual string.  The virtual path creation process 
would then walk the meta importers first, calling that method, until 
it got a non-empty list, or until it had to fall back to doing it 
itself (in the way described by the PEP).

In the importlib case, then, you could just implement that method 
(say, "build_virtual_path()") on the default meta importer.  (Which 
would also implement the virtual package fallback, or leave it to 
another meta-importer later on the path.)

Anyway, that, as far as I can tell, is the only sane way to make meta 
importers participate in the virtual path building process, and IMO 
it's an extension that isn't really needed at the moment, and would 
complicate the specification in the PEP.

That being said, if somebody wanted to implement the additional 
feature in importlib "off the books", it's not going to break 
anything.  ;-)  We can always update the PEP afterwards.

Seriously, though, I suppose we could add a note saying it could be 
done, and should be done if anybody has use cases, but we're not 
spelling it out at the moment.
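
For readers who want to picture the hook described above, here is a rough sketch under stated assumptions. Nothing in it is specified by the PEP: the method name ``build_virtual_path()`` is only the placeholder suggested earlier in this message, the exact signature is a guess, and ``default_virtual_path()`` is a simplified, directories-only stand-in for the PEP's own path-building rules.

```python
import os
import sys


def default_virtual_path(fullname, parent_path=None):
    """Simplified fallback: collect matching subdirectories of each entry."""
    if parent_path is None:
        parent_path = sys.path
    leaf = fullname.rpartition(".")[2]  # last component of the dotted name
    candidates = (os.path.join(entry, leaf) for entry in parent_path)
    return [path for path in candidates if os.path.isdir(path)]


def build_virtual_path(fullname, parent_path=None, meta_importers=None):
    """Ask each meta importer first; fall back to the default behavior."""
    if meta_importers is None:
        meta_importers = sys.meta_path
    for importer in meta_importers:
        # Hypothetical optional method; importers without it are skipped.
        hook = getattr(importer, "build_virtual_path", None)
        if hook is None:
            continue
        entries = hook(fullname, parent_path)
        if entries:  # the first non-empty list wins
            return list(entries)
    return default_virtual_path(fullname, parent_path)
```

The walk stops at the first meta importer that returns a non-empty list, so an importer that wants to stay out of the process simply returns an empty one.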

>sys.path is used here instead of as the default arg so that it gets
>evaluated each time?

Yes.  That's normal for ``imp`` APIs.
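
The difference in evaluation time can be shown with two toy functions. Both names are hypothetical and neither is part of ``imp``; the sketch only illustrates why the lookup happens inside the function body.

```python
import sys


def search_frozen(path=sys.path):
    # The default is bound once, when the function is defined, so it keeps
    # pointing at whatever list object sys.path named at that moment.
    return path


def search_live(path=None):
    # sys.path is looked up on every call, so rebinding it is noticed.
    if path is None:
        path = sys.path
    return path
```

In-place changes such as ``sys.path.append(...)`` are visible to both versions. Only rebinding (``sys.path = [...]``) leaves ``search_frozen`` looking at the old list.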

>Or in importlib...

Well, I don't really want to tie the PEP to importlib right now, and 
``imp`` is the established point for exposing the machinery Python is 
actually using.  But of course, I'm not the one doing the work.  ;-)

> > * A new ``get_subpath(importer, fullname)`` generic function, allowing
> >  implementations to be registered for existing importers.
>Not that it necessarily impacts this PEP, but I'm not sure what you
>mean by "registered for existing importers".  I am guessing that
>pkgutil is used to facilitate behaviors in packaging libraries, like
>setuptools, and that this registration is one of those behaviors.
>Then again I am a little dense sometimes <wink>.

I just killed that entire bullet.  The truth is, it really only 
mattered for 2.x, where it can't really help anyway.  So, I've 
dropped it from the spec.

>As I already noted, this is pretty specific to the default file import
>mechanism rather than the more general meta import process.  Maybe
>that's all that is needed?  My sense of extending virtual paths is
>pretty fuzzy.

Meta importers are for implementing alternative import strategies, 
rather than being one more step along the way in a standard 
import.  You could, for example, implement "pure virtual" lookup as a 
meta importer that sits *after* the one that does Python's normal 
sys.path/__path__ searching.  (And that might well be the way to do 
it in importlib.)
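
A rough sketch of that ordering, written against today's importlib API (``find_spec()``), which postdates this message. The class names and the registry of names are hypothetical, and the empty ``__path__`` is only a placeholder for the virtual path the PEP would compute.

```python
import importlib.util
import sys


class EmptyPackageLoader:
    """Loader that creates a package-like module with no code of its own."""

    def create_module(self, spec):
        return None  # let the import system create a default module object

    def exec_module(self, module):
        pass  # a pure virtual package has nothing to execute


class VirtualFallbackFinder:
    """Meta path finder meant to run only after the standard finders."""

    def __init__(self, names):
        self.names = set(names)  # hypothetical registry of virtual packages

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.names:
            return None
        # is_package=True gives the module an (empty) __path__.
        return importlib.util.spec_from_loader(
            fullname, EmptyPackageLoader(), is_package=True)


def install_fallback(names):
    finder = VirtualFallbackFinder(names)
    # Appending puts the finder *after* the normal sys.path machinery.
    sys.meta_path.append(finder)
    return finder
```

Because the finder is appended rather than inserted at the front, it is consulted only when the regular search finds nothing.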

> > * ``sys.virtual_packages`` is allowed to contain non-existent or
> >  not-yet-imported package names; code that uses its contents should
>If it were a dict the module name could point to None, rather than to
>the responsible meta importer.

Let's see if there are any use cases for meta importer participation 
before we go down that route.  Outside of importlib and my sketch of 
a 2.x implementation for PEP 382, just how many meta importers 
*exist* in the outside world, after nearly nine years of PEP 302 
being in existence?

>The "optional extensions" section of PEP 302 has a bit about a
>get_data() method for importers.  Using get_data() instead of __file__
>or __path__ seems like a safer operation, much as you recommended
>using pkgutil.walk_modules() above.
>In the case of importlib (yes, it's on my mind), get_data() is already
>implemented for the finders surrounding _DefaultPathFinder.  I am not
>familiar with the importers that are currently used on
>sys.path_importer_cache, but maybe they provide get_data() too?  (a
>cursory look makes me think so)

I didn't bother with explaining this much because the 
``pkg_resources`` module provided by setuptools takes care of 
interfacing with these things to give you a friendly API for 
retrieving strings, streams, or filenames for module-adjacent data files.
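
For comparison, a small sketch using the standard library instead of ``pkg_resources``: ``pkgutil.get_data()`` wraps the PEP 302 ``get_data()`` loader method. It is a different, narrower API than the setuptools one (which offers ``resource_string()``, ``resource_stream()`` and ``resource_filename()``). The helper name ``read_text_resource`` is hypothetical.

```python
import pkgutil


def read_text_resource(package, resource, encoding="utf-8"):
    """Read a data file that sits next to a package's modules.

    Works through the package's loader rather than through __file__, so
    it does not assume the package lives in a plain directory.
    """
    data = pkgutil.get_data(package, resource)
    if data is None:
        # get_data() returns None when the loader offers no get_data()
        # method or the package cannot be located.
        raise FileNotFoundError(
            "no loader support for %r in %r" % (resource, package))
    return data.decode(encoding)
```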

>Certainly that is a simpler approach, but it seems like each
>find_module() implementation would end up doing it pretty much the
>same way, following the pattern used by the sys.path handler.
>However, you are probably right that handling just the sys.path stuff
>is good enough.

Again, if somebody can point to a meta importer that's *not* part of 
importlib, we can take a look at that.  ;-)

>* sys.virtual_packages being a list vs. a dictionary

Er, it's a set, not a list.  I'll change that passage to show 
``set()`` as the built-in type, rather than just the word "set".

>And only one thing seems ambiguous when meta importers are left for
>later.  If a module is loaded through a meta importer, which importer
>handles a get_path() call?  When extend_virtual_paths is called, how
>are meta-imported modules addressed?

That's really up to the meta-importer.  You're really not supposed to 
use meta-importers to represent import *locations*; they're for 
extending or replacing import *policies*.  If you need locations, you 
make up a string to represent the location and put it in sys.path, 
after adding a path hook that recognizes the corresponding string.

That's why the whole idea of treating a meta importer as if it were a 
regular path entry importer is bogus: if you wanted to just implement 
another search location, you should just use a path entry importer; 
you don't need a meta-importer at all.
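
The made-up-string approach can be sketched as follows, using today's importlib API (``find_spec()``), which postdates this message. The ``memory://`` strings, the ``MEMORY_LOCATIONS`` table and all class and function names are hypothetical.

```python
import importlib.util
import sys

# Hypothetical "location": a made-up sys.path string mapped to the module
# sources that live there.
MEMORY_LOCATIONS = {
    "memory://demo": {"hello_mem": "GREETING = 'hi'\n"},
}


class MemoryLoader:
    """Executes module source text taken from MEMORY_LOCATIONS."""

    def __init__(self, source):
        self.source = source

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        exec(self.source, module.__dict__)


class MemoryFinder:
    """Path entry finder for one made-up location string."""

    def __init__(self, location):
        self.modules = MEMORY_LOCATIONS[location]

    def find_spec(self, fullname, target=None):
        source = self.modules.get(fullname)
        if source is None:
            return None
        return importlib.util.spec_from_loader(fullname, MemoryLoader(source))

    def invalidate_caches(self):
        pass  # nothing is cached in this sketch


def memory_path_hook(path_entry):
    # A path hook raises ImportError for entries it does not recognize.
    if path_entry not in MEMORY_LOCATIONS:
        raise ImportError("not a known memory:// location")
    return MemoryFinder(path_entry)


def install(location):
    if memory_path_hook not in sys.path_hooks:
        sys.path_hooks.append(memory_path_hook)
    sys.path_importer_cache.pop(location, None)  # drop any stale entry
    if location not in sys.path:
        sys.path.append(location)
```

After ``install("memory://demo")``, a plain ``import hello_mem`` is served by the ordinary ``sys.path`` search, with no meta importer involved.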

To put it another way, if you write a meta-importer, then you really do 
need to consider what way you'll do ``__path__`` building, and part 
of the point of doing so in a meta-importer would be so that you 
could *change* the way it was done.  So why would you want to be 
called as part of a protocol that you're probably going to replace, anyway?

>One last point:  This PEP results in two ways to provide a module for
>a package (<NAME>.py in addition to <NAME>/__init__.py).  However, you
>do offer a good distinction; __init__.py is for "self-contained"
>packages.  Is it clear when to use which?  Will __init__.py go away
>after a while?  Will we have to start looking in two places for a
>package's code?

I'll add something on that to the notes section:

* While virtual packages are easy to set up and use, there is still
   a time and place for using self-contained packages.  While it's not
   strictly necessary, adding an ``__init__`` module to your
   self-contained packages lets users of the package (and Python
   itself) know that *all* of the package's code will be found in
   that single subdirectory.  In addition, it lets you define
   ``__all__``, expose a public API, provide a package-level docstring,
   and do other things that make more sense for a self-contained
   project than for a mere "namespace" package.
