[Python-Dev] PEP 402: Simplified Package Layout and Partitioning

P.J. Eby pje at telecommunity.com
Fri Aug 12 17:24:57 CEST 2011


At 02:02 PM 8/11/2011 -0400, Glyph Lefkowitz wrote:
>Rather than a one-by-one ad-hoc consideration of which attribute 
>should be set to None or empty strings or "<string>" or what have 
>you, I'd really like to see a discussion in the PEP saying what a 
>package really is vs. what a module is, and what one can reasonably 
>expect from it from an API and tooling perspective.

The assumption I've been working from is the only guarantee I've ever 
seen the Python docs give: i.e., that a package is a module object 
with a __path__ attribute.  Modules aren't even required to have a 
__file__ attribute -- builtin modules don't, for example.  (And the 
contents of __file__ are not required to have any particular 
semantics: PEP 302 notes that it can be a dummy value like 
"<frozen>", for example.)
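
As a minimal sketch of that guarantee (the helper name is_package is hypothetical, and the stdlib 'email' package is used only as a convenient example), package-ness can be tested by looking for __path__ alone, without consulting __file__:

```python
import sys
import types
import email  # a stdlib package, chosen only for illustration


def is_package(module: types.ModuleType) -> bool:
    """Treat a module as a package only if it has a __path__ attribute."""
    return hasattr(module, "__path__")


print(is_package(email))         # email is a package, so it has __path__
print(is_package(sys))           # sys is a builtin module with no __path__
print(hasattr(sys, "__file__"))  # builtin modules carry no __file__ at all
```

Nothing here inspects the contents of __file__, so a dummy value such as "<frozen>" would not affect the result.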

Technically, btw, PEP 302 requires __file__ to be a string, so making 
__file__ = None will be a backwards-incompatible change.  But any 
code that walks modules in sys.modules is going to break today if it 
expects a __file__ attribute to exist, because 'sys' itself doesn't have one!
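
A short sketch of what tolerant code looks like today, assuming only that sys.modules may hold modules without __file__ (the function name module_files is hypothetical):

```python
import sys


def module_files():
    """Map each loaded module name to its __file__, or None if it has none."""
    result = {}
    # Copy the items first: sys.modules can change while we iterate.
    for name, module in list(sys.modules.items()):
        # getattr with a default avoids the AttributeError that a bare
        # module.__file__ would raise for modules such as 'sys'.
        result[name] = getattr(module, "__file__", None)
    return result


files = module_files()
missing = sorted(name for name, path in files.items() if path is None)
print("modules without a usable __file__:", missing)
```

Code written this way treats a missing __file__ and a __file__ of None the same, which is one reason the choice between them matters mostly for code that does not already guard the lookup.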

So, my leaning is towards leaving off __file__, since today's code 
already has to deal with it being nonexistent, if it's working with 
arbitrary modules, and that'll produce breakage sooner rather than 
later -- the twisted.python.modules code, for example, would fail 
with a loud AttributeError, rather than going on to silently assume 
that a module with a dummy __file__ isn't a package.   (Which is NOT 
a valid assumption *now*, btw, as I'll explain below.)

Anyway, if you have any suggestions for verbiage that should be added 
to the PEP to clarify these assumptions, I'd be happy to add 
them.  However, I think that the real problem you're encountering at 
the moment has more to do with making assumptions about the Python 
import ecosystem that aren't valid today, and haven't been valid 
since at least the introduction of PEP 302, if not earlier import 
hook systems as well.


>  But the whole "pure virtual" mechanism here seems to pile even 
> more inconsistency on top of an already irritatingly inconsistent 
> import mechanism.  I was reasonably happy with my attempt to paper 
> over PEP 302's weirdnesses from a user perspective:
>
>http://twistedmatrix.com/documents/11.0.0/api/twisted.python.modules.html
>
>(or https://launchpad.net/modules if 
>you are not a Twisted user)
>
>Users of this API can traverse the module hierarchy with certain 
>expectations; each module or package would have .pathEntry and 
>.filePath attributes, each of which would refer to the appropriate 
>place.  Of course __path__ complicates things a bit, but so it goes.

I don't mean to be critical, and no doubt what you've written works 
fine for your current requirements, but on my quick attempt to skim 
through the code I found many things which appear to me to be 
incompatible with PEP 302.

That is, the above code hardcodes a variety of assumptions about the 
import system that haven't been true since Python 2.3.  (For example, 
it assumes that the contents of sys.path strings have inspectable 
semantics, that the contents of __file__ can tell you things about 
the module-ness or package-ness of a module object, etc.)

If you want to fully support PEP 302, you might want to consider 
making this a wrapper over the corresponding pkgutil APIs (available 
since Python 2.5) that do roughly the same things, but which delegate 
all path string inspection to importer objects and allow extensible 
delegation for importers that don't support the optional methods involved.

(Of course, if the pkgutil APIs are missing something you need, 
perhaps you could propose additions.)
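
For illustration only, here is a rough sketch of listing a package's direct submodules through pkgutil.iter_modules, which asks the importer for each __path__ entry rather than inspecting path strings itself.  The helper name submodule_names is hypothetical, and the stdlib 'email' package stands in for any package:

```python
import pkgutil
import email  # any package would do; used here only as an example


def submodule_names(package):
    """Return sorted (name, is_package) pairs for a package's direct children."""
    prefix = package.__name__ + "."
    return sorted(
        (name, is_pkg)
        # iter_modules yields (finder, name, ispkg) for every __path__ entry,
        # delegating the actual lookup to the importer for that entry.
        for _finder, name, is_pkg in pkgutil.iter_modules(package.__path__, prefix)
    )


for name, is_pkg in submodule_names(email):
    print(name, "(package)" if is_pkg else "(module)")
```

Because the listing is driven by __path__ and the importers, it makes no assumption that the package lives in a single directory or has a meaningful __file__.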


>Now it seems like pure virtual packages are going to introduce a new 
>type of special case into the hierarchy which have neither 
>.pathEntry nor .filePath objects.

The problem is that your API's notion that these things exist as 
coherent concepts was never really a valid assumption in the first 
place.  .pth files and namespace packages already meant that the idea 
of a package coming from a single path entry made no sense.  And 
namespace packages installed by setuptools' system packaging mode 
*don't have a __file__ attribute* today...  heck they don't have 
__init__ modules, either.
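
The multiple-path-entry situation can be reproduced with a small experiment.  This sketch assumes a Python 3.3 or later interpreter, where directories without __init__ modules are importable as namespace packages (a later mechanism, not the setuptools one described above); the package name ns_demo_pkg and the directory names are made up for the demonstration:

```python
import importlib
import os
import sys
import tempfile

PKG = "ns_demo_pkg"  # hypothetical package name, used only for this demo


def build_split_package(root):
    """Create two sys.path entries that each hold one portion of PKG."""
    entries = []
    for entry, mod in (("entry_a", "one"), ("entry_b", "two")):
        pkg_dir = os.path.join(root, entry, PKG)
        os.makedirs(pkg_dir)  # note: no __init__.py is written
        with open(os.path.join(pkg_dir, mod + ".py"), "w") as fh:
            fh.write("NAME = %r\n" % mod)
        entries.append(os.path.join(root, entry))
    return entries


root = tempfile.mkdtemp()  # left on disk; remove it by hand if desired
sys.path[:0] = build_split_package(root)
importlib.invalidate_caches()

pkg = importlib.import_module(PKG)
portions = list(pkg.__path__)               # one entry per contributing directory
pkg_file = getattr(pkg, "__file__", None)   # absent or None, depending on version
one = importlib.import_module(PKG + ".one")
two = importlib.import_module(PKG + ".two")
print(portions, pkg_file, one.NAME, two.NAME)
```

The package object ends up with two __path__ entries and no usable __file__, so there is no single path entry or file that could be said to be where it came from.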

So, adding virtual packages isn't actually going to change anything, 
except perhaps by making these scenarios more common.


