[Import-SIG] PEP 420: Implicit Namespace Packages
Barry Warsaw
barry at python.org
Fri May 4 16:34:50 CEST 2012
On May 04, 2012, at 08:20 AM, Nick Coghlan wrote:
>I'd still prefer to just officially bless the existing "<whatever>"
>convention for non-filesystem imports over encouraging type checks on
>__loader__ or defining a new introspection interface for loaders.
The thing is, that convention is at best meaningless and at worst misleading.
I also don't think it gives you all the diagnosis support you really want.
The PEP 302 rule (reservation of no __file__ only for built-ins) is a
historical relic for which no good rationale exists. Forgetting that for a
moment, it simply makes no sense for a module that wasn't loaded from a file
system path to have an __file__ attribute.
It's also not true even today. At our PEP 420 sprint we noticed importlib
does something like this to create new modules:
    >>> type(sys)('foo')
That module isn't a built-in and doesn't have an __file__. It also
doesn't have an __loader__, but oh well.
(BTW, Brett, that's pretty clever. :)
It seemed to us that the only reasonable semantics for such modules is that
__file__ is None or __file__ is missing. Not setting __file__ is better
though because you get appropriate exceptions at the place where you make the
initial mistake (i.e. assuming every module has an __file__). If you set
__file__ to None, you may instead get cryptic messages in os.path.join() for
example.
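
To make that difference concrete, here is a small sketch. It is written against a current Python 3, so details may differ from the interpreter we had at the sprint. The helper names data_dir() and failure() are made up for illustration. The sketch compares a module with no __file__ at all to one whose __file__ is None.

```python
import os.path
import types


def data_dir(module):
    """Hypothetical helper that wrongly assumes every module has a __file__."""
    return os.path.join(os.path.dirname(module.__file__), 'data')


def failure(module):
    """Return the name of the exception data_dir() raises, or None."""
    try:
        data_dir(module)
    except (AttributeError, TypeError) as exc:
        return type(exc).__name__
    return None


# Equivalent to type(sys)('missing'): a bare module with no __file__.
missing = types.ModuleType('missing')

# Same kind of module, but with __file__ explicitly set to None.
none_file = types.ModuleType('none_file')
none_file.__file__ = None

print(failure(missing))    # fails at the attribute access itself
print(failure(none_file))  # fails later, inside os.path
```

In the first case the error names the missing attribute at the point of the mistaken assumption. In the second it surfaces inside the os.path call, one step removed from the actual bug.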
So, what about the "diagnostics" use case? Certainly a very important use
case is the repr of module objects. In the case of modules loaded from the
file system, I definitely want to know where the file lives, and the repr is a
great way to see that. For other modules, you do want to know something about
how that module was created, and having a repr that gives a good indication of
that is very useful. But you can easily do that without a contrived __file__
(more on that below).
What about other introspection use cases? Relying on __file__
programmatically might be a convenient shorthand, but knowing the loader (via
__loader__ if available) is more helpful, because that tells you more about
how that module actually came into existence.
The value of __file__ is really under the purview of the loader anyway.
Consider a hypothetical database loader (or even many different third party
database loaders). Of what use is an __file__ that says '<database>'? That
way leads to uncertainty, and namespace collisions, for example if both a
SQLite loader and a PostgreSQL loader wanted to use the '<database>' value.
In either case, maybe you'd prefer to know what the database url is, or maybe
the query that produced the module, or some combination thereof.
Overloading all that into a contrived __file__ seems wrong.
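
As a rough illustration of asking the loader instead, here is a sketch of a database-backed loader that keeps its details on the loader object and never sets __file__. Everything in it is an assumption made for the example: the DatabaseLoader class, its url and source attributes, and the URL string. It also uses the importlib API as found in current Python 3, which is newer than this message.

```python
import importlib.abc
import importlib.util


class DatabaseLoader(importlib.abc.Loader):
    """Hypothetical loader; a real one would fetch the source from a database."""

    def __init__(self, url, source):
        self.url = url        # where the module came from (illustrative)
        self.source = source  # stand-in for source text fetched from the db

    def create_module(self, spec):
        return None  # fall back to the default module creation

    def exec_module(self, module):
        exec(self.source, module.__dict__)


loader = DatabaseLoader('sqlite:///modules.db', 'ANSWER = 42')
spec = importlib.util.spec_from_loader('dbmod', loader)
module = importlib.util.module_from_spec(spec)
loader.exec_module(module)

# No contrived __file__; the loader itself answers "where did this come from?"
print(hasattr(module, '__file__'))
print(module.__loader__.url)
```

Two different database loaders can each expose whatever is meaningful to them this way, without competing for one '<database>' string.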
I would prefer if the requirement were relaxed, and we simply allowed the
loaders to set __file__ to whatever they think is appropriate, which would
include allowing them not to set __file__ at all.
It's actually easy to give modules a reasonable repr even without __file__. I
have a branch in the PEP 420 feature repo which implements the following rules
for module object reprs:
 * Use mod.__file__ if it exists
 * Otherwise, get the module's __loader__
 * If the module has no loader, then just return the module's name. E.g.

       >>> type(sys)('foo')
       <module 'foo'>

 * Define a new optional method on loaders, called module_repr(), that
   takes the module as an argument. Use whatever this returns as the
   module's repr.
 * As a last fallback, just use the repr of the loader as part of the module's
   repr.
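
The rules above can be sketched in pure Python roughly as follows. This is not the code in the branch, just an approximation. The function name sketch_module_repr(), the exact output strings for the __file__ and last-fallback cases, and the UrlLoader class are assumptions. It also treats an __file__ of None the same as a missing one, which the rules do not spell out.

```python
import types


def sketch_module_repr(module):
    """Approximate the proposed module repr rules (illustrative only)."""
    name = module.__name__

    # Rule 1: use __file__ if the module has one.
    filename = getattr(module, '__file__', None)
    if filename is not None:
        return '<module {!r} from {!r}>'.format(name, filename)

    # Rule 2: otherwise look at the loader; with no loader, use just the name.
    loader = getattr(module, '__loader__', None)
    if loader is None:
        return '<module {!r}>'.format(name)

    # Rule 3: let the loader decide, if it offers the optional module_repr().
    module_repr = getattr(loader, 'module_repr', None)
    if module_repr is not None:
        return module_repr(module)

    # Rule 4: last fallback, include the loader's own repr.
    return '<module {!r} ({!r})>'.format(name, loader)


class UrlLoader:
    """Hypothetical loader that supplies its own module_repr()."""

    def __init__(self, url):
        self.url = url

    def module_repr(self, module):
        return '<module {!r} from {}>'.format(module.__name__, self.url)


bare = types.ModuleType('foo')
print(sketch_module_repr(bare))  # <module 'foo'>

custom = types.ModuleType('bar')
custom.__loader__ = UrlLoader('db://example')
print(sketch_module_repr(custom))
```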
I'm not particularly married to this implementation, but it seems reasonably
backward compatible, and flexible enough to support useful alternatives. For
example, the BuiltinImporter could define its module_repr() like so:
    @classmethod
    def module_repr(cls, module):
        return '<module {} (built-in)>'.format(module.__name__)
Specifically, my proposed elaboration on PEP 420 is this:
* Explicitly leave the assignment of __file__ to the loader.
* Allow loaders to not set __file__.
* Add an optional API to loaders, module_repr(), as defined above.
Cheers,
-Barry