[Python-3000] Changing the import machinery
Fredrik Lundh
fredrik at pythonware.com
Sat Apr 22 10:23:50 CEST 2006
Guido van Rossum wrote:
> > I'm afraid I disagree. PEP 302 actually has some tremendous advantages
> > over a pure objects-on-sys.path approach:
> >
> > * Strings can be put in any configuration file, and used in .pth files
> >
> > * Strings can be put in environment variables (like PYTHONPATH).
> >
> > * Strings can be printed out, with all their contents showing and nothing
> > hidden
> >
> > In short, strings are better for humans.
>
> I think I like this. I wonder if there's a parallel with my preference
> for strings as paths instead of path objects...
And strings as exceptions, and Tcl instead of Python ? ;-)
Sorry, but I don't buy this argument at all. Of course you need a
way to map from external path descriptions (PYTHONPATH, registry
entries, etc) to sys.path contents, but ruling that the things you're
manipulating *inside* a Python program must be strings so you "can
print them out with all their contents showing and nothing hidden"
doesn't strike me as very Pythonic.
The target audience for this is Python programmers, after all, and
Python programmers know how to inspect Python objects -- as long as
they can find them, which isn't the case with today's extended import
design, which *hides* lots of stuff in *separate* semi-secret
registries. If you put all this back on the path, it'll be a lot
easier to find and manipulate.
I could quote the "If the implementation is hard to explain, it's a
bad idea." zen here, but I'll quote Sean McGrath's 20th Python zen
instead:
"Things should be as complex as necessary but not more complex."
and offer a "let's get back to the basics and add stuff, instead of
assuming that the status quo is complex and complicated because it has
to be" solution. Here's an outline, off the top of my head:
1. sys.path can contain strings or import handlers
2. Strings work as today; as paths that a builtin import handler
uses to look for packages or modules (flyweight-style).
3. Import handlers are duck-typed objects that implement a
simplified version of the PEP 302 protocol. Handlers map dotted
module paths to resources, where a resource can be a Python
module, a Python package (a module container), or some other
resource. Handlers are responsible for creating and populating
module objects; whatever they return is stored in sys.modules and
bound to the import target.
I'm 50/50 on making the import machinery fully type agnostic; that
is, allowing the import handler to return *any* kind of object
also for ordinary imports. Importing e.g. PIL images and
pre-parsed XML resources and Cheetah templates makes perfect sense
to me.
4. A support library provides the following mechanisms:
- An import handler for builtin/frozen objects (with
corresponding hooks on the C API side, so that apps can
register things to be treated as builtins).
- An import handler for the standard library (*all of
it*, minus site-packages!)
- An import handler for directory names (used for string path
items)
- A registry for path specifier syntaxes
- A parser for external path descriptions, which uses the
registry to map from path components to import handlers
- (possibly) Some helpers for user-provided import handlers
(to be factored out from such handlers, rather than be
designed up front)
5. Remove .pth support, and possibly also a lot of the site-packages
related stuff in site.py. I'm 50/50 on *requiring* code
to specify what non-core libraries they want to use, before they
can import them. I'm also 50/50 on making some additions to the
import statement syntax, to make some operations easier (import
... using handler), but I'll leave that for another post.
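
As a rough illustration of items 1 to 3, here is a minimal sketch of a path that mixes strings and duck-typed handlers. Every name in it (DictHandler, DirectoryHandler, find_and_load, import_sketch) is hypothetical and chosen only for this example; it is neither the PEP 302 protocol nor an existing API, and it works on its own path list and module cache rather than on sys.path and sys.modules.

```python
import os
import types

# All names below are hypothetical; nothing here is an existing API.


class DictHandler:
    """Duck-typed import handler: maps dotted names to arbitrary resources."""

    def __init__(self, resources):
        self.resources = resources

    def find_and_load(self, name):
        # Return the resource, or None if this handler does not know the name.
        return self.resources.get(name)


class DirectoryHandler:
    """Stand-in for the builtin handler used for plain string path items."""

    def __init__(self, directory):
        self.directory = directory

    def find_and_load(self, name):
        filename = os.path.join(self.directory, *name.split(".")) + ".py"
        if not os.path.isfile(filename):
            return None
        # The handler is responsible for creating and populating the module.
        module = types.ModuleType(name)
        module.__file__ = filename
        with open(filename) as f:
            exec(compile(f.read(), filename, "exec"), module.__dict__)
        return module


def import_sketch(name, path, modules):
    """Walk `path`; the first handler that returns something wins."""
    if name in modules:
        return modules[name]
    for item in path:
        # Strings work as today: they are handed to the directory handler.
        handler = DirectoryHandler(item) if isinstance(item, str) else item
        result = handler.find_and_load(name)
        if result is not None:
            modules[name] = result  # whatever is returned gets cached
            return result
    raise ImportError(name)


# A path mixing a plain string and a handler object.
path = ["/nonexistent/dir", DictHandler({"app.config": {"debug": True}})]
modules = {}
config = import_sketch("app.config", path, modules)
```

Note that the handler in this sketch returns a plain dict rather than a module, which corresponds to the "fully type agnostic" variant mentioned in item 3.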
This would cleanly address every deployment scenario and custom
importer that I've used with a (near) minimum of core support, and
frankly, I fail to see any potential use case that cannot be handled
by this mechanism (simply because a handler can do *anything* behind
the scenes, without requiring more support from the core import
machinery). And it'll let us remove tons of "hard to explain" code
from the core.
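
To make the "minimum of core support" point concrete, the mapping from external path descriptions to path items could live entirely in a support library. The sketch below is only an assumption about how such a registry and parser might look: the names (SPECIFIER_REGISTRY, register_specifier, parse_path_description, ZipHandler), the "zip:" prefix and the ";" separator are all invented for illustration.

```python
# Hypothetical names throughout; this is not an existing API.
SPECIFIER_REGISTRY = {}


def register_specifier(prefix, factory):
    """Associate a prefix such as 'zip:' with a handler factory."""
    SPECIFIER_REGISTRY[prefix] = factory


def parse_path_description(description, sep=";"):
    """Turn an external path description into a list of path items.

    The ';' separator is an assumption; a real syntax would have to
    avoid clashing with the platform's path separator.
    """
    items = []
    for component in description.split(sep):
        if not component:
            continue  # skip empty components
        for prefix, factory in SPECIFIER_REGISTRY.items():
            if component.startswith(prefix):
                # A registered syntax: build a handler object from the rest.
                items.append(factory(component[len(prefix):]))
                break
        else:
            items.append(component)  # plain strings stay strings
    return items


class ZipHandler:
    """Placeholder handler; a real one would load modules from the archive."""

    def __init__(self, archive):
        self.archive = archive

    def __repr__(self):
        return f"ZipHandler({self.archive!r})"


register_specifier("zip:", ZipHandler)
items = parse_path_description("lib;zip:bundle.zip;plugins")
```

Here the parser leaves ordinary directory names as strings and turns recognised specifiers into handler objects, so the resulting list can be inspected directly.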
</F>