[Distutils] Freeze and new import architecture

Greg Stein gstein@lyra.org
Sat, 19 Dec 1998 14:20:51 -0800

I think you may have misunderstood some of my points, so I'll try to
clarify below.

John Skaller wrote:
> At 18:15 17/12/98 -0800, Greg Stein wrote:
> >> ...
> >> sys.path not be restricted to path names.
> >> sys.path has "strings", and
> >> an associated map of "module finders".  Thus, a sys.path entry could
> >> have a directory name (like now) or .zip file, URL, etc.
> >
> >I would much prefer to see the module finder instances in the sys.path.
>         I agree. But that would be a compatibility problem?

Nope. I pointed out that a default installation would only have strings
in there. A person would not have a compatibility problem until they
chose to start using a special importer. In that case, I don't define it
as a problem because they chose to do so.

Of course, we would need to update various standard modules to improve
their handling of sys.path. But again, that is not a compatibility issue.

> >> 1. Finding the module in a specific namespace.
> >> 2. Importing a module of a specific type, once it has been found.
> >
> >I think the separation is bogus.
>         I don't: I'd _like_ to add two things, as a client
> of the system:
>         1) Add a new type. For example, allow .c files to be loaded,
>            by compiling them first.
>         2) Add a new kind of namespace. For example, an FTP server.
>            Or a hook to Trove. Or a private data structure I designed
>            myself.
> (1) has to do with what kinds of things are loaded,
> whereas (2) has to do with where they are.

Yes, it would be nice to do those things. However, I don't see that you
need the separate find/import paradigm to do it.

If your particular importer that you've placed into sys.path wants to
use two steps, then fine. If your importers share functionality using
those steps, then kudos to them.

Note that Python's __import__ hook is a single step(!). The two-part
find/load scheme is an artifact of ihooks, not of Python itself. And I
disagree that we need to formalize it within Python.

I simply maintain that we should have a very simple interface from
Python to any import system. In summary, that is placing importers into
sys.path and invoking a "do_import" method on them. Simple and clean.
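
To make the shape of that interface concrete, here is a minimal sketch of
the path processing. Only the "do_import" method name and the test for
plain strings come from the proposal; the do_import signature, the helper
names (import_from_path, old_import, DictImporter) and the "return None
when not found" convention are assumptions made for illustration.

```python
import types

def old_import(directory, name):
    """Hypothetical stand-in for Python's existing string-based lookup."""
    return None  # this sketch pretends the directory holds nothing

class DictImporter:
    """Hypothetical importer that serves module source from a dict."""
    def __init__(self, sources):
        self.sources = sources

    def do_import(self, name):
        # One step: locate the source and build the module in a single call.
        source = self.sources.get(name)
        if source is None:
            return None
        module = types.ModuleType(name)
        exec(compile(source, "<dict:%s>" % name, "exec"), module.__dict__)
        return module

def import_from_path(name, path):
    """Walk the path; strings use the old mechanism, objects use do_import."""
    for entry in path:
        if isinstance(entry, str):
            module = old_import(entry, name)
        else:
            module = entry.do_import(name)
        if module is not None:
            return module
    raise ImportError("no module named %s" % name)

# A path mixing an ordinary directory string with an importer instance.
path = ["/usr/lib/python", DictImporter({"hello": "greeting = 'hi'"})]
mod = import_from_path("hello", path)
```

A default installation would contain only strings, so the isinstance
branch keeps today's behaviour until somebody inserts an importer.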

> Note that the finder must be able to _fetch_ the data to a place
> that the loader can load it from. If a single place is enough,
> then the two features are orthogonal, and thus each is separately
> amenable to Object Oriented development.
>         So .. it is desirable to build an abstraction in which
> the functionality is separate.

You can build your importers this way, but I don't think we need to
place that mechanism in Python.
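
As a rough illustration of keeping the two steps private to an importer,
the sketch below hides find/load behind a single do_import call. The
class names, the find/load method names and the idea of sharing the load
step through a base class are all assumptions, not part of any existing
API.

```python
import types

class TwoStepImporter:
    """Hypothetical base class: find and load never leave the importer."""
    def do_import(self, name):
        found = self.find(name)
        if found is None:
            return None
        # The intermediate state lives only for the duration of this call.
        return self.load(name, found)

    def find(self, name):
        raise NotImplementedError

    def load(self, name, source):
        # Shared load step: turn source text into a module object.
        module = types.ModuleType(name)
        exec(compile(source, "<%s>" % name, "exec"), module.__dict__)
        return module

class TableImporter(TwoStepImporter):
    """Hypothetical namespace: an in-memory table of module sources."""
    def __init__(self, table):
        self.table = table

    def find(self, name):
        return self.table.get(name)

importer = TableImporter({"spam": "answer = 6 * 7"})
spam = importer.do_import("spam")
```

The interpreter only ever sees do_import; whether an importer splits its
work into two steps is its own business.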

Personally, I have issues with the style of "put it into a temporary
location, then import it". It seems subject to race conditions and/or
/tmp hacks.

> >> Regarding 2: the finder currently returns a structure that enables the
> >> correct importer to be called later on. Importers that we have are for
> >> builtin, frozen, .py/.pyc/.pyo modules, various dll-importers all hiding
> >> behind the same interface, PYC-resource importers (mac-only) and
> >> PYD-resource importers (mac-only).
> >
> >Punt this. Just import the dumb thing in one shot.
>         But you can't: a .dll file is imported by saying dlopen(),
> whereas a .py file is imported by compiling it to a .pyc file which
> is then imported. Etc.

Sorry, I meant "punt the whole structure thing". In my little corner of
the universe, I don't believe we have two steps, so we don't need to
formalize any mechanism for passing state between them. Basically, I see
the state thing as a compensation for introducing the two-step find/load
into Python's single-step import mechanism.

>         'One shot' implies a single function which is not extensible.

This is just argumentative. My proposal is just as extensible, and I
would maintain that it is simpler for the interpreter, and simpler for
many importers (rather than import-writers needing to deal with the
funky two-step).

> >Take the example of an HTTP-based import. Separating that into *two*
> ...
>         The way I see it, the _finder_ is responsible for
> downloading the file to the local file system, where the
> loader requires it to be. The loader turns these raw bits
> into a module.

As I mentioned before, I (personally) don't like this style. I'd rather
write an importer that loads it straight in from the wire. If it hits
the disk, then it would be *very* transitory. The two-step thing that
returns state structures implies an indeterminate time between those
steps, which I think is wrong.
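
For what it's worth, an importer in that style might look roughly like
the sketch below: the bytes that come off the wire are compiled and
executed in memory, and nothing is written to disk. WireImporter, the
injected fetch callable and fake_fetch are hypothetical names; a real
fetch could wrap something like urllib.request.urlopen, but a canned
response is used here so the example runs without a network.

```python
import types

class WireImporter:
    """Hypothetical importer that builds modules from fetched bytes."""
    def __init__(self, base_url, fetch):
        self.base_url = base_url
        self.fetch = fetch  # assumed callable: url -> bytes, or None

    def do_import(self, name):
        url = self.base_url + name + ".py"
        data = self.fetch(url)
        if data is None:
            return None
        # Compile and execute directly from memory; no temporary file.
        module = types.ModuleType(name)
        module.__file__ = url
        exec(compile(data, url, "exec"), module.__dict__)
        return module

def fake_fetch(url):
    """Canned response standing in for a real HTTP request."""
    pages = {"http://host.domain.name/pymodules/ham.py": b"size = 3\n"}
    return pages.get(url)

importer = WireImporter("http://host.domain.name/pymodules/", fake_fetch)
ham = importer.do_import("ham")
```

Fetching and loading happen inside one call, so there is no window
between "found" and "loaded" in which the state could go stale.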

> >Both of our proposals guarantee that stuff in sys.path are not
> >pathnames. If I insert a "foo.zip" or a
> >"http://host.domain.name/pymodules/", then you certainly don't have
> >pathnames.
> >
> >I believe the biggest issue with my proposal is the fact that the values
> >are no longer strings.
>         That's easy to fix: have a default 'finder' that is used
> if the sys.path entry is a string.

I meant "values are no longer [only] strings". Please review my proposal
again; you'll note in the path processing that I tested for a string and
call "old_import" to import things using Python's current string-based


Greg Stein, http://www.lyra.org/