From gmcm@hypernet.com  Thu Feb  3 13:41:29 2000
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 3 Feb 2000 08:41:29 -0500
Subject: [Import-sig] Kick-off
Message-ID: <1262537223-4632885@hypernet.com>

I guess the first order of business is to establish some 
objectives. I see two goals (my version of what happened at
dev-day):

Short-term: Provide a "new architecture import hooks" module 
for the standard library. This would deprecate ihooks and 
friends, and provide developers with a way of learning the new 
architecture.

Long-term: Reform the entire import architecture of Python. 
This affects Python start-up, the semantics of sys.path, and 
the C API to importing.

The model for this is, of course, Greg's imputil.py (Greg, your 
latest version is not yet on your website which still has a 
November version).

If this seems like a reasonable statement, I'll massage and 
expand it into the import SIG's homepage. I'll also try to 
produce legible downloads of the handouts I passed out at the 
dev day session.

- Gordon

From guido@python.org  Thu Feb  3 13:48:56 2000
From: guido@python.org (Guido van Rossum)
Date: Thu, 03 Feb 2000 08:48:56 -0500
Subject: [Import-sig] Kick-off
In-Reply-To: Your message of "Thu, 03 Feb 2000 08:41:29 EST."
 <1262537223-4632885@hypernet.com>
References: <1262537223-4632885@hypernet.com>
Message-ID: <200002031348.IAA26846@eric.cnri.reston.va.us>

> I guess the first order of business is to establish some 
> objectives. I see two goals (my version of what happened at
> dev-day):
> 
> Short-term: Provide a "new architecture import hooks" module 
> for the standard library. This would deprecate ihooks and 
> friends, and provide developers with a way of learning the new 
> architecture.
> 
> Long-term: Reform the entire import architecture of Python. 
> This affects Python start-up, the semantics of sys.path, and 
> the C API to importing.
> 
> The model for this is, of course, Greg's imputil.py (Greg, your 
> latest version is not yet on your website which still has a 
> November version).
> 
> If this seems like a reasonable statement, I'll massage and 
> expand it into the import SIG's homepage. I'll also try to 
> produce legible downloads of the handouts I passed out at the 
> dev day session.

One addition: at the devday meeting, Michael Reilly objected to the
notion of deprecating ihooks -- he has been using ihooks successfully
to meet his needs.  I think we should think long and hard about
throwing ihooks out -- it may be that the problem is simply that it's
not well documented (actually, undocumented is better :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gstein@lyra.org  Thu Feb  3 13:53:55 2000
From: gstein@lyra.org (Greg Stein)
Date: Thu, 3 Feb 2000 05:53:55 -0800 (PST)
Subject: [Import-sig] Kick-off
In-Reply-To: <1262537223-4632885@hypernet.com>
Message-ID: <Pine.LNX.4.10.10002030545190.28358-100000@nebula.lyra.org>

On Thu, 3 Feb 2000, Gordon McMillan wrote:
>...
> I guess the first order of business is to establish some 
> objectives. I see two goals (my version of what happened at
> dev-day):
> 
> Short-term: Provide a "new architecture import hooks" module 
> for the standard library. This would deprecate ihooks and 
> friends, and provide developers with a way of learning the new 
> architecture.
> 
> Long-term: Reform the entire import architecture of Python. 
> This affects Python start-up, the semantics of sys.path, and 
> the C API to importing.
> 
> The model for this is, of course, Greg's imputil.py (Greg, your 
> latest version is not yet on your website which still has a 
> November version).

My latest version is available from the CVS repository. The module is
easily accessed via:

  http://www.lyra.org/cgi-bin/viewcvs.cgi/gjspy/imputil.py

I haven't posted it to the web site because the web version is the
"public, stable" version. I'll add the above link so that people can get
the "development" version. Specifically, the new version has some
outstanding feedback from MAL and a couple others that I need to handle.
There is also some basic cleanup to do, plus dealing with any feedback from
Guido (which was partially deferred due to the types-sig).

[ of course, please feel free to link this stuff from the import-sig web
  pages ]

> If this seems like a reasonable statement, I'll massage and 
> expand it into the import SIG's homepage. I'll also try to 
> produce legible downloads of the handouts I passed out at the 
> dev day session.

I'm not quite sure what the "short-term" means/implies, so any summary
from the dev-day session would be great.

For the long-term stuff, I'm fine with your goal statement.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From gstein@lyra.org  Thu Feb  3 14:02:46 2000
From: gstein@lyra.org (Greg Stein)
Date: Thu, 3 Feb 2000 06:02:46 -0800 (PST)
Subject: [Import-sig] deprecate ihooks? (was: Kick-off)
In-Reply-To: <200002031348.IAA26846@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.10002030556280.28358-100000@nebula.lyra.org>

On Thu, 3 Feb 2000, Guido van Rossum wrote:
> Gordon wrote:
>...
> > Short-term: Provide a "new architecture import hooks" module 
> > for the standard library. This would deprecate ihooks and 
> > friends, and provide developers with a way of learning the new 
> > architecture.
>...
> One addition: at the devday meeting, Michael Reilly objected to the
> notion of deprecating ihooks -- he has been using ihooks successfully
> to meet his needs.  I think we should think long and hard about
> throwing ihooks out -- it may be that the problem is simply that it's
> not well documented (actually, undocumented is better :-).

There are a lot of modules that people have used in the past, which are
now deprecated (I count 17 modules in Lib/lib-old). Deprecating a module
is simply signalling an intent to move to a new system (and, hopefully, a
better one). As long as we're improving things, then I don't see a problem
with noting some older stuff should not be used. In other words, there
will sometimes be sacrifices in the name of overall improvement.

ihooks will continue to be available in the lib-old library in the
distribution (or redistributed with the apps that require it).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From gmcm@hypernet.com  Thu Feb  3 14:15:19 2000
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 3 Feb 2000 09:15:19 -0500
Subject: [Import-sig] Kick-off
In-Reply-To: <200002031348.IAA26846@eric.cnri.reston.va.us>
References: Your message of "Thu, 03 Feb 2000 08:41:29 EST."             <1262537223-4632885@hypernet.com>
Message-ID: <1262535193-4755007@hypernet.com>

[Gordon]
> > Short-term: Provide a "new architecture import hooks" module 
> > for the standard library. This would deprecate ihooks and 
> > friends, and provide developers with a way of learning the new 
> > architecture.

[Guido]
> One addition: at the devday meeting, Michael Reilly objected to the
> notion of deprecating ihooks -- he has been using ihooks successfully
> to meet his needs.  I think we should think long and hard about
> throwing ihooks out -- it may be that the problem is simply that it's
> not well documented (actually, undocumented is better :-).

I have badgered Michael into joining the SIG, so hopefully we 
can iron this out. I didn't want to pollute the objectives with a 
bunch of issues, but I'll do that now.

Controversies
-------------------

1) Speed (at least when imputil is used as _the_ import 
mechanism, without making use of its new features).

2) Lack of certain hook features (I'm unclear on this; I *think* 
Michael's complaints fit in here).

3) Lack of package __path__ mechanism.

4) The need to flesh out the ImportManager.

Anything else?

- Gordon

From guido@python.org  Thu Feb  3 14:15:56 2000
From: guido@python.org (Guido van Rossum)
Date: Thu, 03 Feb 2000 09:15:56 -0500
Subject: [Import-sig] deprecate ihooks? (was: Kick-off)
In-Reply-To: Your message of "Thu, 03 Feb 2000 06:02:46 PST."
 <Pine.LNX.4.10.10002030556280.28358-100000@nebula.lyra.org>
References: <Pine.LNX.4.10.10002030556280.28358-100000@nebula.lyra.org>
Message-ID: <200002031415.JAA28876@eric.cnri.reston.va.us>

> > One addition: at the devday meeting, Michael Reilly objected to the
> > notion of deprecating ihooks -- he has been using ihooks successfully
> > to meet his needs.  I think we should think long and hard about
> > throwing ihooks out -- it may be that the problem is simply that it's
> > not well documented (actually, undocumented is better :-).
> 
> There are a lot of modules that people have used in the past, which are
> now deprecated (I count 17 modules in Lib/lib-old). Deprecating a module
> is simply signalling an intent to move to a new system (and, hopefully, a
> better one). As long as we're improving things, then I don't see a problem
> with noting some older stuff should not be used. In other words, there
> will sometimes be sacrifices in the name of overall improvement.
> 
> ihooks will continue to be available in the lib-old library in the
> distribution (or redistributed with the apps that require it).

Maybe I didn't explain it well enough.  I think ihooks actually has a
better base architecture than your imputil (except it's missing the
import manager, which is a separate thing anyway).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gstein@lyra.org  Fri Feb  4 09:27:51 2000
From: gstein@lyra.org (Greg Stein)
Date: Fri, 4 Feb 2000 01:27:51 -0800 (PST)
Subject: [Import-sig] deprecate ihooks?
In-Reply-To: <200002031415.JAA28876@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.10002040052460.8462-100000@nebula.lyra.org>

On Thu, 3 Feb 2000, Guido van Rossum wrote:
>...
> Maybe I didn't explain it well enough.  I think ihooks actually has a
> better base architecture than your imputil (except it's missing the
> import manager, which is a separate thing anyway).

Ah! Now we get to it :-)

I'd love to hear any feedback that you have on imputil. I've received some
from a few others that is awaiting incorporation.

How would you like to do this to minimize your review time? Should I fold
in the other feedback first (to eliminate duplicate feedback)? Do you want
to do a quick, high-level feedback? Or go for broke with a fully detailed
review? :-)

It may also be constructive to compare it against the requirements for a
new import mechanism that you set up in:

Initial requirements list:
  http://www.python.org/pipermail/python-dev/1999-November/002867.html

My response, with imputil in mind:
  http://www.python.org/pipermail/python-dev/1999-November/002899.html

Your response and modified requirements:
  http://www.python.org/pipermail/python-dev/1999-December/002973.html

I'll get my Python page updated in a moment with a pointer to the CVS
version of imputil. For reference: http://www.lyra.org/greg/python/

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From gmcm@hypernet.com  Fri Feb  4 15:52:20 2000
From: gmcm@hypernet.com (Gordon McMillan)
Date: Fri, 4 Feb 2000 10:52:20 -0500
Subject: [Import-sig] Requirements
Message-ID: <1262442956-256070@hypernet.com>

Going through the links Greg posted to the dev list discussion, 
I find some things that I think need clarification:

[Guido]
> - the core API may be incompatible, as long as
> compatibility layers can be provided in pure Python

From the C API, we have PyImport_Import which does the 
same as (keyword) import. But PyImport_ImportModule and 
...Ex are lower level. I assume that modulo some arg 
munging, these also will do the same as (keyword) import. 
Decent assumption?

[Guido]
> - support for freeze functionality

Heh, heh. The current modulefinder works by (yet again) 
emulating the entire import process, but not letting the 
"imported" code leak out. In imputil, it's the Importer base 
class that does the "leaking", not code in a (well-behaved) 
derived class. So that opens the possibility of replacing the 
Importer object in the derived class's bases with a 
PhonyImporter that doesn't leak. So modulefinder could use 
the derived class and wouldn't have to emulate. However, 
modulefinder would have to report more information - the 
importer that found the module, as well as the 
file/URL/whatever it found it in.

[Guido]
> - sys.path and sys.modules should still exist; sys.path 
> might have a slightly different meaning

and 

> - Standard import from zip or jar files, in two ways:
>   (1) an entry on sys.path can be a zip/jar file instead of a
>       directory; its contents will be searched for modules or
>       packages
>   (2) a file in a directory that's on sys.path can be a zip/jar
>       file; its contents will be considered as a package

It looks like we're very close to this. Maybe already there 
(once a suffix importer has been written for a zip file).
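
For illustration only, here is requirement (1) as it behaves in
present-day Python 3, where zip archives on sys.path are handled by the
standard zipimport machinery. This is not the imputil design under
discussion; the helper name import_from_zip, the archive name
bundle.zip and the module name zipped_demo are assumptions made up for
the sketch.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name, source):
    """Write `source` into a fresh zip archive, put the archive itself on
    sys.path, and import the module from it."""
    tmpdir = tempfile.mkdtemp()
    archive = os.path.join(tmpdir, "bundle.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(module_name + ".py", source)

    sys.path.insert(0, archive)      # the path entry is a file, not a directory
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(archive)


mod = import_from_zip("zipped_demo", "ANSWER = 42\n")
print(mod.ANSWER, mod.__file__)
```

Requirement (2), a zip file inside a directory on sys.path acting as a
package, is not covered by this sketch.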

In the current version, items on sys.path can be directory 
names or importer instances. Obviously at startup, sys.path is 
nothing more than strings. Also (per other discussions), 
sys.path starts as a minimal boot path, and gets expanded 
from Python.

What is this mechanism?

Do we worry about:
 Network installations in heterogeneous environments?
 Ditto in homogeneous environments?
 Multiple incompatible installations?

(I vote "yes" on the last two, and "maybe" on the first; mainly 
because the latter two can be solved by figuring a boot path 
based on the location of the executable).

Should the syntax of .pth files be expanded to allow specifying 
importer instances? Or do we use sitecustomize?

Mmph. Enough for now...

- Gordon

From effbot@telia.com  Fri Feb  4 18:29:08 2000
From: effbot@telia.com (Fredrik Lundh)
Date: Fri, 4 Feb 2000 19:29:08 +0100
Subject: [Import-sig] deprecate ihooks?
Message-ID: <025101bf6f3d$ea552500$f4a7b5d4@hagrid>

Gordon:
> > Short-term: Provide a "new architecture import hooks" module
> > for the standard library. This would deprecate ihooks and
> > friends, and provide developers with a way of learning the new
> > architecture.

Guido:
> One addition: at the devday meeting, Michael Reilly objected to the
> notion of deprecating ihooks -- he has been using ihooks successfully
> to meet his needs.  I think we should think long and hard about
> throwing ihooks out -- it may be that the problem is simply that it's
> not well documented (actually, undocumented is better :-).

Greg:
> There are a lot of modules that people have used in the past, which are
> now deprecated (I count 17 modules in Lib/lib-old). Deprecating a module
> is simply signalling an intent to move to a new system (and, hopefully, a
> better one). As long as we're improving things, then I don't see a problem
> with noting some older stuff should not be used. In other words, there
> will sometimes be sacrifices in the name of overall improvement.

we've also used ihooks in a number of places, with great
success.  on the other hand, changing to imputil was hardly
any work at all...

so I guess The Question is whether the find/load separation
is really necessary.  I cannot think of a reason, but that's
probably just me...

cheers /Fredrik (at home)

  "Sometimes, when you are a Bear of Very Little Brain,
  and you Think of Things, you find sometimes that a
  Thing which seemed very Thingish inside you is quite
  different when it gets out into the open and has other
  people looking at it." -- Pooh

From effbot@telia.com  Fri Feb  4 18:47:28 2000
From: effbot@telia.com (Fredrik Lundh)
Date: Fri, 4 Feb 2000 19:47:28 +0100
Subject: [Import-sig] Kick-off
Message-ID: <025601bf6f40$4aff5220$f4a7b5d4@hagrid>

Gordon wrote:
> Short-term: Provide a "new architecture import hooks" module
> for the standard library. This would deprecate ihooks and
> friends, and provide developers with a way of learning the new
> architecture.

1.6.1 <= version <= 1.7, right?

> Long-term: Reform the entire import architecture of Python.
> This affects Python start-up, the semantics of sys.path, and
> the C API to importing.

1.7 <= version < 3000, right?

> The model for this is, of course, Greg's imputil.py (Greg, your
> latest version is not yet on your website which still has a
> November version).

I'd like to add an ultra-short-term issue: possible changes
to 1.6.0 that make it easier to experiment with alternate
import strategies, mostly for installation tools like gordon's
install and pythonworks' deployment subsystem.

(as discussed on last week's consortium meeting)

most importantly, I'd like to come up with a way to execute
small snippets of script code *before* Python attempts to
import stuff like exceptions.py.

</F>

From guido@python.org  Fri Feb  4 18:48:07 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 04 Feb 2000 13:48:07 -0500
Subject: [Import-sig] Kick-off
In-Reply-To: Your message of "Fri, 04 Feb 2000 19:47:28 +0100."
 <025601bf6f40$4aff5220$f4a7b5d4@hagrid>
References: <025601bf6f40$4aff5220$f4a7b5d4@hagrid>
Message-ID: <200002041848.NAA14651@eric.cnri.reston.va.us>

> I'd like to add an ultra-short-term issue: possible changes
> to 1.6.0 that make it easier to experiment with alternate
> import strategies, mostly for installation tools like gordon's
> install and pythonworks' deployment subsystem.
> 
> (as discussed on last week's consortium meeting)
> 
> most importantly, I'd like to come up with a way to execute
> small snippets of script code *before* Python attempts to
> import stuff like exceptions.py.

In my notes of the consortium meeting, I have "punch a tiny hole in the
interpreter through which /F can drive his truck."

Unfortunately I don't recall where the hole should be punched.  Since
you have an application for this, can you remind me?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From effbot@telia.com  Fri Feb  4 19:07:27 2000
From: effbot@telia.com (Fredrik Lundh)
Date: Fri, 4 Feb 2000 20:07:27 +0100
Subject: [Import-sig] Kick-off
References: <025601bf6f40$4aff5220$f4a7b5d4@hagrid>  <200002041848.NAA14651@eric.cnri.reston.va.us>
Message-ID: <026801bf6f43$16b7cee0$f4a7b5d4@hagrid>

Guido van Rossum <guido@python.org> wrote:
> > I'd like to add an ultra-short-term issue: possible changes
> > to 1.6.0 that make it easier to experiment with alternate
> > import strategies
>
> In my notes of the consortium meeting, I have "punch a tiny hole in the
> interpreter through which /F can drive his truck."
>
> Unfortunately I don't recall where the hole should be punched.  Since
> you have an application for this, can you remind me?

I have an idea or two, but I gotta ship that
SRE kit first... (soon)

(just thought that the import-siggers might
come up with some additional ideas while I'm
busy doing that)

</F>

From guido@python.org  Fri Feb  4 19:26:34 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 04 Feb 2000 14:26:34 -0500
Subject: [Import-sig] Requirements
In-Reply-To: Your message of "Fri, 04 Feb 2000 10:52:20 EST."
 <1262442956-256070@hypernet.com>
References: <1262442956-256070@hypernet.com>
Message-ID: <200002041926.OAA14816@eric.cnri.reston.va.us>

> From the C API, we have PyImport_Import which does the 
> same as (keyword) import. But PyImport_ImportModule and 
> ...Ex are lower level. I assume that modulo some arg 
> munging, these also will do the same as (keyword) import. 
> Decent assumption?

I suppose you mean they should do the same in the new design, because
they would only be there for b/w compatibility?  Right now they are
designed to be different -- in particular PyImport_Import() calls
__import__() calls PyImport_ImportModule[Ex]().

Do we want to keep the override-__import__ hook?

Who else uses PyImport_ImportModule[Ex]()?

> [Guido]
> > - support for freeze functionality
> 
> Heh, heh. The current modulefinder works by (yet again) 
> emulating the entire import process, but not letting the 
> "imported" code leak out.

Actually, it uses a wrapper around imp.find_module() that checks for a
few special cases and otherwise hands the query off to
imp.find_module()!  (I don't understand why there's a special case
for looking in the Windows registry; find_module() should already do
that, too.)

> In imputil, it's the Importer base 
> class that does the "leaking", not code in a (well-behaved) 
> derived class. So that opens the possibility of replacing the 
> Importer object in the derived class's bases with a 
> PhonyImporter that doesn't leak. So modulefinder could use 
> the derived class and wouldn't have to emulate. However, 
> modulefinder would have to report more information - the 
> importer that found the module, as well as the 
> file/URL/whatever it found it in.

I'm afraid you've lost me here.  What does "leaking" refer to?

> [Guido]
> > - sys.path and sys.modules should still exist; sys.path 
> > might have a slightly different meaning
> 
> and 
> 
> > - Standard import from zip or jar files, in two ways:
> >   (1) an entry on sys.path can be a zip/jar file instead of a
> >       directory; its contents will be searched for modules or
> >       packages
> >   (2) a file in a directory that's on sys.path can be a zip/jar
> >       file; its contents will be considered as a package
> 
> It looks like we're very close to this. Maybe already there 
> (once a suffix importer has been written for a zip file).
> 
> In the current version, items on sys.path can be directory 
> names or importer instances. Obviously at startup, sys.path is 
> nothing more than strings. Also (per other discussions), 
> sys.path starts as a minimal boot path, and gets expanded 
> from Python.
> 
> What is this mechanism?

Look at Modules/getpath.c and PC/getpathp.c.  Or are you asking about
how the mechanism should be redesigned?

> Do we worry about:
>  Network installations in heterogeneous environments?

Yes, by supporting sys.exec_prefix.  This has consequences for
getpath.c, see there.  I think this support has lost its significance
with the advent of fast disks, but I'm not going to fight millions of
sysadmins stuck in the past, so we have to continue to support it.
It's no big deal anyway.

>  Ditto in homogeneous environments?

How can you even tell?  Maybe I don't understand what you are talking
about (and then my previous response also doesn't make sense?)

>  Multiple incompatible installations?

Emphatically yes.  A Python binary should be able to find out where
the rest of its installation is.  This is a platform specific problem
(hence getpath.c and PC/getpathp.c).

Note that on Windows there's the added problem of Mark Hammond's COM
support.  COM services implemented by Python can be started on the fly
without starting python.exe, e.g. by embedding such a COM object in a
Word document.  The consequence of this (I've been told) is that the
python15.dll file must live in the system directory (\WinNT, \Windows,
etc.).  This means that its path is useless to find the rest of the
installation, and that's why we're using the registry.

I don't know if all of this is still true; I would think that if a COM
support DLL lives somewhere else, the registry could point to it?  But
who am I to argue with Microsoft.

Anyway I wouldn't mind if this was somehow solved differently; for
example there could be another copy of python15.dll in the Python
install dir which was used normally, and in that case the registry
wouldn't be needed.

> (I vote "yes" on the last two, and "maybe" on the first; mainly 
> because the latter two can be solved by figuring a boot path 
> based on the location of the executable).
> 
> Should the syntax of .pth files be expanded to allow specifying 
> importer instances? Or do we use sitecustomize?

Do you really think that will be used?  There would seem to be a
chicken/egg problem.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gmcm@hypernet.com  Fri Feb  4 21:29:45 2000
From: gmcm@hypernet.com (Gordon McMillan)
Date: Fri, 4 Feb 2000 16:29:45 -0500
Subject: [Import-sig] Requirements
In-Reply-To: <200002041926.OAA14816@eric.cnri.reston.va.us>
References: Your message of "Fri, 04 Feb 2000 10:52:20 EST."             <1262442956-256070@hypernet.com>
Message-ID: <1262422710-1474030@hypernet.com>

[me]
> > From the C API, we have PyImport_Import which does the 
> > same as (keyword) import. But PyImport_ImportModule and 
> > ...Ex are lower level. I assume that modulo some arg 
> > munging, these also will do the same as (keyword) import. 
> > Decent assumption?
[Guido]
> I suppose you mean they should do the same in the new design, because
> they would only be there for b/w compatibility?  

Yes.

> Right now they are
> designed to be different -- in particular PyImport_Import() calls
> __import__() calls PyImport_ImportModule[Ex]().
> 
> Do we want to keep the override-__import__ hook?

We need a builtin function (so you can use a runtime arg; and 
not be forced to exec). But there's not much sense in making 
it hookable, when the whole import system is a set of hooks.
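
A minimal sketch of both points, written against present-day Python 3
rather than the 1.5/1.6 interpreter discussed here: the builtin takes a
module name chosen at run time (so no exec is needed), and the import
statement itself is routed through whatever __import__ is installed in
the builtins. The names logging_import and seen are made up for the
example.

```python
import builtins

seen = []                                   # names requested through the hook
original_import = builtins.__import__


def logging_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Record the requested name, then defer to the real import machinery."""
    seen.append(name)
    return original_import(name, globals, locals, fromlist, level)


builtins.__import__ = logging_import        # override the __import__ hook
try:
    import json                             # the statement goes through the hook
    wanted = "string"                       # a name only known at run time
    mod = __import__(wanted)                # builtin call, no exec required
finally:
    builtins.__import__ = original_import   # always restore the original

print(mod.__name__, "json" in seen, "string" in seen)
```

The try/finally matters in practice: a replaced __import__ affects every
import in the process until it is restored.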

> Who else uses PyImport_ImportModule[Ex]()?

In my experience, almost all extension writers use 
PyImport_ImportModule, not PyImport_Import. I think this is 
speed-freakism, not for functionality (which could only be to 
avoid hooks).

> > [Guido]
> > > - support for freeze functionality
> > 
> > Heh, heh. The current modulefinder works by (yet again) 
> > emulating the entire import process, but not letting the 
> > "imported" code leak out.

> > In imputil, it's the Importer base 
> > class that does the "leaking", not code in a (well-behaved) 
> > derived class. So that opens the possibility of replacing the 
> > Importer object in the derived class's bases with a 
> > PhonyImporter that doesn't leak. So modulefinder could use 
> > the derived class and wouldn't have to emulate. However, 
> > modulefinder would have to report more information - the 
> > importer that found the module, as well as the 
> > file/URL/whatever it found it in.
> 
> I'm afraid you've lost me here.  What does "leaking" refer to?

Letting the module into sys.modules or any real namespace. 
Just pointing out that a new modulefinder should be able to 
follow the hooks without excessive effort.
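
A rough sketch of the idea, not imputil's actual API: every class and
method name below (Importer, PhonyImporter, DictImporter, get_code,
finish) is hypothetical. The base class does the "leaking" into
sys.modules, the derived class only knows how to find code, and a phony
base records what was found instead of installing it. As a variation on
replacing the base outright, the sketch places PhonyImporter ahead of
the derived class in the bases so its finish() wins.

```python
import sys
import types


class Importer:
    """Hypothetical base class: subclasses locate code, the base installs it."""

    def get_code(self, name):
        raise NotImplementedError

    def import_module(self, name):
        found = self.get_code(name)
        if found is None:
            return None
        origin, code = found
        return self.finish(name, origin, code)

    def finish(self, name, origin, code):
        module = types.ModuleType(name)
        module.__file__ = origin
        sys.modules[name] = module          # this is the "leak"
        exec(code, module.__dict__)
        return module


class PhonyImporter(Importer):
    """Same lookup protocol, but only records which importer found what, where."""

    def __init__(self):
        self.found = {}

    def finish(self, name, origin, code):
        self.found[name] = (type(self).__name__, origin)
        return None                         # nothing is executed or installed


class DictImporter(Importer):
    """A well-behaved derived class: it finds source and nothing else."""

    sources = {"leak_demo": "VALUE = 1\n"}

    def get_code(self, name):
        src = self.sources.get(name)
        if src is None:
            return None
        origin = "dict://" + name
        return origin, compile(src, origin, "exec")


class PhonyDictImporter(PhonyImporter, DictImporter):
    """DictImporter's lookup combined with PhonyImporter's non-leaking finish."""


phony = PhonyDictImporter()
phony.import_module("leak_demo")
leaked_by_phony = "leak_demo" in sys.modules

real = DictImporter().import_module("leak_demo")
leaked_by_real = "leak_demo" in sys.modules
```

A modulefinder built this way would read phony.found to learn both the
importer and the file/URL/whatever for each module.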

[zip files on or in sys.path...]
> > In the current version, items on sys.path can be directory 
> > names or importer instances. Obviously at startup, sys.path is 
> > nothing more than strings. Also (per other discussions), 
> > sys.path starts as a minimal boot path, and gets expanded 
> > from Python.
> > 
> > What is this mechanism?
> 
> Look at Modules/getpath.c and PC/getpathp.c.  Or are you asking about
> how the mechanism should be redesigned?

Yes. "Current version" meant "of imputil". Sorry.

> > Do we worry about:
> >  Network installations in heterogeneous environments?
> 
> Yes, by supporting sys.exec_prefix.  This has consequences for
> getpath.c, see there.  I think this support has lost its significance
> with the advent of fast disks, but I'm not going to fight millions of
> sysadmins stuck in the past, so we have to continue to support it.
> It's no big deal anyway.
> 
> >  Ditto in homogeneous environments?
> 
> How can you even tell?  Maybe I don't understand what you are talking
> about (and then my previous response also doesn't make sense?)

Terms: by "heterogeneous" I meant, eg, a Solaris server with 
Solaris, Windows and Linux clients. By "homogeneous" I 
meant clients (and probably server) are all binary compatible.

I *think* "homogeneous" is more-or-less solved when "multiple 
incompatible installations" is solved.

The added complexity of "heterogeneous" being the plat_xxx 
libraries (and what package authors have to do), which 
appears to be getting deprecated(?).

> >  Multiple incompatible installations?
> 
> Emphatically yes.  A Python binary should be able to find out where
> the rest of its installation is.  This is a platform specific problem
> (hence getpath.c and PC/getpathp.c).

Um, yes and no (to it being platform specific). Yes in that you 
can't follow symlinks on Windows, or easily get the absolute 
path name of the executable in some *nixen. No, in that I feel 
strongly (modulo some of the COM stuff below) that the 
pseudo-code should be the same - just think of distutils and 
package authors!

> Note that on Windows there's the added problem of Mark Hammond's COM
> support.  COM services implemented by Python can be started on the fly
> without starting python.exe, e.g. by embedding such a COM object in a
> Word document.  The consequence of this (I've been told) is that the
> python15.dll file must live in the system directory (\WinNT, \Windows,
> etc.).  This means that its path is useless to find the rest of the
> installation, and that's why we're using the registry.
> 
> I don't know if all of this is still true; I would think that if a COM
> support DLL lives somewhere else, the registry could point to it?  But
> who am I to argue with Microsoft.

I *think* this problem has been solved, and the registry can 
point wherever it wants, but I'm not the expert. If this stuff is 
still needed, perhaps it could be fallback: "Oops, I can't figure 
out PYTHONPATH, so I'll look in the registry". I'll forward this 
question to Mark.

> Anyway I wouldn't mind if this was somehow solved differently; 

Amen.

> > Should the syntax of .pth files be expanded to allow specifying 
> > importer instances? Or do we use sitecustomize?
> 
> Do you really think that will be used?  There would seem to be a
> chicken/egg problem.

Categorizing it doesn't solve it ;-). OK, we don't need a 
concrete solution now, but is this a reasonable approach?

1) Py_Initialize calls getpath.c
2) getpath.c returns a directory (or very short list thereof).
3) Fredrik drives his truck through (me too)
4) A frozen-in exceptions.py gets imported
5) sys.path gets expanded by looking for (something | some 
things) in the existing sys.path and ( executing | reading ) 
them.

Maybe 3 & 4 are swapped?

Maybe some of this is written in Python and frozen in?

The, err, obsession here being to make it (1) highly 
customizable AND (2) generally idiot-resistant. A few simple 
controls under a bright red hatch cover that says "Warning - 
touching this stuff will void your warranty".
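
Steps 2 and 5 could be sketched along these lines. This is only one
possible reading, under stated assumptions: the function names boot_path
and expand_path are made up, the boot path is taken to be the directory
of the (symlink-resolved) executable, and the "something" that step 5
looks for is taken to be .pth files containing plain directory names.
The real site.py handling of .pth files does more than this.

```python
import os
import tempfile


def boot_path(executable):
    """Step 2: derive a minimal boot path from the executable's real location."""
    return [os.path.dirname(os.path.realpath(executable))]


def expand_path(path):
    """Step 5: extend `path` with directories named in *.pth files found on it."""
    expanded = list(path)
    for entry in path:
        if not os.path.isdir(entry):
            continue
        for fname in sorted(os.listdir(entry)):
            if not fname.endswith(".pth"):
                continue
            with open(os.path.join(entry, fname)) as fh:
                for line in fh:
                    line = line.strip()
                    if not line or line.startswith("#"):
                        continue            # skip blanks and comments
                    candidate = os.path.join(entry, line)
                    if os.path.isdir(candidate) and candidate not in expanded:
                        expanded.append(candidate)
    return expanded


# Build a throwaway "installation": a fake executable, a lib directory,
# and a .pth file naming one existing and one missing directory.
root = os.path.realpath(tempfile.mkdtemp())
open(os.path.join(root, "python"), "w").close()
os.mkdir(os.path.join(root, "lib"))
with open(os.path.join(root, "extra.pth"), "w") as fh:
    fh.write("# comment\nlib\nmissing\n")

path = expand_path(boot_path(os.path.join(root, "python")))
print(path)
```

Because everything is derived from the executable's location, two
installations side by side would each find only their own files.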

- Gordon

From guido@python.org  Fri Feb  4 22:07:04 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 04 Feb 2000 17:07:04 -0500
Subject: [Import-sig] Requirements
In-Reply-To: Your message of "Fri, 04 Feb 2000 16:29:45 EST."
 <1262422710-1474030@hypernet.com>
References: Your message of "Fri, 04 Feb 2000 10:52:20 EST." <1262442956-256070@hypernet.com>
 <1262422710-1474030@hypernet.com>
Message-ID: <200002042207.RAA16458@eric.cnri.reston.va.us>

[Guido]
> > Right now they are
> > designed to be different -- in particular PyImport_Import() calls
> > __import__() calls PyImport_ImportModule[Ex]().
> > 
> > Do we want to keep the override-__import__ hook?
[Gordon]
> We need a builtin function (so you can use a runtime arg; and 
> not be forced to exec). But there's not much sense in making 
> it hookable, when the whole import system is a set of hooks.

Agreed, except for b/w compat.

> > Who else uses PyImport_ImportModule[Ex]()?
> 
> In my experience, almost all extension writers use 
> PyImport_ImportModule, not PyImport_Import. I think this is 
> speed-freakism, not for functionality (which could only be to 
> avoid hooks).

I think that's more likely because for a long time,
PyImport_ImportModule() was the only interface -- PyImport_Import()
was added much later (by Jim F who needed access to the hooked code
from inside cPickle).

> > > Do we worry about:
> > >  Network installations in heterogeneous environments?
> > 
> > Yes, by supporting sys.exec_prefix.  This has consequences for
> > getpath.c, see there.  I think this support has lost its significance
> > with the advent of fast disks, but I'm not going to fight millions of
> > sysadmins stuck in the past, so we have to continue to support it.
> > It's no big deal anyway.
> > 
> > >  Ditto in homogeneous environments?
> > 
> > How can you even tell?  Maybe I don't understand what you are talking
> > about (and then my previous response also doesn't make sense?)
> 
> Terms: by "heterogeneous" I meant, eg, a Solaris server with 
> Solaris, Windows and Linux clients. By "homogeneous" I 
> meant clients (and probably server) are all binary compatible.

OK, I knew that.

> I *think* "homogeneous" is more-or-less solved when "multiple 
> incompatible installations" is solved.

I don't see any problems with homogeneous environments -- what
possible problem could there be (that doesn't exist when there's no
sharing and that isn't caused by multiple versions)?

> The added complexity of "heterogeneous" being the plat_xxx 
> libraries (and what package authors have to do), which 
> appears to be getting deprecated(?).
> 
> > >  Multiple incompatible installations?
> > 
> > Emphatically yes.  A Python binary should be able to find out where
> > the rest of its installation is.  This is a platform specific problem
> > (hence getpath.c and PC/getpathp.c).
> 
> Um, yes and no (to it being platform specific). Yes in that you 
> can't follow symlinks on Windows, or easily get the absolute 
> path name of the executable in some *nixen. No, in that I feel 
> strongly (modulo some of the COM stuff below) that the 
> pseudo-code should be the same - just think of distutils and 
> package authors!

The pseudo code is also different because the structure of
site-packages etc. is different on Windows (it doesn't exist).

But I agree that it's a shame that there are two copies of code with
very similar functionality, and I'd gladly get rid of one.  (There's
even a third copy, in the os2 subdirectory!)

> > > Should the syntax of .pth files be expanded to allow specifying 
> > > importer instances? Or do we use sitecustomize?
> > 
> > Do you really think that will be used?  There would seem to be a
> > chicken/egg problem.
> 
> Categorizing it doesn't solve it ;-). OK, we don't need a 
> concrete solution now, but is this a reasonable approach?
> 
> 1) Py_Initialize calls getpath.c
> 2) getpath.c returns a directory (or very short list thereof).
> 3) Fredrik drives his truck through (me too)
> 4) A frozen-in exceptions.py gets imported
> 5) sys.path gets expanded by looking for (something | some 
> things) in the existing sys.path and ( executing | reading ) 
> them.
> 
> Maybe 3 & 4 are swapped?

Yes, better.
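
Step 5 of the list above, in its "reading" variant, might look roughly
like the sketch below. This is only an illustration in present-day
Python: `expand_path` is a hypothetical name, and the file format assumed
is the plain .pth convention (one directory per line, `#` for comments).
It deliberately does not execute anything, so it leaves the
"executing" variant and the importer-instance question open.

```python
import os


def expand_path(entries):
    """Return entries plus any directories named in *.pth files found in them."""
    result = list(entries)
    for entry in entries:
        if not os.path.isdir(entry):
            continue
        for fname in sorted(os.listdir(entry)):
            if not fname.endswith(".pth"):
                continue
            with open(os.path.join(entry, fname)) as f:
                for line in f:
                    line = line.strip()
                    # skip blank lines and comments
                    if not line or line.startswith("#"):
                        continue
                    # relative names are taken relative to the directory
                    # that holds the .pth file
                    candidate = os.path.join(entry, line)
                    if os.path.isdir(candidate) and candidate not in result:
                        result.append(candidate)
    return result
```

Calling `expand_path(initial_dirs)` once after getpath has produced its
short list would give the expanded sys.path; entries that do not exist
are silently dropped rather than reported.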

> Maybe some of this is written in Python and frozen in?

Possibly.

> The, err, obsession here being to make it (1) highly 
> customizable AND (2) generally idiot-resistant. A few simple 
> controls under a bright red hatch cover that says "Warning - 
> touching this stuff will void your warranty".

Good metaphor.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gmcm@hypernet.com  Fri Feb  4 23:32:17 2000
From: gmcm@hypernet.com (Gordon McMillan)
Date: Fri, 4 Feb 2000 18:32:17 -0500
Subject: [Import-sig] RE: pythonpath and COM support
In-Reply-To: <ECEPKNMJLHAPFFJHDOJBGECLCEAA.mhammond@skippinet.com.au>
References: <1262422703-1474374@hypernet.com>
Message-ID: <1262415358-1916426@hypernet.com>

Mark,

[also posting this to the import-SIG]

> I believe that we could drop putting python1x.dll in the system directory,
> but this would have the following implications:
> 
> * All Python executables would need to exist in the same directory as the
> .dll.  Python.exe and Pythonw.exe already would, but "3rd party"
> executables, such as Pythonwin.exe or any other .exes supplied by extension
> authors would also need to live in that directory.  Ditto for other DLL's,
> such as pythoncom15.dll, pywintypes15.dll, etc.

Couldn't pythoncom15.dll etc. live in /DLLs? Or are they 
loaded by other C extensions (using LoadLibrary instead of 
PyImport_x)?

> * The path searching code would need to use the location of Python1x.dll,
> rather than the .exe, to locate the PYTHONHOME.  This would not be a huge
> change, but necessary none-the-less.

Um, why? Especially if they're the same directory ;-)?

Oh, because of COM (and exposing python15.dll as a COM 
server)?

> I think this would definitely be a win, and would be happy to help make this
> happen.

At the very least, would get around the must-have-admin-rights 
on NT problem.

Thanks,

- Gordon

From mhammond@skippinet.com.au  Fri Feb  4 23:44:38 2000
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Sat, 5 Feb 2000 10:44:38 +1100
Subject: [Import-sig] RE: pythonpath and COM support
In-Reply-To: <1262415358-1916426@hypernet.com>
Message-ID: <ECEPKNMJLHAPFFJHDOJBOEDACEAA.mhammond@skippinet.com.au>

> [also posting this to the import-SIG]

Sheesh - too many new sigs.  I'm not on that one, so please CC me where
appropriate.

> Couldn't pythoncom15.dll etc. live in /DLLs? Or are they
> loaded by other C extensions (using LoadLibrary instead of
> PyImport_x)?

pythoncom15.dll and pywintypes15.dll (in fact, anything I release with the
extension ".dll") are used as "standard DLLs".  A .exe may well have an
implicit reference to one of these DLLs, so unless they are in the same
directory as the executable, or on the %PATH%, they will not be found.

> > * The path searching code would need to use the location of
> Python1x.dll,
> > rather than the .exe, to locate the PYTHONHOME.  This would not
> be a huge
> > change, but necessary none-the-less.
>
> Um, why? Especially if they're the same directory ;-)?
>
> Oh, because of COM (and exposing python15.dll as a COM
> server)?

Exactly - when a Python COM object is being used, the .exe may well be
something created by VB, and nowhere near the Python directory.  Thus, the
full path to the .exe will be useless, but the full path to Python1x.dll
will be OK.
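
A rough sketch of that idea, in present-day Python rather than the C of
getpathp.c: start from the directory of whichever binary is known to
live in the Python installation (the DLL, not the host .exe) and walk
upwards until a landmark file turns up. `find_home` and the default
landmark `Lib/os.py` are assumptions made for the example, not the
actual search rule.

```python
import os


def find_home(binary_path, landmark=os.path.join("Lib", "os.py")):
    """Walk up from the directory holding binary_path until landmark is found.

    Returns the directory containing the landmark, or None if the
    filesystem root is reached without finding it.
    """
    directory = os.path.dirname(os.path.abspath(binary_path))
    while True:
        if os.path.isfile(os.path.join(directory, landmark)):
            return directory
        parent = os.path.dirname(directory)
        if parent == directory:  # reached the filesystem root
            return None
        directory = parent
```

Given the path of Python1x.dll this finds the installation even when
the process was started from a VB-built .exe somewhere else; given the
path of that .exe it would normally come back empty-handed.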

> At the very least, would get around the must-have-admin-rights
> on NT problem.

The admin problem is due to writing the registry, rather than copying
something to the system32 directory.  At the moment, the installation
package writes 2 classes of information:

* Core Python stuff - pythonpath etc.
* Information for other installers and IDEs

The 2nd category includes stuff like the "Start Menu" group the user
selected, and the path where Python was installed.  Later installers (such
as win32all) can read this information and avoid asking the user the same
questions - thereby making the installation process more robust.  Another
example is "help files" - eg, a list of all documentation installed by
Python or extensions, thereby allowing IDEs to be smart.

If we can drop the registry altogether for the first category, then it
would seem a shame to keep using the registry just for the 2nd category.
OTOH, I think the information made available by the 2nd category is
valuable.

Back-to-ini-files-ly,

Mark.

From mal@lemburg.com  Fri Feb  4 20:10:54 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 04 Feb 2000 21:10:54 +0100
Subject: [Import-sig] deprecate ihooks?
References: <025101bf6f3d$ea552500$f4a7b5d4@hagrid>
Message-ID: <389B324E.E779DB0F@lemburg.com>

This is a multi-part message in MIME format.
--------------6726D2007785ACB228CD285A
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Just to throw some old 2 cents, I've attached some code I wrote
way back in 1997 on top of ihooks.py. It turns modules into
real classes with all the goodies of __getattr__ et al.
at no extra cost.

Perhaps this mechanism offers some new insights: by delegating
work to the objects in question (the modules) rather than
hooking together some meta objects... note that you can do
subclassing to add functionality to modules using this approach,
e.g. packages could be subclasses of a general package class, etc.

Anyway, just a thought you might want to consider... I'm too busy
right now to jump into this discussion again ;-)
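
For readers who want the gist without reading the attachments: the core
trick, an object that stands in for a module and uses __getattr__ to
delegate (and here, to defer loading), can be sketched in a few lines of
present-day Python. This is not the attached code; the class below is a
simplified stand-in that uses importlib instead of ihooks, and its
private attribute names are made up for the example.

```python
import importlib


class LazyModule:
    """Stand-in for a module; the real import happens on first attribute use."""

    def __init__(self, name):
        # write through __dict__ so these names are found by normal
        # lookup and never reach __getattr__
        self.__dict__["_name"] = name
        self.__dict__["_module"] = None

    def _load(self):
        if self._module is None:
            self.__dict__["_module"] = importlib.import_module(self._name)
        return self._module

    def __getattr__(self, attr):
        # only called when normal attribute lookup fails
        return getattr(self._load(), attr)

    def __repr__(self):
        state = "loaded" if self._module is not None else "loading deferred"
        return "<LazyModule %r, %s>" % (self._name, state)


json = LazyModule("json")
print(repr(json))            # nothing has been imported through this object yet
print(json.dumps([1, 2]))    # the first attribute access triggers the import
print(repr(json))
```

Because the proxy is an ordinary class, subclassing it to add behaviour
(a package class, say) works in the obvious way.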

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
--------------6726D2007785ACB228CD285A
Content-Type: text/python; charset=us-ascii;
 name="ClassModules.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="ClassModules.py"

#!/usr/local/bin/python

""" Module-Class-Importer for Python (Version 0.6)

    Modules in Python behave almost like classes,
    but do not provide the same mechanisms, like inheritance,
    baseclasses, special methods, etc.

    This module provides an alternative module loader that
    is built on top of the ihooks.py interface for the
    builtin import statement. It works much the way the
    normal import does, but provides some extra features:

    * when a module is requested, an instance of the
      Module-class (or some subclass of Module) is created
      and the actions 'find' and 'load' are redirected to
      this instance via method calls
    * after loading, a call to install_module copies all the
      attributes from the "real" module object to the Module
      instance (which costs some memory, but increases lookup speed),
      thereby making it behave just like the original
    * a reference to the original module object is kept,
      so that 'from...import...' also works (since this statement
      needs a real module object)
    * whenever a module is referenced, the Module object is
      returned, if possible, so even after having done 'from x import y'
      at some point, 'import x' will return the Module object,
      so hopefully all references to a module made by a Python
      program should return the Module object, with all its
      nice advantages (like catching AttributeErrors)
    * Module provides a basic skeleton -- you can subclass
      it and then give the ModuleClassImporter your class to use
      (LazyModule is an example for this), if you don't like
      some things, like copying attributes (e.g. use __getattr__
      to redirect the lookup)

    This module contains all necessary base classes (working ones,
    not simply a bare framework), some Loaders, and
    of course, the LazyModule which started this whole thing in
    the first place.

    For more information on how importing works, see ihooks.py
    and ni.py.

    ----------------------------------------------------------------

    Example of usage: Lazy Import for Python (see LazyImp.py)

    ---------------------------------------------------------------

    History:
    - 0.6: fixed for Python 1.5

    Bugs: 
    - none, only unsupported features :-)
    - I have tested it with Tkinter and a 10.000 line framework,
      but of course... there may still be some imports out there,
      I haven't taken into account yet.

    (c) Marc-Andre Lemburg; all rights reserved

"""

__version__ = '0.6'

import sys,ihooks,imp,os

# so that it also works under Python 1.4
try:
    __debug__
except:
    __debug__ = 0

#
# A fast ModuleLoader
#

class FastModuleLoader(ihooks.ModuleLoader):

    """ works like ModuleLoader, but uses imp's find_module, which makes
        it somewhat faster

        * note: file system hooks won't work here !!!
    """
    def find_module(self,name,path=None): 
	m = self.find_builtin_module(name)
	if m: return m
	if path is None: path = sys.path
	return imp.find_module(name,path)

#
# A preprocessing loader
#
# (parts taken from py_compile.py)

import marshal

def clong(x):
    """ return the 4-byte long x as 4-byte string """
    return chr(x&0xff)+chr((x>>8)&0xff)+chr((x>>16)&0xff)+chr((x>>24)&0xff)

class PreProcessingLoader(FastModuleLoader):

    """ do some preprocessing when importing a module, that
        has to be compiled first, i.e. is read in as source file
	* leaves the rest to FastModuleLoader 
    """

    def load_module(self, name, stuff):

	""" load the module name using stuff """

	file, filename, (suff, mode, type) = stuff
	# check if there already is a properly compiled version
	pass
	# if we have to handle a source file...
	if type == imp.PY_SOURCE:
	    # read file
	    program = file.read()
	    # process program
	    program = self.preprocess(program)
	    # compile and try to write the .pyc-file (copied from py_compile.py)
	    code = compile(program, filename, 'exec')
	    codefilename = filename + (__debug__ and 'c' or 'o')
	    try:
		fc = open(codefilename,'wb')
		fc.write(imp.get_magic())
		timestamp = long(os.stat(filename)[8])
		fc.write(clong(timestamp))
		marshal.dump(code,fc)
		fc.close()
		if os.name == 'mac':
		    import macfs
		    macfs.FSSpec(codefilename).SetCreatorType('Pyth', 'PYC ')
		    macfs.FSSpec(filename).SetCreatorType('Pyth', 'TEXT')
	    except IOError:
		pass
	else:
	    return FastModuleLoader.load_module(self, name, stuff)
	# register and initialize module
	m = self.hooks.add_module(name)
	m.__file__ = filename
	exec code in m.__dict__
	return m

    def preprocess(self,program):

	""" do something with the code in program and return
	    the modified string
	"""
	program = "The_PreProcessingLoader_was_here = ':-)'\n" + program
	return program

#
# The Module base class
#

class InternalVars: # container class
    pass

class Module:

    """ The module-works-as-a-class base class

        * this class is instantiated for every new module loaded
	  by the SimulateImport mechanism
	* you can subclass the class to add functionality and
	  pass the subclass to SimulateImport for it to be used
	* important: local variables should always reside in
	  self.__moduleobj__, not in self directly (to avoid name
	  clashes)
	* note: module initialization is done in the usual way, the
	  modules namespaces then copied to this object
	* this class emulates the normal import-operation  
    """
    def __init__(self,name,loader,fromlist=None):

	""" a module name is requested 

	    * this method should NOT be overridden, instead override
	      startup() which is called, when this method finishes
	"""
	self.__moduleobj__ = m = InternalVars()
	self.__name__ = name
	m.loader = loader
	m.fromlist = fromlist
	m.found = 0
	m.loaded = 0
	m.modules = loader.modules_dict()
	m.self = self
	m.module = None # gets filled by load_module()
	self.startup()

    def startup(self):

	""" module startup

	    * called when a module is requested
	"""
	self.find_module()
	self.load_module()

    def real_module(self):

	""" return a real module object """

	return self.__moduleobj__.module

    def register(self):

	""" makes an entry in modules pointing to this object 
	    * loading a module through the loader normally also
	      registers the module, so a call to this method is
	      not needed
	    * note: if you want to do 'from..import..' with
	      this module later on, the registering MUST be done
	      by loader
	"""
	self.__moduleobj__.modules[self.__name__] = self

    def find_module(self):

	""" find the module """

	m = self.__moduleobj__
	m.stuff = m.loader.find_module(self.__name__)
	if not m.stuff: 
	    raise ImportError, 'Module: No module named %s'%self.__name__
	m.found = 1

    def load_module(self):

	""" load the module and initialize it

	    * the module must already be found
	    * uses __moduleobj__.loader for loading
	    * calls .install_module to complete the job
	"""

	m = self.__moduleobj__
	if m.loaded: return
	if not m.found:
	    raise ImportError, 'Module: call %s.find_module() first'%self.__name__
	else:
	    module = m.loader.load_module(self.__name__,m.stuff)
	self.install_module(module)
	m.loaded = 1

    def install_module(self,module):

	""" install the module in this objects namespace 

	    * must be called after a module is loaded
	"""

	# keep a reference to the original
	self.__moduleobj__.module = module
	# copy all module attributes to this object
	for k,v in module.__dict__.items():
	    setattr(self,k,v)
	# create a reference in the real module object
	setattr(module,'__moduleobj__',self.__moduleobj__)

    def __repr__(self):

	""" return some meaningful string describing self """

	if self.__moduleobj__.loaded: 
	    return "<%s '%s'>"%(self.__class__.__name__,self.__name__)
	elif self.__moduleobj__.found:
	    return "<%s '%s', loading deferred>"%(self.__class__.__name__,self.__name__)
	else:
	    return "<%s '%s', finding deferred>"%(self.__class__.__name__,self.__name__)

    __str__ = __repr__

    def __getattr__(self,x):

	""" some unknown attribute is being requested """

	raise AttributeError,'%s "%s" was looking for "%s"'%(self.__class__.__name__,self.__name__,x)

#
# The module-as-class importer
#

# ModuleImporter to be used:
ImporterBaseClass = ihooks.ModuleImporter

class ModuleClassImporter(ImporterBaseClass):

    """ Module importer, that knows how to handle Module-objects correctly
    """

    def __init__(self,module_class,*importer_class_init):

	""" import modules by encapsulating them in an instance of
	    module_class
	    * module_class must be a subclass of Module
	    * the other parameters are passed to ImporterBaseClass
	      (see ihooks.py for details)
	"""    
	apply(ImporterBaseClass.__init__,(self,)+importer_class_init)
	self.module_class = module_class

    def import_module(self, name, globals={}, locals={}, fromlist=None):

	""" module import hook
        """
	if self.modules.has_key(name): # fast path
            m = self.modules[name]
	    # return the object, if possible
	    if fromlist is None:
		#print 'Importer: import',name,'(found in sys.modules)',m
		try:
		    return m.__moduleobj__.self
		except:
		    return m
	    else:
		# from..import.. insists on having the real thing !
		#print 'Importer: from',name,'import',fromlist,'(found in sys.modules)',m
		try:
		    return m.__moduleobj__.module
		except:
		    return m
	else:
	    if fromlist is None:
		# normal 'import modulename'
		#print 'Importer: import "%s" with %s'%(name,self.module_class.__name__)
		module = apply(self.module_class,(name,self.loader,fromlist))
	    else:
		# emulate 'from modulename import something'
		# (note: this is a hack... and not a nice one !)
		#print 'Importer: from',name,'import',fromlist,'with',self.module_class.__name__
		module = apply(self.module_class,(name,self.loader,fromlist))
		# module has to be loaded for this to work
		module.load_module()
		module = module.real_module()
	    #print 'Importer: %s returned %s'%(self.module_class.__name__,module)
	    return module

--------------6726D2007785ACB228CD285A
Content-Type: text/python; charset=us-ascii;
 name="LazyImp.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="LazyImp.py"

#!/usr/local/bin/python

"""
    Lazy Import for Python (Version 0.6)

    Loads modules only if they are needed and referenced.
    This is done by overloading the builtin 'import'
    statement, so no code changes are necessary.
    Everything should work as normal, except that the
    actual loading process is deferred until a module's
    attribute is requested (you have to keep in mind,
    that this can cause exceptions from the module 
    initialization process -- use lazyimp after debugging !)

    * depends on the module ClassModules.py

    *** Importing this module autoinstalls the Lazy Import Feature.
    *** All subsequent imports will be done lazily.
    *** If you don't like this, comment out the last line !

    For more information, see the LazyModule-doc string below.

    ---------------------------------------------------------------

    History:
    - 0.6: fixed for Python 1.5

    Bugs: 
    - none, only unsupported features :-)
    - I have tested it with Tkinter and a 10.000 line framework,
      but of course... there may still be some imports out there,
      I haven't taken into account yet.

    (c) Marc-Andre Lemburg; all rights reserved

"""

__version__ = '0.6'

import sys,ihooks,imp,os
from ClassModules import *

# base class to be used:
LazyModuleBaseClass = Module

class LazyModule(LazyModuleBaseClass):

    """ Lazy Import for Python

    Loads modules only if they are needed and referenced.
    This is done by overloading the builtin 'import'
    statement, so no code changes are necessary.
    Everything should work as normal, except that the
    actual loading process is deferred until a module's
    attribute is requested (you have to keep in mind,
    that this can cause exceptions from the module 
    initialization process -- use lazyimp after debugging !)

    Hints:
    - you can call the method load_module() of a lazy module
      to force loading of the module (or simply reference
      some attribute)

    Caveats:
    - attributes like __dict__ and __name__, that are provided
      by the LazyImport-class, do not cause loading
    - due to a Python internal limitation, from ... import ...
      is not handled in a lazy fashion (wouldn't be too efficient anyway)
    - debugging circular imports can become an even harder task
      (uncomment the #print-statements to see what's going on)
    """  

    # finding the module is normally done when the object is created
    # -- setting this to 1 defers finding too
    __defer_find = 0

    def startup(self):

	""" lazy import module
	"""
	self.__moduleobj__.defer_find = self.__defer_find
	if not self.__moduleobj__.defer_find: 
	    self.find_module()
	self.register()

    def load_module(self,cause='*'):

	""" do the actual import

	    * this can cause ImportErrors and raise exceptions, that
	      must be handled by the caller, i.e. the first reference
	      to a module might raise an exception !
	    * modules are only loaded once; any subsequent calls to this method
	      are silently ignored (i.e. ImportErrors are only raised
	      the first time, this method is used)
	"""
	if self.__moduleobj__.loaded: return
	#print 'LazyModule: loading module "%s", looking for "%s" ...'%(self.__name__,cause)
	if self.__moduleobj__.defer_find: 
	    # find now
	    self.find_module()
	# let the base class handle the rest
	LazyModuleBaseClass.load_module(self)

	#print 'LazyModule: module "%s" loaded'%self.__name__

    def __getattr__(self,x):

	""" the module's needed, so load it and return the
	    requested attribute afterwards
        """
	#print self.__name__,'is looking for',x
	if not self.__moduleobj__.loaded:
	    self.load_module(x)
	    return getattr(self,x)
	else:
	    raise AttributeError,'%s "%s" was looking for "%s"'%(self.__class__.__name__,self.__name__,x)

def autoinstall():

    """ install the Lazy Module Import feature """

    mloader = FastModuleLoader()
    mhandler = LazyModule
    newimport = ModuleClassImporter(mhandler,mloader)
    newimport.install()

#
# auto-install as new 'import' (comment out, if you don't like this)
#
autoinstall()

--------------6726D2007785ACB228CD285A--

From mal@lemburg.com  Fri Feb  4 20:10:54 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 04 Feb 2000 21:10:54 +0100
Subject: [Import-sig] deprecate ihooks?
References: <025101bf6f3d$ea552500$f4a7b5d4@hagrid>
Message-ID: <389B324E.E779DB0F@lemburg.com>

This is a multi-part message in MIME format.
--------------6726D2007785ACB228CD285A
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Just to throw some old 2 cents, I've attached some code I wrote
way back in 1997 on top of ihooks.py. It turns modules into
real classes with all the goodies of __getattr__ et al.
at no extra cost.

Perhaps this mechanism offers some new insights: by delegating
work to the objects in question (the modules) rather than
hooking together some meta objects... note that you can do
subclassing to add functionality to modules using this approach,
e.g. packages could be subclasses of a general package class, etc.

Anyway, just a thought you might want to consider... I'm too busy
right now to jump into this discussion again ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
--------------6726D2007785ACB228CD285A
Content-Type: text/python; charset=us-ascii;
 name="ClassModules.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="ClassModules.py"

#!/usr/local/bin/python

""" Module-Class-Importer for Python (Version 0.6)

    Modules in Python behave almost like classes,
    but do not provide the same mechanisms, like inheritance,
    baseclasses, special methods, etc.

    This module provides an alternative module loader that
    is built on top of the ihooks.py interface for the
    builtin import statement. It works much the way the
    normal import does, but provides some extra features:

    * when a module is requested, an instance of the
      Module-class (or some subclass of Module) is created
      and the actions 'find' and 'load' are redirected to
      this instance via method calls
    * after loading, a call to install_module copies all the
      attributes from the "real" module object to the Module
      instance (which costs some memory, but increases lookup speed),
      thereby making it behave just like the original
    * a reference to the original module object is kept,
      so that 'from...import...' also works (since this statement
      needs a real module object)
    * whenever a module is referenced, the Module object is
      returned, if possible, so even after having done 'from x import y'
      at some point, 'import x' will return the Module object,
      so hopefully all references to a module made by a Python
      program should return the Module object, with all its
      nice advantages (like catching AttributeErrors)
    * Module provides a basic skeleton -- you can subclass
      it and then give the ModuleClassImporter your class to use
      (LazyModule is an example for this), if you don't like
      some things, like copying attributes (e.g. use __getattr__
      to redirect the lookup)

    This module contains all necessary base classes (working ones,
    not simply a bare framework), some Loaders, and
    of course, the LazyModule which started this whole thing in
    the first place.

    For more information on how importing works, see ihooks.py
    and ni.py.

    ----------------------------------------------------------------

    Example of usage: Lazy Import for Python (see LazyImp.py)

    ---------------------------------------------------------------

    History:
    - 0.6: fixed for Python 1.5

    Bugs: 
    - none, only unsupported features :-)
    - I have tested it with Tkinter and a 10.000 line framework,
      but of course... there may still be some imports out there,
      I haven't taken into account yet.

    (c) Marc-Andre Lemburg; all rights reserved

"""

__version__ = '0.6'

import sys,ihooks,imp,os

# so that it also works under Python 1.4
try:
    __debug__
except:
    __debug__ = 0

#
# A fast ModuleLoader
#

class FastModuleLoader(ihooks.ModuleLoader):

    """ works like ModuleLoader, but uses imp's find_module, which makes
        it somewhat faster

        * note: file system hooks won't work here !!!
    """
    def find_module(self,name,path=None): 
	m = self.find_builtin_module(name)
	if m: return m
	if path is None: path = sys.path
	return imp.find_module(name,path)

#
# A preprocessing loader
#
# (parts taken from py_compile.py)

import marshal

def clong(x):
    """ return the 4-byte long x as 4-byte string """
    return chr(x&0xff)+chr((x>>8)&0xff)+chr((x>>16)&0xff)+chr((x>>24)&0xff)

class PreProcessingLoader(FastModuleLoader):

    """ do some preprocessing when importing a module, that
        has to be compiled first, i.e. is read in as source file
	* leaves the rest to FastModuleLoader 
    """

    def load_module(self, name, stuff):

	""" load the module name using stuff """

	file, filename, (suff, mode, type) = stuff
	# check if there already is a properly compiled version
	pass
	# if we have to handle a source file...
	if type == imp.PY_SOURCE:
	    # read file
	    program = file.read()
	    # process program
	    program = self.preprocess(program)
	    # compile and try to write the .pyc-file (copied from py_compile.py)
	    code = compile(program, filename, 'exec')
	    codefilename = filename + (__debug__ and 'c' or 'o')
	    try:
		fc = open(codefilename,'wb')
		fc.write(imp.get_magic())
		timestamp = long(os.stat(filename)[8])
		fc.write(clong(timestamp))
		marshal.dump(code,fc)
		fc.close()
		if os.name == 'mac':
		    import macfs
		    macfs.FSSpec(codefilename).SetCreatorType('Pyth', 'PYC ')
		    macfs.FSSpec(filename).SetCreatorType('Pyth', 'TEXT')
	    except IOError:
		pass
	else:
	    return FastModuleLoader.load_module(self, name, stuff)
	# register and initialize module
	m = self.hooks.add_module(name)
	m.__file__ = filename
	exec code in m.__dict__
	return m

    def preprocess(self,program):

	""" do something with the code in program and return
	    the modified string
	"""
	program = "The_PreProcessingLoader_was_here = ':-)'\n" + program
	return program

#
# The Module base class
#

class InternalVars: # container class
    pass

class Module:

    """ The module-works-as-a-class base class

        * this class is instantiated for every new module loaded
	  by the SimulateImport mechanism
	* you can subclass the class to add functionality and
	  pass the subclass to SimulateImport for it to be used
	* important: local variables should always reside in
	  self.__moduleobj__, not in self directly (to avoid name
	  clashes)
	* note: module initialization is done in the usual way, the
	  modules namespaces then copied to this object
	* this class emulates the normal import-operation  
    """
    def __init__(self,name,loader,fromlist=None):

	""" a module name is requested 

	    * this method should NOT be overridden, instead override
	      startup() which is called, when this method finishes
	"""
	self.__moduleobj__ = m = InternalVars()
	self.__name__ = name
	m.loader = loader
	m.fromlist = fromlist
	m.found = 0
	m.loaded = 0
	m.modules = loader.modules_dict()
	m.self = self
	m.module = None # gets filled by load_module()
	self.startup()

    def startup(self):

	""" module startup

	    * called when a module is requested
	"""
	self.find_module()
	self.load_module()

    def real_module(self):

	""" return a real module object """

	return self.__moduleobj__.module

    def register(self):

	""" makes an entry in modules pointing to this object 
	    * loading a module through the loader normally also
	      registers the module, so a call to this method is
	      not needed
	    * note: if you want to do 'from..import..' with
	      this module later on, the registering MUST be done
	      by loader
	"""
	self.__moduleobj__.modules[self.__name__] = self

    def find_module(self):

	""" find the module """

	m = self.__moduleobj__
	m.stuff = m.loader.find_module(self.__name__)
	if not m.stuff: 
	    raise ImportError, 'Module: No module named %s'%self.__name__
	m.found = 1

    def load_module(self):

	""" load the module and initialize it

	    * the module must already be found
	    * uses __moduleobj__.loader for loading
	    * calls .install_module to complete the job
	"""

	m = self.__moduleobj__
	if m.loaded: return
	if not m.found:
	    raise ImportError, 'Module: call %s.find_module() first'%self.__name__
	else:
	    module = m.loader.load_module(self.__name__,m.stuff)
	self.install_module(module)
	m.loaded = 1

    def install_module(self,module):

	""" install the module in this objects namespace 

	    * must be called after a module is loaded
	"""

	# keep a reference to the original
	self.__moduleobj__.module = module
	# copy all module attributes to this object
	for k,v in module.__dict__.items():
	    setattr(self,k,v)
	# create a reference in the real module object
	setattr(module,'__moduleobj__',self.__moduleobj__)

    def __repr__(self):

	""" return some meaningful string describing self """

	if self.__moduleobj__.loaded: 
	    return "<%s '%s'>"%(self.__class__.__name__,self.__name__)
	elif self.__moduleobj__.found:
	    return "<%s '%s', loading deferred>"%(self.__class__.__name__,self.__name__)
	else:
	    return "<%s '%s', finding deferred>"%(self.__class__.__name__,self.__name__)

    __str__ = __repr__

    def __getattr__(self,x):

	""" some unknown attribute is being requested """

	raise AttributeError,'%s "%s" was looking for "%s"'%(self.__class__.__name__,self.__name__,x)

#
# The module-as-class importer
#

# ModuleImporter to be used:
ImporterBaseClass = ihooks.ModuleImporter

class ModuleClassImporter(ImporterBaseClass):

    """ Module importer, that knows how to handle Module-objects correctly
    """

    def __init__(self,module_class,*importer_class_init):

	""" import modules by encapsulating them in an instance of
	    module_class
	    * module_class must be a subclass of Module
	    * the other parameters are passed to the ImportClass
	      (see ihooks.py for details)
	"""    
	apply(ImporterBaseClass.__init__,(self,)+importer_class_init)
	self.module_class = module_class

    def import_module(self, name, globals={}, locals={}, fromlist=None):

	""" module import hook
        """
	if self.modules.has_key(name): # fast path
            m = self.modules[name]
	    # return the object, if possible
	    if fromlist is None:
		#print 'Importer: import',name,'(found in sys.modules)',m
		try:
		    return m.__moduleobj__.self
		except:
		    return m
	    else:
		# from..import.. insists on having the real thing !
		#print 'Importer: from',name,'import',fromlist,'(found in sys.modules)',m
		try:
		    return m.__moduleobj__.module
		except:
		    return m
	else:
	    if fromlist is None:
		# normal 'import modulename'
		#print 'Importer: import "%s" with %s'%(name,self.module_class.__name__)
		module = apply(self.module_class,(name,self.loader,fromlist))
	    else:
		# emulate 'from modulename import something'
		# (note: this a hack... and not a nice one !)
		#print 'Importer: from',name,'import',fromlist,'with',self.module_class.__name__
		module = apply(self.module_class,(name,self.loader,fromlist))
		# module has to be loaded for this to work
		module.load_module()
		module = module.real_module()
	    #print 'Importer: %s returned %s'%(self.module_class.__name__,module)
	    return module

--------------6726D2007785ACB228CD285A
Content-Type: text/python; charset=us-ascii;
 name="LazyImp.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="LazyImp.py"

#!/usr/local/bin/python

"""
    Lazy Import for Python (Version 0.6)

    Loads modules only if they are needed and referenced.
    This is done by overloading the builtin 'import'
    statement, so no code changes are necessary.
    Everything should work as normal, except that the
    actual loading process is deferred until a module's
    attribute is requested (you have to keep in mind,
    that this can cause exceptions from the module 
    initialization process -- use lazyimp after debugging !)

    * depends on the module ClassModules.py

    *** Importing this module autoinstalls the Lazy Import Feature.
    *** All subsequent imports will be done lazily.
    *** If you don't like this, comment out the last line !

    For more information, see the LazyModule-doc string below.

    ---------------------------------------------------------------

    History:
    - 0.6: fixed for Python 1.5

    Bugs: 
    - none, only unsupported features :-)
    - I have tested it with Tkinter and a 10.000 line framework,
      but of course... there may still be some imports out there,
      I haven't taken into account yet.

    (c) Marc-Andre Lemburg; all rights reserved

"""

__version__ = '0.6'

import sys,ihooks,imp,os
from ClassModules import *

# base class to be used:
LazyModuleBaseClass = Module

class LazyModule(LazyModuleBaseClass):

    """ Lazy Import for Python

    Loads modules only if they are needed and referenced.
    This is done by overloading the builtin 'import'
    statement, so no code changes are necessary.
    Everything should work as normal, except that the
    actual loading process is deferred until a module's
    attribute is requested (you have to keep in mind,
    that this can cause exceptions from the module 
    initialization process -- use lazyimp after debugging !)

    Hints:
    - you can call the method load_module() of a lazy module
      to force loading of the module (or simply reference
      some attribute)

    Caveats:
    - attributes like __dict__ and __name__, that are provided
      by the LazyImport-class, do not cause loading
    - due to a Python internal limitation, from ... import ...
      is not handled in a lazy fashion (wouldn't be too efficient anyway)
    - debugging circular imports can become an even harder task
      (uncomment the #print-statements to see what's going on)
    """  

    # finding the module is normally done when the object is created
    # -- setting this to 1 defers finding too
    __defer_find = 0

    def startup(self):

	""" lazy import module
	"""
	self.__moduleobj__.defer_find = self.__defer_find
	if not self.__moduleobj__.defer_find: 
	    self.find_module()
	self.register()

    def load_module(self,cause='*'):

	""" do the actual import

	    * this can cause ImportErrors and raise exceptions, that
	      must be handled by the caller, i.e. the first reference
	      to a module might raise an exception !
	    * modules are only loaded once; any subsequent calls to this method
	      are silently ignored (i.e. ImportErrors are only raised
	      the first time, this method is used)
	"""
	if self.__moduleobj__.loaded: return
	#print 'LazyModule: loading module "%s", looking for "%s" ...'%(self.__name__,cause)
	if self.__moduleobj__.defer_find: 
	    # find now
	    self.find_module()
	# let the base class handle the rest
	LazyModuleBaseClass.load_module(self)

	#print 'LazyModule: module "%s" loaded'%self.__name__

    def __getattr__(self,x):

	""" the module's needed, so load it and return the
	    requested attribute afterwards
        """
	#print self.__name__,'is looking for',x
	if not self.__moduleobj__.loaded:
	    self.load_module(x)
	    return getattr(self,x)
	else:
	    raise AttributeError,'%s "%s" was looking for "%s"'%(self.__class__.__name__,self.__name__,x)

def autoinstall():

    """ install the Lazy Module Import feature """

    mloader = FastModuleLoader()
    mhandler = LazyModule
    newimport = ModuleClassImporter(mhandler,mloader)
    newimport.install()

#
# auto-install as new 'import' (comment out, if you don't like this)
#
autoinstall()

--------------6726D2007785ACB228CD285A--

From gstein@lyra.org  Sat Feb  5 12:03:50 2000
From: gstein@lyra.org (Greg Stein)
Date: Sat, 5 Feb 2000 04:03:50 -0800 (PST)
Subject: [Import-sig] find/load? (was: deprecate ihooks?)
In-Reply-To: <025101bf6f3d$ea552500$f4a7b5d4@hagrid>
Message-ID: <Pine.LNX.4.10.10002050402031.8462-100000@nebula.lyra.org>

On Fri, 4 Feb 2000, Fredrik Lundh wrote:
>...
> we've also used ihooks in a number of places, with great
> success.  on the other hand, changing to imputil was hardly
> any work at all...
> 
> so I guess The Question is whether the find/load separation
> is really necessary.  I cannot think of a reason, but that's
> probably just me...

I've argued in the past that find/load is an inappropriate model for a
flexible import mechanism. I won't repeat that here, but if people would
like to see some reference material then I'll see if I can dig up those
threads.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From gstein@lyra.org  Sat Feb  5 12:06:36 2000
From: gstein@lyra.org (Greg Stein)
Date: Sat, 5 Feb 2000 04:06:36 -0800 (PST)
Subject: [Import-sig] versions? (was: Kick-off)
In-Reply-To: <025601bf6f40$4aff5220$f4a7b5d4@hagrid>
Message-ID: <Pine.LNX.4.10.10002050404230.8462-100000@nebula.lyra.org>

How about we just start blazing a path. If we get it done before 1.6.0,
then we'll be happy. I don't see a particular reason to partition the
releases *before* we even know where we're going, what we'll build, and
how long it may take to complete that. In other words, let's ignore the
versions -- that's putting the cart before the horse.

I read Gordon's opening note as a way to focus attention. We start with
the short-term, finish that, then move onto the long-term.

Cheers,
-g

On Fri, 4 Feb 2000, Fredrik Lundh wrote:

> Gordon wrote:
> > Short-term: Provide a "new architecture import hooks" module 
> > for the standard library. This would deprecate ihooks and 
> > friends, and provide developers with a way of learning the new 
> > architecture.
> 
> 1.6.1 <= version <= 1.7, right?
> 
> > Long-term: Reform the entire import architecture of Python. 
> > This affects Python start-up, the semantics of sys.path, and 
> > the C API to importing.
> 
> 1.7 <= version < 3000, right?
> 
> > The model for this is, of course, Greg's imputil.py (Greg, your 
> > latest version is not yet on your website which still has a 
> > November version).
> 
> I'd like to add an ultra-short-term issue: possible changes
> to 1.6.0 that make it easier to experiment with alternate
> import strategies, mostly for installation tools like gordon's
> install and pythonworks' deployment subsystem.
> 
> (as discussed at last week's consortium meeting)
> 
> most importantly, I'd like to come up with a way to execute
> small snippets of script code *before* Python attempts to
> import stuff like exceptions.py.
> 
> </F>

-- 
Greg Stein, http://www.lyra.org/

From gstein@lyra.org  Sat Feb  5 12:28:49 2000
From: gstein@lyra.org (Greg Stein)
Date: Sat, 5 Feb 2000 04:28:49 -0800 (PST)
Subject: [Import-sig] RE: pythonpath and COM support
In-Reply-To: <ECEPKNMJLHAPFFJHDOJBOEDACEAA.mhammond@skippinet.com.au>
Message-ID: <Pine.LNX.4.10.10002050418510.8462-100000@nebula.lyra.org>

On Sat, 5 Feb 2000, Mark Hammond wrote:
>...
> > Couldn't pythoncom15.dll etc. live in /DLLs? Or are they
> > loaded by other C extensions (using LoadLibrary instead of
> > PyImport_x)?
> 
> pythoncom15.dll and pywintypes15.dll (in fact, anything I release with the
> extension ".dll") is used as a "standard DLL".  There exists the possibility
> that a .exe will have an implicit reference to one of these .DLLs.  So
> unless they are in the same path as the executables, or on the %PATH%, they
> will not be found.

To be quite explicit about what Mark means here:

  If we have DLL foo.dll and an application or *another* DLL links against
  foo.dll, then we must place foo.dll on %PATH%. Current "design
  preference" in Windows is to avoid changing PATH, so the DLLs go into
  the System or System32 directory.

What DLLs go in there, and why?

  python15.dll: any Python extension module is going to link against this.
                Thus, when the extension is loaded (from wherever!), this
                module must be found.

  pywintypes15.dll: most of the Python/Win32 DLLs link against this for
                    some basic types.

  pythoncom15.dll: Python/COM extensions will link against this for
                   various pieces of functionality (base classes and
                   support functions)

> > > * The path searching code would need to use the location of
> > Python1x.dll,
> > > rather than the .exe, to locate the PYTHONHOME.  This would not
> > be a huge
> > > change, but necessary none-the-less.
> >
> > Um, why? Especially if they're the same directory ;-)?
> >
> > Oh, because of COM (and exposing python15.dll as a COM
> > server)?
> 
> Exactly - when a Python COM object is being used, the .exe may well be
> something created by VB, and nowhere near the Python directory.  Thus, the
> full path to the .exe will be useless, but the full path to Python1x.dll
> will be OK.

The path to the python15.dll won't help us -- it is sitting in System32.

There might be some weird voodoo in the COM stuff that can alter the load
path (AppPath?) for a COM server DLL, but I've never looked into it. I
seem to recall that it exists and allows the NT Loader to look in
specified directories for the dependent DLLs. For example, a COM server
registration says "use MyComExtension.dll for the COM server" and it also
says "use C:\Program Files\Python\DLLs" as an additional path. If it
exists, that would be nice.

I don't see that we can avoid putting those DLLs into the System directory
without some special provision for modifying the NT Loader's paths.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From mhammond@skippinet.com.au  Sat Feb  5 23:10:46 2000
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Sun, 6 Feb 2000 10:10:46 +1100
Subject: [Import-sig] RE: pythonpath and COM support
In-Reply-To: <Pine.LNX.4.10.10002050418510.8462-100000@nebula.lyra.org>
Message-ID: <ECEPKNMJLHAPFFJHDOJBOEECCEAA.mhammond@skippinet.com.au>

[Greg writes]

> > Exactly - when a Python COM object is being used, the .exe may well be
> > something created by VB, and nowhere near the Python
> directory.  Thus, the
> > full path to the .exe will be useless, but the full path to Python1x.dll
> > will be OK.
>
> The path to the python15.dll won't help us -- it is sitting in System32.

Except I thought we were discussing how to get python15.dll _out_ of
System32?

> There might be some weird voodoo in the COM stuff that can alter the load
> path (AppPath?) for a COM server DLL, but I've never looked into it. I
> seem to recall that it exists and allows the NT Loader to look in
> specified directories for the dependent DLLs. For example, a COM server
> registration says "use MyComExtension.dll for the COM server" and it also
> says "use C:\Program Files\Python\DLLs" as an additional path. If it
> exists, that would be nice.

Damn - yes - I believe you are correct. [Although my recollection is that
the AppPath exists for executables rather than COM objects - eg, we could
set up an AppPath for "Python.exe", but not for arbitrary COM object names]

Let's say "MyVBApp.exe" uses a PythonCOM object.  This PythonCOM object can
happily point to a full path - eg "C:\Program Files\Python\Pythoncom15.dll".

Now, Pythoncom15.dll obviously links against Python15.dll.  However, the
search path rules dictate that the "path of the .exe" is used to locate the
extra DLLs.  So even though Python15.dll is in the same directory as
Pythoncom15.dll, Windows will not search that directory for Python15.dll -
it will search the path that "MyVBApp.exe" lives in, but not the paths that
other DLLs were found in.

*sigh* - I should have remembered that without Greg's prodding :-(  OTOH, I
really should test this out - maybe Windows is a little smarter than the
rules say it is.

Mark.

From guido@python.org  Sun Feb 13 17:05:52 2000
From: guido@python.org (Guido van Rossum)
Date: Sun, 13 Feb 2000 12:05:52 -0500
Subject: [Import-sig] Long-awaited imputil comments
Message-ID: <200002131705.MAA20578@eric.cnri.reston.va.us>

Hi Greg,

I've finally got to a full close-reading of imputil.py. There sure is
a lot of good stuff there.  At the same time, I think it's far from
ready for prime time in the distribution.  Here are some detailed
comments.  (I'd also like to have a high level discussion about its
fate, but I'll let you respond to this list first.)

class ImportManager:

    General comment:

        I would like to use/subclass the ImportManager in rexec (in
        order to emulate its behavior), but for that to work, I would
        need to change all references to sys.path and sys.modules (and
        sys.whatever) to self.whatever, but that would currently
        require rewriting large pieces of code. It would be nice if
        the sys module were somehow passed in.  I have a feeling the
        same is true for all references to the os module (_os_stat
        etc.) because the rexec module also wants to have control over
        what pieces of the filesystem you have access to.  This
        explains some of the complexity of ihooks (which is currently
	used by rexec).

    def install():

        The __chain_* code seems in transition (e.g. some functions
        end with both raise and return)

        The hook mechanism for 1.6 hasn't been designed yet; what
        should it be?

    def add_suffix():

        It seems the suffixes variable is only used by the
        _FilesystemImporter. Since it is shared, calls to add_suffix()
        will have an effect on the _FilesystemImporter instance. I
        think it would make more sense if the suffixes table was
        initialized and managed by the _FilesystemImporter; the
        add_suffix method on the ImportManager could then simply pass
        its arguments on to the _FilesystemImporter.

    def _import_hook():

        I think we need a hook here so that Marc-Andre can implement
        walk-me-up-Scotty; or Tim Peters could implement a policy that
        requires all imports to use the full module name, even from
        within the same package.

        top_module = sys.modules[parts[0]]:
            There's an undocumented convention that sys.modules[x] may
            be None, to indicate that module x doesn't exist and that
            we shouldn't try to look for it any further. This is used
            by package import to avoid excessive filesystem access in
            the case where modules in a package import top-level
            modules. E.g. we're in package P, module M.  Now we see
            "import string". The "local imports override global
            imports" rule requires that we first look for P.string,
            which of course doesn't exist, before we look for string
            in sys.path.  The first time we look for P.string, we
            actually do a bunch of stats: for P/string.py,
            P/string.pyc, P/string.pyd, P/string.dll.  When other
            submodules of package P also import string, they would
            each incur all these stat() calls, unless we somehow
            remembered that there's no P.string. This is taken care of
            by setting sys.modules['P.string'] = None.

            Anyway, I think that your logic here doesn't expect that
            to happen.  A fix could be to put "if not top_module:
            raise KeyError" inside the try/except.
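The negative-cache convention described above can be sketched roughly as follows. This is written in modern Python 3 rather than the 1.5-era syntax used elsewhere in this thread, and it is not imputil code: `modules` stands in for sys.modules, and `probe_filesystem`, `find_relative` and `stat_calls` are hypothetical names made up for the illustration. The filesystem is only simulated, so nothing is ever found.

```python
# Hypothetical stand-ins: `modules` plays the role of sys.modules and
# `stat_calls` records each simulated stat() so the saving is visible.
SUFFIXES = (".py", ".pyc", ".pyd", ".dll")
modules = {}
stat_calls = []


def probe_filesystem(package, name):
    """Pretend to stat() each candidate file; nothing is ever found."""
    for suffix in SUFFIXES:
        stat_calls.append(package + "/" + name + suffix)
    return None


def find_relative(package, name):
    """Look for package.name, remembering a miss as a None entry."""
    fqname = package + "." + name
    if fqname in modules:
        # Either a real module or None (a remembered miss); in both
        # cases the filesystem is not touched again.
        return modules[fqname]
    module = probe_filesystem(package, name)
    modules[fqname] = module  # None records "does not exist"
    return module
```

Calling `find_relative("P", "string")` from several submodules of P costs four simulated stats the first time and none afterwards. The sketch covers only the caching side; a caller that reads the cache directly still has to treat a None entry like a missing key, which is the guard suggested above.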

    def _determine_import_context():

        Possible feature: the package could set a flag here to force
        all imports to go to the top-level (i.e., the flag would mean
        "no relative imports").

    def _import_top_module():

        Instead of type(item)..., use isinstance().

        Looking forward to _FilesystemImporter: I want to be able to
        have the *name* of a zip file (or other archive, whatever is
        supported) in sys.path; that seems more convenient for the
	user and simplifies $PYTHONPATH processing.

    def _reload_hook():

        Note that reload() does NOT blast the module's dict; for
	better or for worse.  (Occasionally modules know this and save
	important global data.)

class Importer:

    def install():

        This should be a method on the manager.  (That makes it easier
	to change references to sys.path etc.; see my rexec notes above.)

    def import_top():

        This appears to be a hook that a base class can override; but why?

    "PRIVATE METHODS":

        These aren't really private to the class; some are used from
	the ImportManager class.

        I note that none of these use any state of the class (it
        doesn't *have* any state) and they are essentially an
        implementation of the (cumbersome) package import policy.
        I wonder if the whole package import policy shouldn't be
        implemented in the ImportManager class -- or even in a
	separate policy class.

    def get_code():

        On the 2/3-tuple return value: a proposal for code to be
        included in 1.6 shouldn't be encumbered by backwards
        compatibility issues to previous versions of the proposal.

        I'm still worried about get_code() returning the code object
        (or even a module, in the case of extensions).  This means
        that something like freeze might have to re-implement the
        whole module searching -- first, it might want the source code
        instead of the code object, and second, it might not want the
        extension to be loaded at all!

        I've noticed that all implementations of get_code() start with
        a test whether parent is None or not, and branch to completely
        different code.  This suggests that we might have two separate
        methods???

"Some handy stuff for the Importers":

    This seems to be a remnant of an older imputil.py version; it
    appears to be unused by the current code or at least there is some
    code duplication; e.g. _c_suffixes is also calculated in
    ImportManager.

    def _compile():

        You will have to strip CRLF from the code string read from the
        file; this is for Unix where opening the file in text mode
        doesn't do that, but users have come to expect this because
        the current implementation explicitly strips them.

        I've recently been notified of a race condition in the code
        here, when two processes are writing a .pyc file and a third
        is reading it.  On Unix we will have to remove the .pyc file
        and then use os.open() with O_EXCL in order to avoid the race
        condition.  I don't know how to avoid it on Windows.
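A minimal sketch of the remove-then-exclusive-create step mentioned above, in modern Python 3. The function name `write_exclusive` and its return convention are assumptions made for the illustration, not part of imputil or the interpreter; only `os.unlink`, `os.open`, `os.fdopen` and the `O_EXCL` flag are real.

```python
import os


def write_exclusive(path, data):
    """Replace `path` with `data` using an exclusive create.

    Returns True if this process created the file, False if another
    writer created it first (hypothetical convention for this sketch).
    """
    try:
        os.unlink(path)  # remove any stale file first
    except FileNotFoundError:
        pass
    try:
        # O_EXCL makes the create fail if the file already exists,
        # so two writers cannot both open it for writing.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return True
```

This only illustrates how a second writer can be kept out; what a concurrent reader sees while the file is being written is a separate question that the sketch does not settle.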

    def _os_bootstrap():

        This is ugly and contains a dangerous repetition of code from
        os.py.  I know why you want it but for the "real" version we
        need a different approach here.

    def _fs_import():

        I think this is unused now?  Anyway, it hardcodes the suffix
        sequence.

        The test t_pyc >= t_py does not match the current
        implementation and should not affect the outcome.  (But it
        does save a stat() call in a common case...)

        The call to _compile() may raise SyntaxError (and also
        OverflowError and maybe a few others). I don't know what to do
        about this, but the traceback in that case will look really
        ugly!

class _FilesystemImporter:

    See comments above about suffix list management.

class SuffixImporter:

    Why is this a class at all?  It would seem to be sufficient to
    have a table of functions instead of a table of instances.  These
    importers have no state and only one method that is always
    overridden.
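The "table of functions" idea might look roughly like this in modern Python 3. The handler names, the `.txt` suffix and `get_code_for` are hypothetical and chosen only for the illustration; they are not taken from imputil.

```python
def load_py(source):
    """Hypothetical handler: compile Python source text to a code object."""
    return compile(source, "<sketch>", "exec")


def load_txt(source):
    """Hypothetical handler: expose a text file as a module with one name."""
    return compile("TEXT = %r" % source, "<sketch>", "exec")


# suffix -> plain function, instead of suffix -> importer instance
SUFFIX_HANDLERS = {".py": load_py, ".txt": load_txt}


def get_code_for(filename, source):
    """Dispatch on the filename suffix; None means no handler applies."""
    for suffix, handler in SUFFIX_HANDLERS.items():
        if filename.endswith(suffix):
            return handler(source)
    return None
```

Registering a new suffix is then a dictionary assignment rather than a subclass, which fits handlers that carry no state of their own.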

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gstein@lyra.org  Wed Feb 16 14:01:32 2000
From: gstein@lyra.org (Greg Stein)
Date: Wed, 16 Feb 2000 06:01:32 -0800 (PST)
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: <200002131705.MAA20578@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.10002160425560.17758-100000@nebula.lyra.org>

On Sun, 13 Feb 2000, Guido van Rossum wrote:
> I've finally got to a full close-reading of imputil.py. There sure is
> a lot of good stuff there.

Excellent! Thanx for taking the time and providing the great feedback.

> At the same time, I think it's far from
> ready for prime time in the distribution.

Understood. I think some of your concerns are based on its historical
cruftiness, rather than where it "should" be. I'll prep a new version with
the backwards compat pulled out. People that relied on the old version can
continue to use their older version (which is Public Domain, so they're
quite free to do so :-)

>...
> class ImportManager:
> 
>     General comment:
> 
>         I would like to use/subclass the ImportManager in rexec (in
>         order to emulate its behavior), but for that to work, I would
>         need to change all references to sys.path and sys.modules (and
>         sys.whatever) to self.whatever, but that would currently
>         require rewriting large pieces of code. It would be nice if
>         the sys module were somehow passed in.  I have a feeling the
>         same is true for all references to the os module (_os_stat
>         etc.) because the rexec module also wants to have control over
>         what pieces of the filesystem you have access to.  This
>         explains some of the complexity of ihooks (which is currently
> 	used by rexec).

All right. Shouldn't be a problem.

>     def install():
> 
>         The __chain_* code seems in transition (e.g. some functions
>         end with both raise and return)

The raise/return thingy is simply because I wanted to catch all cases
where it didn't import the module and was about to head towards the
default importer via the chain.

>         The hook mechanism for 1.6 hasn't been designed yet; what
>         should it be?

Part of this depends on the policy that you would like to prescribe. There
are two issues that I can think of:

1) are import hooks per-builtin-namespace, or per-interpreter? The former
   is the current model, and may be necessary for rexec types of
   functionality. The latter would shift the hook to some functions in the
   sys module, much like the profiler/trace hooks.

2) currently, Python has a single hook, with no provisions for multiple
   mechanisms to be used simultaneously. This was one of the primary
   advantages of imputil and its policies/chaining -- it would allow
   multiple import mechanisms. I believe that we want to continue the
   current policy: the core interpreter sees a single hook function.

   [ and Standard Operating Procedure is to install an ImportManager in
     there, or a subclass ]

For a while, I had thought about the "sys" approach, but just realized
that may be too limited for rexec types of environments (because we may
need a couple ImportManagers to be operating within an interpreter)

BUT: we may be able to design an RExecImportManager that remembers
restricted modules and uses different behavior when it determines the
import context is one of those modules. I haven't thought on this to
determine its true viability, tho...

>     def add_suffix():
> 
>         It seems the suffixes variable is only used by the
>         _FilesystemImporter. Since it is shared, calls to add_suffix()
>         will have an effect on the _FilesystemImporter instance. I
>         think it would make more sense if the suffixes table was
>         initialized and managed by the _FilesystemImporter; the
>         add_suffix method on the ImportManager could then simply pass
>         its arguments on to the _FilesystemImporter.

Agreed. ImportManager had the suffix list for a while because it needed
it. When the code was shifted to _FilesystemImporter, I didn't revisit the
placement of the suffixes. I think that I was also leaving it there with
an intent that it is part of the public interface, subject to more complex
manipulations than the simple add_suffix() would allow. I believe we can
solve this latter problem, though, just by adding a get_suffixes() method
that fetches the list from fs_imp.

>     def _import_hook():
> 
>         I think we need a hook here so that Marc-Andre can implement
>         walk-me-up-Scotty; or Tim Peters could implement a policy that
>         requires all imports to use the full module name, even from
>         within the same package.

I've been thinking of something along the lines of
_determine_import_context() returning a list of things to try. Default is
to return something like [current-context,] + sys.path (not exactly that,
but you get the idea). The _import_hook would then operate as a simple
scan over that list of places to attempt an import from. MAL could
override _determine_import_context() to return the walk-me-up, intervening
packages. Tim could just always return sys.path (and never bother trying
to determine the current context).
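The "list of things to try" idea above can be sketched as follows, in modern Python 3. All class and function names here are hypothetical, the empty string is an assumed marker for "top level / sys.path", and `known` is a plain set standing in for whatever the importers can actually find.

```python
class SearchPolicy:
    """Default: try the importing package first, then the top level."""

    def contexts(self, current_package):
        # "" stands for the top level (a plain sys.path search).
        if current_package:
            return [current_package, ""]
        return [""]


class WalkUpPolicy(SearchPolicy):
    """Try each enclosing package in turn, innermost first."""

    def contexts(self, current_package):
        parts = current_package.split(".") if current_package else []
        result = [".".join(parts[:i]) for i in range(len(parts), 0, -1)]
        return result + [""]


class AbsoluteOnlyPolicy(SearchPolicy):
    """Never look relative to the importing package."""

    def contexts(self, current_package):
        return [""]


def resolve(name, current_package, policy, known):
    """Scan the policy's contexts; return the first name found in `known`."""
    for context in policy.contexts(current_package):
        fqname = context + "." + name if context else name
        if fqname in known:
            return fqname
    return None
```

With `known = {"string", "A.util"}`, importing `util` from inside package `A.B.C` fails under the default policy but resolves to `A.util` under the walk-up policy, while the absolute-only policy only ever considers top-level names.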

>         top_module = sys.modules[parts[0]]:
>             There's an undocumented convention that sys.modules[x] may
>             be None, to indicate that module x doesn't exist and that
>...

Yes, I'm familiar with that mechanism, and it would be a good addition.

>...
>             Anyway, I think that your logic here doesn't expect that
>             to happen.  A fix could be to put "if not top_module:
>             raise KeyError" inside the try/except.

I've attempted to avoid it, but this recent rewrite may have introduced
things like this -- where ImportManager operates as if it is the *only* thing
performing imports. It's certainly friendly when it doesn't see __ispkg__
or __importer__, but this top_module thing is a valid bug. Quite fixable.

>     def _determine_import_context():
> 
>         Possible feature: the package could set a flag here to force
>         all imports to go to the top-level (i.e., the flag would mean
>         "no relative imports").

Ah. Neat optimization. I'll insert some comments / prototype code to do
this.

>     def _import_top_module():
> 
>         Instead of type(item)..., use isinstance().

Can do.

>         Looking forward to _FilesystemImporter: I want to be able to
>         have the *name* of a zip file (or other archive, whatever is
>         supported) in sys.path; that seems more convenient for the
> 	user and simplifies $PYTHONPATH processing.

Hmm. This will complicate things quite a bit, but is doable. It will also
increase the processing time for sys.path elements.

I'll think of a design and maybe prototype something up after the current 
round of changes.

>     def _reload_hook():
> 
>         Note that reload() does NOT blast the module's dict; for
> 	better or for worse.  (Occasionally modules know this and save
> 	important global data.)

All righty.

> class Importer:
> 
>     def install():
> 
>         This should be a method on the manager.  (That makes it easier
> 	to change references to sys.path etc.; see my rexec notes above.)

This is here for backwards compat. I'll remove it.

>     def import_top():
> 
>         This appears to be a hook that a base class can override; but why?

It is for use by clients of the Importer. In particular, by the
ImportManager. It is not intended to be overridden.

>     "PRIVATE METHODS":
> 
>         These aren't really private to the class; some are used from
> 	the ImportManager class.

Private to the system, then :-)

>         I note that none of these use any state of the class (it
>         doesn't *have* any state) and they are essentially an
>         implementation of the (cumbersome) package import policy.
>         I wonder if the whole package import policy shouldn't be
>         implemented in the ImportManager class -- or even in a
> 	separate policy class.

Nope. They do use a piece of state: self. Subclasses may add state and
they need to refer to that state via self. We store Importer instances
into modules as __importer__ and then use it later. The example Importers
(FuncImporter, PackageImporter, DirectoryImporter, and PathImporter) all
use instance variables. _FilesystemImporter definitely requires it.

Once we locate the Importer responsible for a package and its contained
modules, then we pass off control to that Importer. It is then responsible
for completing the import (including the notions of a Python package). The
completion of the import can be moved out of Importer, and we would
replace "self" by the Importer in question.

I attempted to minimize code in ImportManager because a typical execution
will only have a single instance of it -- there is no opportunity to
implement different policies and mechanisms. By shifting as much as
possible out to the Importer class, there is more opportunity for altering
behavior.

>     def get_code():
> 
>         On the 2/3-tuple return value: a proposal for code to be
>         included in 1.6 shouldn't be encumbered by backwards
>         compatibility issues to previous versions of the proposal.

No problem. Consider the 2-tuple form to be gone.

>         I'm still worried about get_code() returning the code object
>         (or even a module, in the case of extensions).  This means
>         that something like freeze might have to re-implement the
>         whole module searching -- first, it might want the source code
>         instead of the code object, and second, it might not want the
>         extension to be loaded at all!

You're asking for something that is impossible in practice. Consider the
following code fragment:

------------------------
import my_imp_tools
my_imp_tools.HTTPImporter("http://www.lyra.org/greg/python/")

import qp_xml
------------------------

There is no way that you can create a freeze tool that will be able to
manage this kind of scenario.

I've frozen software for three different, shipping products. In all cases,
I had to custom-build a freeze mechanism. Each time, there would be a list
of "root" scripts, a set of false-positives to ignore, a set of missed
modules to force inclusion, and a set of binary dependencies that couldn't
be determined otherwise. I think the desire for a fully-automated module
finder and freezer isn't fulfillable.

That said: I wouldn't be opposed to adding a get_source() method to an
Importer. If the source is available, then it can return it (it may not be
available in an archive!).

>         I've noticed that all implementations of get_code() start with
>         a test whether parent is None or not, and branch to completely
>         different code.  This suggests that we might have two separate
>         methods???

Interesting point. I hadn't noticed that. They don't always branch to
different points, however: consider DirectoryImporter and FuncImporter.

Essentially, we have two forms of calls:

    get_code(None, modname, modname)     # look for a top-level module
    get_code(parent, modname, fqname)    # look in a package for a module

imputil has been quite nice because you only had to worry about one hook,
but separating these might be a good thing to do. My preference is to
leave it as one hook, but let's see what others have to say...

> "Some handy stuff for the Importers":

Consider all this torched.

I'll also move the Importer subclasses to a new file for placement under
Demo/. That should trim imputil.py down quite a lot.

>...
>     def _compile():
> 
>         You will have to strip CRLF from the code string read from the
>         file; this is for Unix where opening the file in text mode
>         doesn't do that, but users have come to expect this because
>         the current implementation explicitly strips them.

I'm not sure that I follow what you mean here. The existing code seems to
work fine. We *append* a newline; are you suggesting stripping inside the
codestring, at the end, ??

>         I've recently been notified of a race condition in the code
>         here, when two processes are writing a .pyc file and a third
>         is reading it.  On Unix we will have to remove the .pyc file
>         and then use os.open() with O_EXCL in order to avoid the race
>         condition.  I don't know how to avoid it on Windows.

The O_EXCL would be on writing. It would be nice if there was a "shared"
mode for reading. How are two writers and a reader different from a single
writer and a single reader?

>     def _os_bootstrap():
> 
>         This is ugly and contains a dangerous repetition of code from
>         os.py.  I know why you want it but for the "real" version we
>         need a different approach here.

I agree :-)

But short of some refactoring of os.py and the per-platform modules, this
is the best that I could do. Importing "os" loads a lot of stuff; I wanted
to ensure that we deferred that until we had the ImportManager and
associated Importers in place (so the import could occur under the
direction of imputil).

>     def _fs_import():
> 
>         I think this is unused now?  Anyway, it hardcodes the suffix
>         sequence.

Unused. It can be ignored.

> class SuffixImporter:
> 
>     Why is this a class at all?  It would seem to be sufficient to
>     have a table of functions instead of a table of instances.  These
>     importers have no state and only one method that is always
>     overridden.

DynLoadSuffixImporter has state. We could still deal with that, however,
by just storing a bound method into the table.

I used instances because I wasn't sure what all we might want in there. If
we don't add any other methods or attributes to the public interface, then
yah: we could switch to a function-based approach.

I'll release a new imputil later this week, incorporating these changes,
MAL's feedback, and Finn Bock's feedback.

Thanx!
-g

-- 
Greg Stein, http://www.lyra.org/

From guido@python.org  Wed Feb 16 17:27:14 2000
From: guido@python.org (Guido van Rossum)
Date: Wed, 16 Feb 2000 12:27:14 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: Your message of "Wed, 16 Feb 2000 06:01:32 PST."
 <Pine.LNX.4.10.10002160425560.17758-100000@nebula.lyra.org>
References: <Pine.LNX.4.10.10002160425560.17758-100000@nebula.lyra.org>
Message-ID: <200002161727.MAA28273@eric.cnri.reston.va.us>

> Excellent! Thanx for taking the time and providing the great feedback.

And thanks for the replies.  Quick responses:

> > At the same time, I think it's far from
> > ready for prime time in the distribution.
> 
> Understood. I think some of your concerns are based on its historical
> cruftiness, rather than where it "should" be. I'll prep a new version with
> the backwards compat pulled out. People that relied on the old version can
> continue to use their older version (which is Public Domain, so they're
> quite free to do so :-)

OK, I'm awaiting a new imputil announcement.  (You should really have
a webpage for it pointing both to the old and the new version.)

> >         The hook mechanism for 1.6 hasn't been designed yet; what
> >         should it be?
> 
> Part of this depends on the policy that you would like to prescribe. There
> are two issues that I can think of:
> 
> 1) are import hooks per-builtin-namespace, or per-interpreter? The former
>    is the current model, and may be necessary for rexec types of
>    functionality. The latter would shift the hook to some functions in the
>    sys module, much like the profiler/trace hooks.

Yes, per-builtin-namespace is necessary because of rexec -- the system
must translate an import statement into a call to a hook that depends
on the builtin namespace, because that's how rexec environments (there
may be many per interpreter) are denoted.

> 2) currently, Python has a single hook, with no provisions for multiple
>    mechanisms to be used simultaneously. This was one of the primary
>    advantages of imputil and its policies/chaining -- it would allow
>    multiple import mechanisms. I believe that we want to continue the
>    current policy: the core interpreter sees a single hook function.

Yes.

>    [ and Standard Operating Procedure is to install an ImportManager in
>      there, or a subclass ]

Yes.

> For a while, I had thought about the "sys" approach, but just realized
> that may be too limited for rexec types of environments (because we may
> need a couple ImportManagers to be operating within an interpreter)

Agreed.  It sounds like we're stuck with overriding
__builtin__.__import__, like before.

> BUT: we may be able to design an RExecImportManager that remembers
> restricted modules and uses different behavior when it determines the
> import context is one of those modules. I haven't thought on this to
> determine its true viability, tho...

Sounds tricky -- *all* modules imported in restricted mode must be
treated differently.

> >     def _import_hook():
> > 
> >         I think we need a hook here so that Marc-Andre can implement
> >         walk-me-up-Scotty; or Tim Peters could implement a policy that
> >         requires all imports to use the full module name, even from
> >         within the same package.
> 
> I've been thinking of something along the lines of
> _determine_import_context() returning a list of things to try. Default is
> to return something like [current-context,] + sys.path (not exactly that,
> but you get the idea). The _import_hook would then operate as a simple
> scan over that list of places to attempt an import from. MAL could
> override _determine_import_context() to return the walk-me-up, intervening
> packages. Tim could just always return sys.path (and never bother trying
> to determine the current context).

Yes.  In ni (remember ni?) we had this mechanism; it was called
"domain" (not a great name for it).  The domain was a list of packages
where relative imports were sought.  A package could set its domain by
setting a variable __domain__.  The current policy (current package,
then toplevel) corresponds to a 2-item domain: [<packagename>, ""]
(where "" stands for the unnamed toplevel package).
Walk-me-up-Scotty corresponds to a domain containing the current
package, its parent, its grandparent, and so on, ending with "".  The
"no relative imports" policy is represented by [""].

If we let __domain__ be initialized by the importer but overridden by
the package, we can do everything we need.
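A minimal sketch of the three domains described above, in present-day
Python. The names domain_for() and candidates() and the policy strings
are hypothetical; they are not part of ni or imputil.

```python
def domain_for(package, policy="default"):
    """Return the list of package names searched for a relative import.

    "" stands for the unnamed top-level package.
    """
    if policy == "absolute":            # "no relative imports"
        return [""]
    if policy == "walk-up":             # walk-me-up-Scotty
        parts = package.split(".") if package else []
        # current package, then parent, grandparent, ..., then top level
        return [".".join(parts[:i]) for i in range(len(parts), 0, -1)] + [""]
    # default policy: current package, then top level
    return [package, ""] if package else [""]


def candidates(domain, name):
    """Fully qualified names to try, in order, for 'import name'."""
    return [pkg + "." + name if pkg else name for pkg in domain]
```

For example, candidates(domain_for("a.b", "walk-up"), "spam") lists
"a.b.spam", "a.spam" and "spam", in that order.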

> >     def _determine_import_context():
> > 
> >         Possible feature: the package could set a flag here to force
> >         all imports to go to the top-level (i.e., the flag would mean
> >         "no relative imports").
> 
> Ah. Neat optimization. I'll insert some comments / prototype code to do
> this.

See above -- the package could simply set its domain to [""].

> >         Looking forward to _FilesystemImporter: I want to be able to
> >         have the *name* of a zip file (or other archive, whatever is
> >         supported) in sys.path; that seems more convenient for the
> > 	user and simplifies $PYTHONPATH processing.
> 
> Hmm. This will complicate things quite a bit, but is doable. It will also
> increase the processing time for sys.path elements.

We could cache this in a dictionary: the ImportManager can have a
cache dict mapping pathnames to importer objects, and a separate
method for coming up with an importer given a pathname that's not yet
in the cache.  The method should do a stat and/or look at the
extension to decide which importer class to use; you can register new
importer classes by registering a suffix or a Boolean function, plus a
class.  If you register a new importer class, the cache is zapped.
The cache is independent from sys.path (but maintained per
ImportManager instance) so that rearrangements of sys.path do the
right thing.  If a path is dropped from sys.path the corresponding
cache entry is simply no longer used.
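The caching scheme above could look roughly like this. It is only a
sketch: ImporterCache, register(), importer_for() and the two stand-in
importer classes are made-up names, and the real ImportManager may end
up organized differently.

```python
import os


class ImporterCache:
    """Maps sys.path entries to importer objects (hypothetical sketch)."""

    def __init__(self):
        self._registry = []   # list of (predicate, importer_class)
        self._cache = {}      # pathname -> importer instance

    def register(self, test, importer_class):
        """Register a class by suffix (a string) or by a Boolean function."""
        if isinstance(test, str):
            suffix = test
            test = lambda path: path.endswith(suffix)
        self._registry.append((test, importer_class))
        self._cache.clear()   # registering a new class zaps the cache

    def importer_for(self, path):
        """Return the cached importer for path, creating it if needed."""
        try:
            return self._cache[path]
        except KeyError:
            pass
        for test, importer_class in self._registry:
            if test(path):
                importer = self._cache[path] = importer_class(path)
                return importer
        raise LookupError("no importer registered for %r" % (path,))


class ZipImporter:            # stand-in for an archive importer
    def __init__(self, path):
        self.path = path


class DirImporter:            # stand-in for a directory importer
    def __init__(self, path):
        self.path = path


cache = ImporterCache()
cache.register(".zip", ZipImporter)        # registered by suffix
cache.register(os.path.isdir, DirImporter) # registered by Boolean function
```

Because the cache is keyed by pathname rather than by position, a
reordered sys.path keeps finding the same importer objects.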

> >     def import_top():
> > 
> >         This appears a hook that a base class can override; but why?
> 
> It is for use by clients of the Importer. In particular, by the
> ImportManager. It is not intended to be overridden.

Are there any other clients of the Importer class?  Since import_top()
simply calls _import_one(), do we even need it?

> >     "PRIVATE METHODS":
> > 
> >         These aren't really private to the class; some are used from
> > 	the ImportManager class.
> 
> Private to the system, then :-)

Then use a different word -- "private" has a well-defined meaning for
C++ and Java programmers.  To me, "internal" sounds better.

> >         I note that none of these use any state of the class (it
> >         doesn't *have* any state) and they are essentially an
> >         implementation of the (cumbersome) package import policy.
> >         I wonder if the whole package import policy shouldn't be
> >         implemented in the ImportManager class -- or even in a
> > 	separate policy class.
> 
> Nope. They do use a piece of state: self. Subclasses may add state and
> they need to refer to that state via self. We store Importer instances
> into modules as __importer__ and then use it later. The example Importers
> (FuncImporter, PackageImporter, DirectoryImporter, and PathImporter) all
> use instance variables. _FilesystemImporter definitely requires it.
> 
> Once we locate the Importer responsible for a package and its contained
> modules, then we pass off control to that Importer. It is then responsible
> for completing the import (including the notions of a Python package). The
> completion of the import can be moved out of Importer, and we would
> replace "self" by the Importer in question.
> 
> I attempted to minimize code in ImportManager because a typical execution
> will only have a single instance of it -- there is no opportunity to
> implement different policies and mechanisms. By shifting as much as
> possible out to the Importer class, there is more opportunity for altering
> behavior.

But the importer is the wrong place to change the policy globally.
The example is walk-me-up-Scotty: to implement that (or, more
generally, to implement the __domain__ hook) without editing
imputil.py, Marc-Andre would have to subclass all the
importers that are used.  If the policy was embodied in the
ImportManager class, he could subclass the ImportManager, install it
instead of the default one, but continue to use the existing importers
(e.g. to import from zip files).

> >         I'm still worried about get_code() returning the code object
> >         (or even a module, in the case of extensions).  This means
> >         that something like freeze might have to re-implement the
> >         whole module searching -- first, it might want the source code
> >         instead of the code object, and second, it might not want the
> >         extension to be loaded at all!
> 
> You're asking for something that is impossible in practice. Consider the
> following code fragment:
> 
> ------------------------
> import my_imp_tools
> my_imp_tools.HTTPImporter("http://www.lyra.org/greg/python/")
> 
> import qp_xml
> ------------------------
> 
> There is no way that you can create a freeze tool that will be able to
> manage this kind of scenario.
> 
> I've frozen software for three different, shipping products. In all cases,
> I had to custom-build a freeze mechanism. Each time, there would be a list
> of "root" scripts, a set of false-positives to ignore, a set of missed
> modules to force inclusion, and a set of binary dependencies that couldn't
> be determined otherwise. I think the desire for a fully-automated module
> finder and freezer isn't fulfillable.

I know you can't do it fully automatically.  But I still want to be
able to reuse as much of existing importer and importmanager classes
as possible.  Currently, Tools/freeze/modulefinder.py contains a lot
of code that reimplements the entire package resolution mechanism.
It should really be able to reuse all that -- so that e.g. the
addition of a __domain__ feature doesn't require changes both to
imputil.py and to modulefinder.py.

> That said: I wouldn't be opposed to adding a get_source() method to an
> Importer. If the source is available, then it can return it (it may not be
> available in an archive!).

How would you invoke it?  I would need to have a different call into
the ImportManager that would invoke get_source() rather than
get_code().  My desire is that freeze's modulefinder should be able to
instantiate another ImportManager (maybe a slight subclass) which
would find the source code for modules for it.  For this, I need to
have an API that says "in this context, what module does 'import X'
return?".  It's okay to return some kind of descriptor object that has
methods to (1) get the code, (2) get the source, (3) describe itself.
The description should return whether this is a Python or an extension
module, whether source is available, and the filename if available.
If it came from an archive, the descriptor could be archive-aware, and
have additional methods to find out the filename of the archive and
the name of the module inside the archive, as well as the type of
archive.  If it was an extension, there would be extra calls to load
the library and to initialize the module, and a way to get the actual
filename.  This way, freeze could issue reasonable errors if it found
a module but couldn't find the source.  Freeze also needs to deal
specially with extensions (and differently with built-in extensions and
shared library ones).
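One possible shape for such a descriptor, as a sketch only. The class
name ModuleDescriptor and every attribute and method on it are
invented here for illustration; nothing like it exists in imputil yet,
and the archive and extension-loading calls are left out.

```python
class ModuleDescriptor:
    """What 'import X' would find, without actually importing it."""

    def __init__(self, fqname, kind, filename=None, source=None):
        self.fqname = fqname
        self.kind = kind            # "python" or "extension"
        self.filename = filename    # None if not available
        self._source = source       # None if no source is available

    def get_source(self):
        if self._source is None:
            raise LookupError("no source available for %s" % self.fqname)
        return self._source

    def get_code(self):
        # Compile on demand, so a tool like freeze can ask for the
        # source without ever creating a code object.
        return compile(self.get_source(), self.filename or "<unknown>", "exec")

    def describe(self):
        return {
            "name": self.fqname,
            "kind": self.kind,
            "has_source": self._source is not None,
            "filename": self.filename,
        }
```

A freeze-like tool would call describe() first and report a readable
error when has_source is false, instead of failing inside get_code().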

> >         I've noticed that all implementations of get_code() start with
> >         a test whether parent is None or not, and branch to completely
> >         different code.  This suggests that we might have two separate
> >         methods???
> 
> Interesting point. I hadn't noticed that. They don't always branch to
> different points, however: consider DirectoryImporter and FuncImporter.

But those are the most trivial examples.

> Essentially, we have two forms of calls:
> 
>     get_code(None, modname, modname)     # look for a top-level module
>     get_code(parent, modname, fqname)    # look in a package for a module
> 
> imputil has been quite nice because you only had to worry about one hook,
> but separating these might be a good thing to do. My preference is to
> leave it as one hook, but let's see what others have to say...

You might provide a default implementation for get_code() that calls
either get_subcode() or get_topcode() depending on whether parent is
None; then subclasses can separately override those, or override
get_code() when it's more convenient.
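A sketch of that default implementation. get_topcode() and
get_subcode() are the names proposed above; their argument lists and
the use of None for "not found" are assumptions, and TopLevelOnly is a
made-up subclass for demonstration.

```python
class Importer:
    def get_code(self, parent, modname, fqname):
        """Default: dispatch on whether we are looking inside a package."""
        if parent is None:
            return self.get_topcode(modname)
        return self.get_subcode(parent, modname, fqname)

    def get_topcode(self, modname):
        return None   # assumed to mean "not found"; subclasses override

    def get_subcode(self, parent, modname, fqname):
        return None   # assumed to mean "not found"; subclasses override


class TopLevelOnly(Importer):
    """Overrides only one of the two hooks."""

    def get_topcode(self, modname):
        return "top-level lookup for " + modname
```

A subclass that treats both cases alike can still override get_code()
directly and ignore the two helper methods.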

I noticed there's only one call to get_code(), from _import_one().
Not sure where this leads though.

> I'll also move the Importer subclasses to a new file for placement under
> Demo/. That should trim imputil.py down quite a lot.

Except for the _FilesystemImporter class, I presume, which is needed
in normal use.

> >...
> >     def _compile():
> > 
> >         You will have to strip CRLF from the code string read from the
> >         file; this is for Unix where opening the file in text mode
> >         doesn't do that, but users have come to expect this because
> >         the current implementation explicitly strips them.
> 
> I'm not sure that I follow what you mean here. The existing code seems to
> work fine. We *append* a newline; are you suggesting stripping inside the
> codestring, at the end, ??

I mean that you have to do

  codestring = re.sub(r"\r\n", r"\n", codestring)

on the code string after reading it.  This has nothing to do with the
trailing newline.  It is needed because the tokenizer chokes on \r\n
when it finds it in a string, but not when it reads it from a file --
this has been reported a few times as a bug because it means that

  exec compile(open(fn).read(), fn, "exec")

is not completely equivalent to execfile(fn) -- the compile() may
choke if the read() returns lines ending in \r\n, as it may when a
Windows file was transplanted to a Unix system.  Again, when the
parser itself reads the file, this is dealt with correctly.  This is a
feature: many people share filesystems between Unix and Windows, and
just like most Windows compilers don't insist on the \r being there,
Unix tools shouldn't insist on it being absent.  Fixing compile() is
hard, unfortunately, hence this request for a workaround.
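As a sketch, the workaround amounts to the following. The helper name
compile_source() is hypothetical, and plain string replacement is used
here in place of re.sub(); later Python versions relaxed compile() so
that it accepts these line endings itself.

```python
def compile_source(codestring, filename):
    """Compile source text that may have been written on Windows."""
    # Normalize CRLF line endings before the tokenizer sees them.
    codestring = codestring.replace("\r\n", "\n")
    # Separately, make sure the source ends with a newline.
    if not codestring.endswith("\n"):
        codestring += "\n"
    return compile(codestring, filename, "exec")
```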

> >         I've recently been notified of a race condition in the code
> >         here, when two processes are writing a .pyc file and a third
> >         is reading it.  On Unix we will have to remove the .pyc file
> >         and then use os.open() with O_EXCL in order to avoid the race
> >         condition.  I don't know how to avoid it on Windows.
> 
> The O_EXCL would be on writing.

Yes of course, sorry for not clarifying that.

> It would be nice if there was a "shared"
> mode for reading. How are two writers and a reader different from a single
> writer and a single reader?

OK, I'll explain the problem.  (Cut from a mail explaining it to
Jeremy:)

| Unfortunately, there's still a race condition, involving three or more
| processes:
| 
|   A sees no .pyc file and starts writing it
| 
|   B sees an invalid .pyc file and decides to go write it later
| 
|   A finishes writing and fills in the mtime
| 
|   C sees the valid magic and mtime and decides to go read the .pyc file
| 
|   B overwrites the .pyc file, truncating it at first
| 
|   C continues to read the .pyc file, but sees a truncated file
| 
|   B finishes writing and fills in the mtime
| 
| At this point, the .pyc file is valid, but process C has probably
| crashed in the unmarshalling code.
| 
| I have devised the following solution (which may even work on
| Windows) but not yet implemented it:
| 
| when writing the .pyc file, use unlink() to remove the .pyc file and
| then use low-level open() with the proper flags to require that the
| file doesn't yet exist (O_EXCL?); then use fdopen().  If the open()
| fails, don't write (treat it the same as a failing fopen() now.)
| 
| (You'd think that you could use a temporary file, but it's hard to
| come up with a temp filename that's unique -- and if it's not unique,
| the same race condition could still happen.)
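A sketch of the unlink-then-exclusive-create idea, written for a
present-day Python. The function name write_exclusively() is made up,
and this has not been checked against the Windows case mentioned
above.

```python
import os


def write_exclusively(path, data):
    """Write data (bytes) to path only if we win the race to create it.

    Returns True if the file was written, False if another process
    created it between our unlink() and our open().
    """
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    flags |= getattr(os, "O_BINARY", 0)   # only defined on Windows
    try:
        fd = os.open(path, flags, 0o644)
    except FileExistsError:
        return False                      # treat like a failed fopen()
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return True
```

On Unix, a reader that already has the old file open keeps reading
the old, complete contents after the unlink(), so it never sees a
truncated file.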

> >     def _os_bootstrap():
> > 
> >         This is ugly and contains a dangerous repetition of code from
> >         os.py.  I know why you want it but for the "real" version we
> >         need a different approach here.
> 
> I agree :-)
> 
> But short of some refactoring of os.py and the per-platform modules, this
> is the best that I could do. Importing "os" loads a lot of stuff; I wanted
> to ensure that we deferred that until we had the ImportManager and
> associated Importers in place (so the import could occur under the
> direction of imputil).

OK.  Let's table this one until we feel we know how to refactor os.py.
(Maybe a platform-specific os.py could be frozen into the
interpreter.)

> > class SuffixImporter:
> > 
> >     Why is this a class at all?  It would seem to be sufficient to
> >     have a table of functions instead of a table of instances.  These
> >     importers have no state and only one method that is always
> >     overridden.
> 
> DynLoadSuffixImporter has state. We could still deal with that, however,
> by just storing a bound method into the table.

Exactly.
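A sketch of the function-table variant. The table layout, the
import_file() signature and the return values are invented for
illustration and do not match imputil's actual interface.

```python
class DynLoadSuffixImporter:
    """The one suffix importer that carries state."""

    def __init__(self, desc):
        self.desc = desc

    def import_file(self, filename):
        return ("dynload", self.desc, filename)


def py_suffix_importer(filename):
    """A stateless importer is just a plain function."""
    return ("source", filename)


# The table holds callables: plain functions and bound methods alike.
suffix_table = [
    (".py", py_suffix_importer),
    (".so", DynLoadSuffixImporter("C_EXTENSION").import_file),
]


def import_by_suffix(filename):
    for suffix, func in suffix_table:
        if filename.endswith(suffix):
            return func(filename)
    return None
```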

> I used instances because I wasn't sure what all we might want in there. If
> we don't add any other methods or attributes to the public interface, then
> yah: we could switch to a function-based approach.

See http://c2.com/cgi-bin/wiki?YouArentGonnaNeedIt (and the
rest of this wikiweb on refactoring, patterns etc.) for why you
shouldn't plan ahead this far.

> I'll release a new imputil later this week, incorporating these changes,
> MAL's feedback, and Finn Bock's feedback.

Great!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From just@letterror.com  Wed Feb 16 18:40:51 2000
From: just@letterror.com (Just van Rossum)
Date: Wed, 16 Feb 2000 19:40:51 +0100
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: <200002161727.MAA28273@eric.cnri.reston.va.us>
References: Your message of "Wed, 16 Feb 2000 06:01:32 PST."
 <Pine.LNX.4.10.10002160425560.17758-100000@nebula.lyra.org>
 <Pine.LNX.4.10.10002160425560.17758-100000@nebula.lyra.org>
Message-ID: <l0310280eb4d09ccd6c28@[193.78.237.172]>

I kind of lost track here, is it ok if I take a step back?

I don't like the chaining aspect of imputil.py. I think I remember that
Greg did this mainly to remain as compatible as possible, but I don't see a
reason to keep it that way. What you end up with is a linked list, which
seems, uh, a little un-pythonic... (and a little awkward to manipulate.)

What's needed is pluggable importers. One proposal I remember (I think it
was also Greg's) was that elements on sys.path could be importer instances.
Is this still being proposed?
I also vaguely remember someone saying that this was not optimal, since
it's a 2-dimensional problem: there's a list of directories/files to
search, and a list of importers. Would it make sense to add a new variable
to the sys module, called "importers" or something, which contains a list
of, erm, importers? And drop the __import__ hook. People would plug the
import mechanism by manipulating sys.importers instead of mucking with
__builtin__.__import__. Importers could then traverse sys.path, or use
their own list of things.

Just

From gmcm@hypernet.com  Wed Feb 16 19:56:49 2000
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 16 Feb 2000 14:56:49 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: <l0310280eb4d09ccd6c28@[193.78.237.172]>
References: <200002161727.MAA28273@eric.cnri.reston.va.us>
Message-ID: <1261391513-4256760@hypernet.com>

Just wrote:

> I kind of lost track here, is it ok if I take a step back?
> 
> I don't like the chaining aspect of imputil.py. 
[snip]
> What's needed is pluggable importers. One proposal I remember (I think it
> was also Greg's) was that elements on sys.path could be importer instances.
> Is this still being proposed?

The "stable" version of imputil uses a chain. The CVS version 
uses sys.path to hold importers (well, it's got one foot in each 
world).

> I also vaguely remember someone saying that this was not optimal, since
> it's a 2-dimensional problem: there's a list of directories/files to
> search, and a list of importers. 

People were thinking in terms of "policy" importers - which 
would, indeed, be unmanageable. Importers are based on "turf" 
(to which we're currently trying to add "policy" hooks, at least 
for certain kinds of importers). So a decently written importer 
knows very quickly whether the request belongs to him.

> Would it make sense to add a new variable
> to the sys module, called "importers" or something, which contains a list
> of, erm, importers? And drop the __import__ hook. People would plug the
> import mechanism by manipulating sys.importers instead of mucking with
> __builtin__.__import__. Importers could then traverse sys.path, or use
> their own list of things.

As it stands now (CVS version, looking to be in the std dist, 
but not the core), an ImportManager uses the __import__ 
hook and developers put their importers on sys.path.

A policy hook would probably go into the FileSystemImporter, 
(where it would get activated when a directory on sys.path 
was searched), but it wouldn't go searching sys.path itself.

- Gordon

From gmcm@hypernet.com  Wed Feb 16 20:43:54 2000
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 16 Feb 2000 15:43:54 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: <Pine.LNX.4.10.10002160425560.17758-100000@nebula.lyra.org>
References: <200002131705.MAA20578@eric.cnri.reston.va.us>
Message-ID: <1261388683-4430097@hypernet.com>

[Guido]
> >         I'm still worried about get_code() returning the code object
> >         (or even a module, in the case of extensions).  This means
> >         that something like freeze might have to re-implement the
> >         whole module searching -- first, it might want the source code
> >         instead of the code object, and second, it might not want the
> >         extension to be loaded at all!

If a freeze mechanism is analyzing source, then it needs 
source, but I don't think that's necessary. The only other 
reason I can see for wanting source is if freeze is running with 
one magic number, but the frozen code will run with another, 
(to which I say "tough toenails, tootsie").

[Greg]
> You're asking for something that is impossible in practice. Consider the
> following code fragment:
> 
> ------------------------
> import my_imp_tools
> my_imp_tools.HTTPImporter("http://www.lyra.org/greg/python/")
> 
> import qp_xml
> ------------------------
> 
> There is no way that you can create a freeze tool that will be able to
> manage this kind of scenario.

Sure. Dynamically replace the Importer in __bases__ with a 
hacked one that doesn't affect sys.modules, grabs the code 
object and analyzes byte code (like modulefinder does) to find 
further imports.
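The byte-code analysis step can be sketched with today's dis module.
The function name find_imports() is hypothetical, and this is not how
modulefinder itself is written; it only shows the idea of reading
imports out of a code object without executing it.

```python
import dis
import types


def find_imports(code):
    """Return the set of module names imported anywhere in a code object."""
    names = set()
    for instr in dis.get_instructions(code):
        if instr.opname == "IMPORT_NAME":
            names.add(instr.argval)
    # Functions and classes are nested code objects stored as constants.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= find_imports(const)
    return names
```

Imports done through __import__() calls or built from strings at run
time are invisible to this scan, which is one source of missed modules.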

> I've frozen software for three different, shipping products. In all cases,
> I had to custom-build a freeze mechanism. Each time, there would be a list
> of "root" scripts, a set of false-positives to ignore, a set of missed
> modules to force inclusion, and a set of binary dependencies that couldn't
> be determined otherwise. I think the desire for a fully-automated module
> finder and freezer isn't fulfillable.

False-positives are unavoidable. Missed modules would be no 
worse and probably better than today (since this scheme 
would use the importer to trace the actions of the importer).

- Gordon

From gmcm@hypernet.com  Wed Feb 16 20:43:53 2000
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 16 Feb 2000 15:43:53 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: <200002161727.MAA28273@eric.cnri.reston.va.us>
References: Your message of "Wed, 16 Feb 2000 06:01:32 PST."             <Pine.LNX.4.10.10002160425560.17758-100000@nebula.lyra.org>
Message-ID: <1261388665-4430113@hypernet.com>

[Guido]
> > >         I think we need a hook here so that Marc-Andre can implement
> > >         walk-me-up-Scotty; or Tim Peters could implement a policy that
> > >         requires all imports to use the full module name, even from
> > >         within the same package.

Hmm, after thinking about it, I can't see these as "import 
hooks". At least, if you are installing these system-wide, 
these are changing the semantics of "import", not grabbing 
code from strange places, or transforming .xyz files into 
Python or ...

I can have working code; now add some mxX or TP extension 
and have existing code (not using the new extension) break.

OTOH, if MAL / TP provides an importer, and fixes that 
importer to follow their preferred policy, that's fine; and I can 
pretend that that's an "import hook".

- Gordon

From guido@python.org  Wed Feb 16 20:52:08 2000
From: guido@python.org (Guido van Rossum)
Date: Wed, 16 Feb 2000 15:52:08 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: Your message of "Wed, 16 Feb 2000 15:43:54 EST."
 <1261388683-4430097@hypernet.com>
References: <200002131705.MAA20578@eric.cnri.reston.va.us>
 <1261388683-4430097@hypernet.com>
Message-ID: <200002162052.PAA29056@eric.cnri.reston.va.us>

> [Guido]
> > >         I'm still worried about get_code() returning the code object
> > >         (or even a module, in the case of extensions).  This means
> > >         that something like freeze might have to re-implement the
> > >         whole module searching -- first, it might want the source code
> > >         instead of the code object, and second, it might not want the
> > >         extension to be loaded at all!

[Gordon]
> If a freeze mechanism is analyzing source, then it needs 
> source, but I don't think that's necessary. The only other 
> reason I can see for wanting source is if freeze is running with 
> one magic number, but the frozen code will run with another, 
> (to which I say "tough toenails, tootsie").

Maybe my freezer wants to store the source as well as the code
objects, so it can give decent tracebacks.

Which reminds me -- we need to introduce a standard API to retrieve
the source for a module that's been imported (if it's available at
all).  I can easily see how archives can be distributed containing
both .pyc and .py files; the zip access module could easily find the
.py file on request.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org  Wed Feb 16 20:54:27 2000
From: guido@python.org (Guido van Rossum)
Date: Wed, 16 Feb 2000 15:54:27 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: Your message of "Wed, 16 Feb 2000 15:43:53 EST."
 <1261388665-4430113@hypernet.com>
References: Your message of "Wed, 16 Feb 2000 06:01:32 PST." <Pine.LNX.4.10.10002160425560.17758-100000@nebula.lyra.org>
 <1261388665-4430113@hypernet.com>
Message-ID: <200002162054.PAA29068@eric.cnri.reston.va.us>

> [Guido]
> > > >         I think we need a hook here so that Marc-Andre can implement
> > > >         walk-me-up-Scotty; or Tim Peters could implement a policy that
> > > >         requires all imports to use the full module name, even from
> > > >         within the same package.

[Gordon]
> Hmm, after thinking about it, I can't see these as "import 
> hooks". At least, if you are installing these system-wide, 
> these are changing the semantics of "import", not grabbing 
> code from strange places, or transforming .xyz files into 
> Python or ...
> 
> I can have working code; now add some mxX or TP extension 
> and have existing code (not using the new extension) break.
> 
> OTOH, if MAL / TP provides an importer, and fixes that 
> importer to follow their preferred policy, that's fine; and I can 
> pretend that that's an "import hook".

Good point.

Fortunately, the proposed solution (reintroducing __domain__) lets
this be solved on a per-package basis.

Still, I want to be able to subclass ImportManager to change the
global policy; supporting __domain__ is an example of such a change of
policy.  I also want to avoid having to reimplement the policy, with
all its warts, in freeze.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From effbot@telia.com  Wed Feb 16 21:10:07 2000
From: effbot@telia.com (Fredrik Lundh)
Date: Wed, 16 Feb 2000 22:10:07 +0100
Subject: [Import-sig] Re: Long-awaited imputil comments
References: <200002131705.MAA20578@eric.cnri.reston.va.us> <1261388683-4430097@hypernet.com>
Message-ID: <014901bf78c2$35503260$34aab5d4@hagrid>

> If a freeze mechanism is analyzing source, then it needs
> source, but I don't think that's necessary. The only other
> reason I can see for wanting source

Which reminds me...  it would be nice if an import handler
can provide optional "find corresponding source" hooks for
traceback.py and friends.

(among other things, this would allow pythonworks to use
an archive file as the "workspace"...)

</F>

From gmcm@hypernet.com  Wed Feb 16 21:24:45 2000
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 16 Feb 2000 16:24:45 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: <200002162052.PAA29056@eric.cnri.reston.va.us>
References: Your message of "Wed, 16 Feb 2000 15:43:54 EST."             <1261388683-4430097@hypernet.com>
Message-ID: <1261386237-4574590@hypernet.com>

[Gordon]
> > If a freeze mechanism is analyzing source, then it needs 
> > source, but I don't think that's necessary. The only other 
> > reason I can see for wanting source is if freeze is running with 
> > one magic number, but the frozen code will run with another, 
> > (to which I say "tough toenails, tootsie").

[Guido]
> Maybe my freezer wants to store the source as well as the code
> objects, so it can give decent tracebacks.
> 
> Which reminds me -- we need to introduce a standard API to retrieve
> the source for a module that's been imported (if it's available at
> all).  I can easily see how archives can be distributed containing
> both .pyc and .py files; the zip access module could easily find the
> .py file on request.

[Fredrik]
> Which reminds me...  it would be nice if an import handler
> can provide optional "find corresponding source" hooks for
> traceback.py and friends.
> 
> (among other things, this would allow pythonworks to use
> an archive file as the "workspace"...)

Hmm, wasn't there a reference earlier today to "You ain't 
gonna need it"?

Java doesn't do it.

You can already do it if you install source, then archive it, 
leaving the __file__ attribute alone - IDLE / Pythonwin will pop 
up the source.

Nobody sane is going to put code under active development in 
an archive.

A developer who wants to run from an archive, yet see (but not 
alter) the source at a traceback can do as above (install 
source, then archive it).

Users who don't know and don't care can snip the traceback 
and send it to the developer, who can find the source.

Yeah, it can be supported, but Pythonworks is the only people 
who are going to use it, and the mad scientist can code it up 
in 10 minutes ;-). 

- Gordon

From guido@python.org  Wed Feb 16 21:47:34 2000
From: guido@python.org (Guido van Rossum)
Date: Wed, 16 Feb 2000 16:47:34 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: Your message of "Wed, 16 Feb 2000 16:24:45 EST."
 <1261386237-4574590@hypernet.com>
References: Your message of "Wed, 16 Feb 2000 15:43:54 EST." <1261388683-4430097@hypernet.com>
 <1261386237-4574590@hypernet.com>
Message-ID: <200002162147.QAA29542@eric.cnri.reston.va.us>

> Hmm, wasn't there a reference earlier today to "You ain't 
> gonna need it"?

I claim we need it.

> Java doesn't do it.

So?

> You can already do it if you install source, then archive it, 
> leaving the __file__ attribute alone - IDLE / Pythonwin will pop 
> up the source.
> 
> Nobody sane is going to put code under active development in 
> an archive.

But there are other reasons why you would want to see tracebacks even
if you're not actively developing.

Plenty of people distribute mostly-working code to end users and ask
them to report tracebacks.  E.g. the Ultraseek product from Infoseek
(used for the python.org search) occasionally displays tracebacks.
The Zope guys also do this (they hide the traceback in an HTML comment
I believe, but it's there).

Sure, you can take a traceback without source lines and match up the
line numbers manually with your source, assuming you have the exact
version of the source -- but it's a pain.

> A developer who wants to run from an archive, yet see (but not 
> alter) the source at a traceback can do as above (install 
> source, then archive it).

That's no option for distributions -- the archive is the only
distribution!

> Users who don't know and don't care can snip the traceback 
> and send it to the developer, who can find the source.

As I said, very inconvenient.

> Yeah, it can be supported, but Pythonworks is the only people 
> who are going to use it, and the mad scientist can code it up 
> in 10 minutes ;-). 

I didn't say I wanted *you* to code it.  I just said that I want the
API.  Accessing the source code is a common need in lots of places.
Adding the source to the archive is a nice solution.

Why don't you like it?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gstein@lyra.org  Wed Feb 16 22:49:14 2000
From: gstein@lyra.org (Greg Stein)
Date: Wed, 16 Feb 2000 14:49:14 -0800 (PST)
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: <200002161727.MAA28273@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.10002161357170.17758-100000@nebula.lyra.org>

On Wed, 16 Feb 2000, Guido van Rossum wrote:
>...
> > Understood. I think some of your concerns are based on its historical
> > cruftiness, rather than where it "should" be. I'll prep a new version with
> > the backwards compat pulled out. People that relied on the old version can
> > continue to use their older version (which is Public Domain, so they're
> > quite free to do so :-)
> 
> OK, I'm awaiting a new imputil announcement.  (You should really have
> a webpage for it pointing both to the old and the new version.)

I do. http://www.lyra.org/greg/python/. The stable module is provided, and
the latest is available via ViewCVS (linked from the imputil section on
the page).

>...
> Agreed.  It sounds like we're stuck with overriding
> __builtin__.__import__, like before.

Yes. I'll rebuild around this. If somebody comes up with a new/better
design at some point, then we switch.

> > BUT: we may be able to design an RExecImportManager that remembers
> > restricted modules and uses different behavior when it determines the
> > import context is one of those modules. I haven't thought on this to
> > determine its true viability, tho...
> 
> Sounds tricky -- *all* modules imported in restricted mode must be
> treated differently.

Yah. We'll just use the __import__ thing. It will allow multiple
ImportManager objects to exist.

>...
> If we let __domain__ be initialized by the importer but overridden by
> the package, we can do everything we need.

All right. I'll look at adding that.

>...
> > >         Looking forward to _FilesystemImporter: I want to be able to
> > >         have the *name* of a zip file (or other archive, whatever is
> > >         supported) in sys.path; that seems more convenient for the
> > >         user and simplifies $PYTHONPATH processing.
> > 
> > Hmm. This will complicate things quite a bit, but is doable. It will also
> > increase the processing time for sys.path elements.
> 
> We could cache this in a dictionary: the ImportManager can have a

Sounds like a good plan. I'll add this.
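
A minimal sketch of the caching idea, written in present-day Python
rather than the Python of this thread. PathEntryCache and classify()
are hypothetical names, not part of imputil; the sketch only shows each
sys.path entry being inspected once, with the answer remembered in a
dictionary so that later imports skip the filesystem check.

```python
import os
import zipfile


class PathEntryCache:
    """Remember what kind of thing each sys.path entry is (hypothetical)."""

    def __init__(self):
        self._kinds = {}    # path entry -> "dir", "archive" or "missing"
        self.lookups = 0    # number of real filesystem inspections

    def classify(self, entry):
        kind = self._kinds.get(entry)
        if kind is None:
            # First time we see this entry: look at the filesystem once.
            self.lookups += 1
            if os.path.isdir(entry):
                kind = "dir"
            elif os.path.isfile(entry) and zipfile.is_zipfile(entry):
                kind = "archive"
            else:
                kind = "missing"
            self._kinds[entry] = kind
        return kind
```

A real cache would presumably map each entry to an Importer instance
instead of a string, and would need some way to be invalidated when
sys.path entries change on disk.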

>...
> > >     def import_top():
> > > 
> > >         This appears to be a hook that a base class can override; but why?
> > 
> > It is for use by clients of the Importer. In particular, by the
> > ImportManager. It is not intended to be overridden.
> 
> Are there any other clients of the Importer class?  Since import_top()
> simply calls _import_one(), do we even need it?

Manual invocation. This code works:

-------------------------
fetch = my_imp_tools.HTTPImporter("...")
qp_xml = fetch.import_top("qp_xml")
-------------------------

In other words: it is entirely possible to use an Importer without it
being installed into an ImportManager.

>...
> Then use a different word -- "private" has a well-defined meaning for
> C++ and Java programmers.  To me, "internal" sounds better.

I'll comment/doc appropriately.

>...
> But the importer is the wrong place to change the policy globally.
> The example is walk-me-up-Scotty: to implement that (or, more
> generally, to implement the __domain__ hook) without editing
> imputil.py, Marc-Andre would have to subclass all the
> importers that are used.  If the policy was embodied in the
> ImportManager class, he could subclass the ImportManager, install it
> instead of the default one, but continue to use the existing importers
> (e.g. to import from zip files).

You're falling right back into the classic import hook problem. If MAL
alters ImportManager and installs it, then he blows away whatever TP has
done.

The only time that anybody should ever consider modifying or subclassing
ImportManager:

1) An rexec-like environment. This is allowed because you are installing
   the new ImportManager into a specific namespace, rather than
   __builtin__.__import__

2) A shipping application. This is allowed because it's your app :-). The
   app is fully self-contained and all imports are known ahead of time.

   [ if your app is going to import unknown third-party code, then we're
     still okay, as that third-party stuff will be in an rexec
     environment, or it won't fall into the "I'm an app" category ]

Given that modules/packages should not be altering ImportManager at any
point, then any policy changes must go elsewhere. I claim that is the
Importer that is managing that package. Since a package is normally
imported and managed by only *one* Importer, then the problem falls down to
altering the Importer used for the package while it is loading.

This is doable:

1) import foo.bar.baz is executed

2) ImportManager locates the package via an Importer. That Importer
   calls get_code() (or import_from_dir() for the _FilesystemImporter)

3) The Importer loads the package code object from wherever.
   (_FilesystemImporter loads the code object for __init__ and returns it)

4) The Importer creates a module object and stores __importer__ into it,
   pointing at <self>.

5) The code object is executed, overwriting __importer__.

   (it is also possible to do this by returning a new value in result[2]
    of the get_code() call, but this scenario doesn't have a custom
    Importer installed yet that would do this)

6) The Importer finishes loading the package and returns to the
   ImportManager.

7) The ImportManager calls _finish_import on the Importer found in the
   package module's __importer__ attribute (the custom Importer)

8) The custom Importer Does Its Thing

In essence, people should be highly discouraged from ever touching
ImportManager.

The particular __domain__ thing can be defined as a package-private
modification to the import process (for that package only!), thus the
package should fix up the Importer used for itself. Specifically, the MAL
and TP "import style" is implemented by overriding the algorithm in
Importer._do_import().
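
The hand-off in steps 4 through 8 can be sketched roughly as follows.
This is a toy in present-day Python, not imputil's code:
load_package(), manager_import() and PACKAGE_SOURCE are hypothetical
names, and only __importer__ and _finish_import come from the
description above.

```python
import types


class Importer:
    """Simplified stand-in for an imputil-style Importer."""

    def __init__(self):
        self.finished = []

    def load_package(self, fqname, source):
        # Step 4: create the module and record which Importer loaded it.
        module = types.ModuleType(fqname)
        module.__importer__ = self
        # Step 5: run the package code, which may rebind __importer__.
        exec(compile(source, fqname, "exec"), module.__dict__)
        return module

    def _finish_import(self, module):
        # Step 8: whichever Importer is now in charge applies its policy.
        self.finished.append(module.__name__)


def manager_import(importer, fqname, source):
    """Steps 2-7 as seen from a toy ImportManager."""
    module = importer.load_package(fqname, source)
    # Step 7: defer to the Importer the package installed for itself.
    module.__importer__._finish_import(module)
    return module


# A package whose __init__ code installs its own policy object.
PACKAGE_SOURCE = """
class _Policy:
    def __init__(self):
        self.finished = []
    def _finish_import(self, module):
        self.finished.append(module.__name__)

__importer__ = _Policy()
"""
```

A package that leaves __importer__ alone is finished by the Importer
that loaded it, so the override stays private to the package that asked
for it.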

>...
> I know you can't do it fully automatically.  But I still want to be
> able to reuse as much of existing importer and importmanager classes
> as possible.  Currently, Tools/freeze/modulefinder.py contains a lot
> of code that reimplements the entire package resolution mechanism.
> It should really be able to reuse all that -- so that e.g. the
> addition of a __domain__ feature doesn't require changes both to
> imputil.py and to modulefinder.py.

All right. I'll see if I can come up with something for this.

> > That said: I wouldn't be opposed to adding a get_source() method to an
> > Importer. If the source is available, then it can return it (it may not be
> > available in an archive!).
> 
> How would you invoke it?  I would need to have a different call into
> the ImportManager that would invoke get_source() rather than
> get_code().

Yes: a different call into the ImportManager.

I'll do the get_source thing as a second step (possibly as a subclass, as
you mentioned). First is to fold in the rest of the feedback.

>...
> > >         I've noticed that all implementations of get_code() start with
> > >         a test whether parent is None or not, and branch to completely
> > >         different code.  This suggests that we might have two separate
> > >         methods???
> > 
> > Interesting point. I hadn't noticed that. They don't always branch to
> > different points, however: consider DirectoryImporter and FuncImporter.
> 
> But those are the most trivial examples.

So? That doesn't negate their use as an example.

class HTTPImporter:
  def __init__(self, url):
    self.url = url
  def get_code(self, parent, modname, fqname):
    if parent:
      url = parent.__url__
    else:
      url = self.url
    # look for <modname> at <url>

Granted, we could also rewrite the "look for" part as a method which is
called by get_subcode() and get_topcode().

> > Essentially, we have two forms of calls:
> > 
> >     get_code(None, modname, modname)     # look for a top-level module
> >     get_code(parent, modname, fqname)    # look in a package for a module
> > 
> > imputil has been quite nice because you only had to worry about one hook,
> > but separating these might be a good thing to do. My preference is to
> > leave it as one hook, but let's see what others have to say...
> 
> You might provide a default implementation for get_code() that calls
> either get_subcode() or get_topcode() depending on whether parent is
> None; then subclasses can separately override those, or override
> get_code() when it's more convenient.

Sure.
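
A minimal sketch of that default dispatch, in present-day Python. The
signatures of get_topcode() and get_subcode() are assumptions (the
thread names the methods but does not spell them out), DictImporter is
a hypothetical subclass, and the return value is simplified to a bare
code object or None rather than the tuple the real get_code() returns.

```python
class Importer:
    """Toy base class; only the dispatch in get_code() is the point."""

    def get_code(self, parent, modname, fqname):
        # Default behaviour: route to one of two narrower hooks.
        if parent is None:
            return self.get_topcode(modname)
        return self.get_subcode(parent, modname, fqname)

    def get_topcode(self, modname):
        return None     # None means "not found"

    def get_subcode(self, parent, modname, fqname):
        return None


class DictImporter(Importer):
    """Hypothetical subclass that only knows top-level modules."""

    def __init__(self, sources):
        self.sources = sources      # module name -> source text

    def get_topcode(self, modname):
        source = self.sources.get(modname)
        if source is None:
            return None
        return compile(source, modname, "exec")
```

A subclass for which the two cases share most of their logic (as in
the HTTPImporter above) can still override get_code() directly and
ignore the two narrower hooks.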

>...
> > I'll also move the Importer subclasses to a new file for placement under
> > Demo/. That should trim imputil.py down quite a lot.
> 
> Except for the _FilesystemImporter class, I presume, which is needed
> in normal use.

Yes.

>...
> I mean that you have to do
> 
>   codestring = re.sub(r"\r\n", r"\n", codestring)
> 
> on the code string after reading it.  This has nothing to do with the

Ah! Okay... not a problem.

It would be nice to invoke the compiler on an open file object. That would
obviate this problem entirely.

I think that I'll look into doing a patch for this, rather than using
re.sub().

>...
> Unix tools shouldn't insist on it being absent.  Fixing compile() is
> hard, unfortunately, hence this request for a workaround.

If I can't figure out a way to do it, then I'll fall back to re.sub() :-)
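
For reference, the re.sub() fallback amounts to something like the
sketch below. compile_source() is a hypothetical helper name, and the
code is written for present-day Python, whose compile() accepts
Windows line endings on its own, so the substitution matters only for
interpreters that do not.

```python
import re


def compile_source(codestring, filename):
    """Normalise CRLF line endings, then compile (hypothetical helper)."""
    # Turn every "\r\n" pair into a plain "\n" before compiling.
    codestring = re.sub(r"\r\n", r"\n", codestring)
    return compile(codestring, filename, "exec")
```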

>...
> > It would be nice if there was a "shared"
> > mode for reading. How are two writers and a reader different from a single
> > writer and a single reader?
> 
> OK, I'll explain the problem.  (Cut from a mail explaining it to
> Jeremy:)
>...
> | I have devised the following solution (which may even work on
> | Windows) but not yet implemented it:
> | 
> | when writing the .pyc file, use unlink() to remove the .pyc file and
> | then use low-level open() with the proper flags to require that the
> | file doesn't yet exist (O_EXCL?); then use fdopen().  If the open()
> | fails, don't write (treat it the same as a failing fopen() now.)
> | 
> | (You'd think that you could use a temporary file, but it's hard to
> | come up with a temp filename that's unique -- and if it's not unique,
> | the same race condition could still happen.)

Consider it fixed.
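
A rough sketch of the unlink-then-exclusive-create scheme quoted above,
in present-day Python. write_exclusively() is a hypothetical name and
this is not the code that went into imputil or the interpreter; it only
illustrates the os.open() flags involved.

```python
import os


def write_exclusively(path, data):
    """Write data to path only if we win the race to create it.

    Returns True if this call wrote the file, False if another
    writer created it first.
    """
    try:
        os.unlink(path)                     # drop any stale file
    except FileNotFoundError:
        pass
    # O_CREAT | O_EXCL makes open() fail if the file already exists.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    flags |= getattr(os, "O_BINARY", 0)     # only defined on Windows
    try:
        fd = os.open(path, flags, 0o644)
    except FileExistsError:
        return False                        # someone else is writing
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return True
```

If two processes race, at most one of them succeeds in creating the
file after the unlink; the loser simply skips writing, which matches
the "treat it the same as a failing fopen()" rule in the quoted mail.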

>...
> OK.  Let's table this one until we feel we know how to refactor os.py.
> (Maybe a platform-specific os.py could be frozen into the
> interpreter.)

I'll leave appropriate comments in the source as a reminder.

> > > class SuffixImporter:
> > > 
> > >     Why is this a class at all?  It would seem to be sufficient to
> > >     have a table of functions instead of a table of instances.  These
> > >     importers have no state and only one method that is always
> > >     overridden.
>...
> > I used instances because I wasn't sure what all we might want in there. If
> > we don't add any other methods or attributes to the public interface, then
> > yah: we could switch to a function-based approach.
> 
> See http://c2.com/cgi-bin/wiki?YouArentGonnaNeedIt (and the
> rest of this wikiweb on refactoring, patterns etc.) for why you
> shouldn't plan ahead this far.

I wasn't planning far ahead at all. Just banging out some code :-) Now
that that piece is (ahem) done, I'll revise the use of ".import_file".

And yes... YouArentGonnaNeedIt is a very familiar mantra to me. You should
have been at MSFT with me to see how many times I wielded that bat against
the developers :-)
[ the mantra applies whole-heartedly to Python; it gets a little less
  rigid when you're talking about hard-to-maintain languages like C++ ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From gstein@lyra.org  Wed Feb 16 23:05:11 2000
From: gstein@lyra.org (Greg Stein)
Date: Wed, 16 Feb 2000 15:05:11 -0800 (PST)
Subject: [Import-sig] freezing (was: Long-awaited imputil comments)
In-Reply-To: <1261388683-4430097@hypernet.com>
Message-ID: <Pine.LNX.4.10.10002161450050.17758-100000@nebula.lyra.org>

On Wed, 16 Feb 2000, Gordon McMillan wrote:
>...
> [Greg]
> > You're asking for something that is impossible in practice. Consider the
> > following code fragment:
> > 
> > ------------------------
> > import my_imp_tools
> > my_imp_tools.HTTPImporter("http://www.lyra.org/greg/python/")
> > 
> > import qp_xml
> > ------------------------
> > 
> > There is no way that you can create a freeze tool that will be able to
> > manage this kind of scenario.
> 
> Sure. Dynamically replace the Importer in __bases__ with a 
> hacked one that doesn't affect sys.modules, grabs the code 
> object and analyzes byte code (like modulefinder does) to find 
> further imports.

The HTTPImporter is parameterized -- analyzing bytecode or a parse tree
won't discover those parameter values (without a lot of work).

You'd have to run the code to get the Importers instantiated and
installed, but then you could have a problem with code that is executing
outside of a classdef or funcdef.

Guido suggested a custom ImportManager, but that would run into the same
kind of problem.

Effectively, I think what needs to happen is that the freeze tool
understands Importer classes. The configuration input to the tool would
specify how to set up the Importers, which the tool would directly query.
In other words... it would still be a pretty custom approach *if* the
application uses any custom import stuff.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From gmcm@hypernet.com  Wed Feb 16 23:03:20 2000
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 16 Feb 2000 18:03:20 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: <200002162147.QAA29542@eric.cnri.reston.va.us>
References: Your message of "Wed, 16 Feb 2000 16:24:45 EST."             <1261386237-4574590@hypernet.com>
Message-ID: <1261380325-4930086@hypernet.com>

[source in archives]
> > Java doesn't do it.
> 
> So?

So if there were demand for it, I would expect JavaSoft to 
invest web real estate in describing it as a feature. They don't. 

> But there are other reasons why you would want to see tracebacks even
> if you're not actively developing.
> 
> Plenty of people distribute mostly-working code to end users and ask
> them to report tracebacks.  E.g. the Ultraseek product from Infoseek
> (used for the python.org search) occasionally displays tracebacks.
> The Zope guys also do this (they hide the traceback in an HTML comment
> I believe, but it's there).
> 
> Sure, you can take a traceback without source lines and match up the
> line numbers manually with your source, assuming you have the exact
> version of the source -- but it's a pain.

It's an inconvenience which I think will cause far less pain and 
suffering than you're predicting.

I can't double click in my browser and go to the source, nor in 
an email containing a traceback. So unless I'm intimately 
familiar with the bug, I'll be entering line numbers into my 
editor anyway.

Let them distribute alphas and betas in source form if that's a 
problem. They'll still enter line numbers into their editors.

> > A developer who wants to run from an archive, yet see (but not 
> > alter) the source at a traceback can do as above (install 
> > source, then archive it).
> 
> That's no option for distributions -- the archive is the only
> distribution!

Archives aren't a convenience for distribution - you zip / tgz 
anyway. 

They're only a minor aid in installing (unless you're talking 
about a "freeze" type situation, in which case you almost 
certainly don't want source) - it's that much less you need to 
unpack, but you'll almost certainly be uncompressing and 
unpacking anyway - even if just to get to the README. 

We've already thrown "disk space" out, since zlib isn't 
everywhere available. 

That leaves speed. We've interfered with that by adopting a 
complex file format, but I can buy the reasoning - the 
existence of tools.

> > Users who don't know and don't care can snip the traceback 
> > and send it to the developer, who can find the source.
> 
> As I said, very inconvenient.
> 
> > Yeah, it can be supported, but Pythonworks is the only people 
> > who are going to use it, and the mad scientist can code it up 
> > in 10 minutes ;-). 
> 
> I didn't say I wanted *you* to code it.  I just said that I want the
> API.  Accessing the source code is a common need in lots of places.
> Adding the source to the archive is a nice solution.
> 
> Why don't you like it?

It complicates based on a predicted need that I think is 
inaccurate. I've got a file folder of nearly 500 msgs about my 
installer, and not one mentions lack of access to source on a 
traceback as a problem. Yes, that's "different", because it's a 
freeze like situation, and people don't make that complaint 
about freeze, either.

Which takes me back to Java as a real life example.

I say the only people who would be bothered are developers 
using archives - and as developers, they have easy ways of 
dealing with it.

OK, I'm being irate. No, it's not that big a deal. Maybe by 
Py3K we'll have agreed on what exception to raise when 
get_source fails...

- Gordon

From gstein@lyra.org  Wed Feb 16 23:13:03 2000
From: gstein@lyra.org (Greg Stein)
Date: Wed, 16 Feb 2000 15:13:03 -0800 (PST)
Subject: [Import-sig] fetching source (was: Long-awaited imputil comments)
In-Reply-To: <200002162052.PAA29056@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.10002161508310.17758-100000@nebula.lyra.org>

On Wed, 16 Feb 2000, Guido van Rossum wrote:
>...
> Which reminds me -- we need to introduce a standard API to retrieve
> the source for a module that's been imported (if it's available at
> all).  I can easily see how archives can be distributed containing
> both .pyc and .py files; the zip access module could easily find the
> .py file on request.

We could do something like the following:

   source = module.__importer__.get_module_source(module)

Note that we also have:

   source = importer.get_source(parent, modname, fqname)

Something along those lines...
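
To make the two calls concrete, here is a toy sketch in present-day
Python. SourceImporter and load() are hypothetical; only the method
names get_source() and get_module_source() and the __importer__
attribute come from the proposal. Returning None when no source is
available is an assumption; raising an exception would be another
option.

```python
import types


class SourceImporter:
    """Toy Importer that keeps sources in a dict (hypothetical)."""

    def __init__(self, sources):
        self.sources = sources      # fully qualified name -> source text

    def get_source(self, parent, modname, fqname):
        # Works even for modules that have not been imported yet.
        return self.sources.get(fqname)

    def get_module_source(self, module):
        # Recover the lookup key from the module itself, so callers
        # such as traceback printers need not know parent/modname.
        return self.sources.get(module.__name__)

    def load(self, fqname):
        module = types.ModuleType(fqname)
        module.__importer__ = self
        exec(compile(self.sources[fqname], fqname, "exec"),
             module.__dict__)
        return module
```

With this in place, a traceback printer holding only a module object
could call module.__importer__.get_module_source(module) without
knowing where the code was loaded from.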

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From guido@python.org  Thu Feb 17 14:46:22 2000
From: guido@python.org (Guido van Rossum)
Date: Thu, 17 Feb 2000 09:46:22 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: Your message of "Wed, 16 Feb 2000 18:03:20 EST."
 <1261380325-4930086@hypernet.com>
References: Your message of "Wed, 16 Feb 2000 16:24:45 EST." <1261386237-4574590@hypernet.com>
 <1261380325-4930086@hypernet.com>
Message-ID: <200002171446.JAA00480@eric.cnri.reston.va.us>

OK, OK.  No need to get all wound up about it.  I'll stop now, after
this:

I don't mind that the implementation for get_source() raises an error
(any error) when the code came from an archive.  I just want a
standard API that people who write alternative code repositories
can implement.  Greg's proposal seems fine:
module.__importer__.get_module_source(module).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From Fredrik Lundh" <effbot@telia.com  Thu Feb 17 14:59:15 2000
From: Fredrik Lundh" <effbot@telia.com (Fredrik Lundh)
Date: Thu, 17 Feb 2000 15:59:15 +0100
Subject: [Import-sig] Re: Long-awaited imputil comments
References: Your message of "Wed, 16 Feb 2000 15:43:54 EST."             <1261388683-4430097@hypernet.com>  <1261386237-4574590@hypernet.com>
Message-ID: <002e01bf7957$9598d4c0$34aab5d4@hagrid>

Gordon wrote:
> Java doesn't do it.

so?

> You can already do it if you install source, then archive it,
> leaving the __file__ attribute alone - IDLE / Pythonwin will pop
> up the source.

pythonworks users install pythonworks in a directory of their
own choosing.  they don't necessarily install source, archive it,
and keep the source files around in the file system.

> Nobody sane is going to put code under active development in
> an archive.

I didn't say that, did I?  Just said that *I* thought
it was a good idea ;-)

(if it makes you feel better about the idea, replace
the word "archive" with "database").

> Yeah, it can be supported, but Pythonworks is the only people
> who are going to use it, and the mad scientist can code it up
> in 10 minutes ;-).

sure, but then he has to ship a custom python library (and
a custom interpreter, for that matter).

</F>

From guido@python.org  Thu Feb 17 14:56:32 2000
From: guido@python.org (Guido van Rossum)
Date: Thu, 17 Feb 2000 09:56:32 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: Your message of "Wed, 16 Feb 2000 14:49:14 PST."
 <Pine.LNX.4.10.10002161357170.17758-100000@nebula.lyra.org>
References: <Pine.LNX.4.10.10002161357170.17758-100000@nebula.lyra.org>
Message-ID: <200002171456.JAA00490@eric.cnri.reston.va.us>

> You're falling right back into the classic import hook problem. If MAL
> alters ImportManager and installs it, then he blows away whatever TP has
> done.

Fair enough.

I believe I was thinking of exactly the situations you were mentioning
later: in legitimate situations where the policy needs to be changed,
such as rexec or a closed app, it would be helpful if the policy was
all implemented as part of ImportManager.

The importers shouldn't typically deal with the policy: they are there
to deal with the intricacies of importing code from a particular
archive format, or from the web (e.g. webDAV :-), or from a database
or version control management system.

If I have a legitimate situation (see above) where I need to change
the policy, I want to be able to subclass one class.  With the current
architecture, I would need to subclass each of the importer classes
that I am using to change the policy.  Instead, I want to be able to
change the ImportManager and hook it up with the existing importers.

(This also suggests that the relationship between the ImportManager
and the _FilesystemImporter should be more loosely coupled.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gstein@lyra.org  Thu Feb 17 17:20:21 2000
From: gstein@lyra.org (Greg Stein)
Date: Thu, 17 Feb 2000 09:20:21 -0800 (PST)
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: <200002171456.JAA00490@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.10002170909490.17758-100000@nebula.lyra.org>

On Thu, 17 Feb 2000, Guido van Rossum wrote:
>...
> If I have a legitimate situation (see above) where I need to change
> the policy, I want to be able to subclass one class.  With the current
> architecture, I would need to subclass each of the importer classes
> that I am using to change the policy.  Instead, I want to be able to
> change the ImportManager and hook it up with the existing importers.

All righty. It looks like we're in agreement on what "legitimate changes
to ImportManager" means. I'll try to capture the essence of this into some
doc/comments somewhere (definitely into doc when this is stable).

However, we still have a tension occurring here:

1) implementing policy in ImportManager assists in single-point policy
   changes for app/rexec situations
2) implementing policy in Importer assists in package-private policy
   changes for normal, operating conditions

I'll see if I can sort out a way to do this. Maybe the Importer class will
implement the methods (which can be overridden to change policy) by
delegating to ImportManager.

> (This also suggests that the relationship between the ImportManager
> and the _FilesystemImporter should be more loosely coupled.)

Per a suggestion from MAL, I'm going to allow a user to pass <fs_imp> at
ImportManager construction time. If you write a custom ImportManager, then
you can pass in your own fs_imp when you instantiate it. I'll also move
the default class (_FilesystemImporter) into a class variable.

Is that the uncoupling you were thinking of?  (we're also uncoupling the
suffixes stuff somewhat)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From guido@python.org  Thu Feb 17 17:27:54 2000
From: guido@python.org (Guido van Rossum)
Date: Thu, 17 Feb 2000 12:27:54 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: Your message of "Thu, 17 Feb 2000 09:20:21 PST."
 <Pine.LNX.4.10.10002170909490.17758-100000@nebula.lyra.org>
References: <Pine.LNX.4.10.10002170909490.17758-100000@nebula.lyra.org>
Message-ID: <200002171727.MAA01339@eric.cnri.reston.va.us>

> However, we still have a tension occurring here:
> 
> 1) implementing policy in ImportManager assists in single-point policy
>    changes for app/rexec situations
> 2) implementing policy in Importer assists in package-private policy
>    changes for normal, operating conditions
> 
> I'll see if I can sort out a way to do this. Maybe the Importer class will
> implement the methods (which can be overridden to change policy) by
> delegating to ImportManager.

Maybe also think about what kind of policies an Importer would be
likely to want to change.  I have a feeling that a lot of the code
there is actually not so much policy but a *necessity* to get things
working given the calling conventions for the __import__ hook: whether
to return the head or tail of a dotted name, or when to do the "finish
fromlist" stuff.

> > (This also suggests that the relationship between the ImportManager
> > and the _FilesystemImporter should be more loosely coupled.)
> 
> Per a suggestion from MAL, I'm going to allow a user to pass <fs_imp> at
> ImportManager construction time. If you write a custom ImportManager, then
> you can pass in your own fs_imp when you instantiate it. I'll also move
> the default class (_FilesystemImporter) into a class variable.
> 
> Is that the uncoupling you were thinking of?  (we're also uncoupling the
> suffixes stuff somewhat)

Great!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gstein@lyra.org  Thu Feb 17 18:01:49 2000
From: gstein@lyra.org (Greg Stein)
Date: Thu, 17 Feb 2000 10:01:49 -0800 (PST)
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: <200002171727.MAA01339@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.10002171000520.578-100000@nebula.lyra.org>

On Thu, 17 Feb 2000, Guido van Rossum wrote:
>...
> Maybe also think about what kind of policies an Importer would be
> likely to want to change.  I have a feeling that a lot of the code
> there is actually not so much policy but a *necessity* to get things
> working given the calling conventions for the __import__ hook: whether
> to return the head or tail of a dotted name, or when to do the "finish
> fromlist" stuff.

Agreed! Thanx for all the feedback.

Time to write some code...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From gstein@lyra.org  Thu Feb 17 18:41:51 2000
From: gstein@lyra.org (Greg Stein)
Date: Thu, 17 Feb 2000 10:41:51 -0800 (PST)
Subject: [Import-sig] getting source (was: Long-awaited imputil comments)
In-Reply-To: <200002171446.JAA00480@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.10002171036510.578-100000@nebula.lyra.org>

On Thu, 17 Feb 2000, Guido van Rossum wrote:
>...
> I don't mind that the implementation for get_source() raises an error
> (any error) when the code came from an archive.

I was thinking of returning None, to follow the get_code() pattern.

> I just want a
> standard API that people who write alternative code repositories
> can implement.  Greg's proposal seems fine:
> module.__importer__.get_module_source(module).

I also plan to have importer.get_source(parent, modname, fqname).

* get_module_source() is needed for things like tracebacks, where it is
  somewhat difficult to recover parent/modname/fqname (also described as:
  why make clients repeat the code to recover that data)

* get_source() is needed for cases where a module hasn't been imported at
  that point

Importer subclasses will only need to implement get_source(). The base
class will extract the parent/modname/fqname from information in the
module.
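
A rough sketch of how the two methods could fit together. It assumes
the base class recovers parent/modname/fqname from module.__name__ and
sys.modules, which is only one possible reading of "information in the
module"; DictImporter is a hypothetical subclass used for illustration.

```python
import sys
import types


class Importer:
    def get_source(self, parent, modname, fqname):
        """Return the source text, or None if it is unavailable.

        Subclasses override this; None follows the get_code() pattern.
        """
        return None

    def get_module_source(self, module):
        # Assumption: the fully qualified name is taken from __name__
        # and the parent package is looked up in sys.modules.
        fqname = module.__name__
        parent_name, _, modname = fqname.rpartition(".")
        parent = sys.modules.get(parent_name) if parent_name else None
        return self.get_source(parent, modname, fqname)


class DictImporter(Importer):
    """Hypothetical importer that keeps source text in a dict."""

    def __init__(self, sources):
        self.sources = sources

    def get_source(self, parent, modname, fqname):
        return self.sources.get(fqname)


# A client such as a traceback printer only needs the module object.
imp = DictImporter({"pkg.mod": "x = 1\n"})
mod = types.ModuleType("pkg.mod")
mod.__importer__ = imp
source = mod.__importer__.get_module_source(mod)
```

The subclass implements only get_source(); a module whose source is not
stored (for example, one loaded from a bytecode-only archive) yields
None instead of raising.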

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From gstein@lyra.org  Fri Feb 18 14:25:21 2000
From: gstein@lyra.org (Greg Stein)
Date: Fri, 18 Feb 2000 06:25:21 -0800 (PST)
Subject: [Import-sig] (partially) updated imputil
Message-ID: <Pine.LNX.4.10.10002180619160.8706-100000@nebula.lyra.org>

I've made some more updates to imputil. Change log and the updated module
are available at:
  http://www.lyra.org/cgi-bin/viewcvs.cgi/gjspy/imputil.py

  (revisions 1.10 and 1.11)

I'm going to be out of town Saturday thru Tuesday. I'll probably do some
more work on imputil before I leave. Not sure if I'll work on it while I'm
gone (or on that LONG plane flight to Boston...)

Anyhow, there is still some more work queued up on it. I haven't made
myself an exhaustive list yet, so I can't list that here. But I'll
probably get that list done before leaving.

Cheers,
-g

p.s. the demo importers are now in a module named importers.py accessible
thru ViewCVS (from the link above, just navigate up one level)

-- 
Greg Stein, http://www.lyra.org/

From gstein@lyra.org  Sat Feb 19 13:50:13 2000
From: gstein@lyra.org (Greg Stein)
Date: Sat, 19 Feb 2000 05:50:13 -0800 (PST)
Subject: [Import-sig] Re: (partially) updated imputil
In-Reply-To: <Pine.LNX.4.10.10002180619160.8706-100000@nebula.lyra.org>
Message-ID: <Pine.LNX.4.10.10002190549070.8706-100000@nebula.lyra.org>

On Fri, 18 Feb 2000, Greg Stein wrote:
> I've made some more updates to imputil. Change log and the updated module
> are available at:
>   http://www.lyra.org/cgi-bin/viewcvs.cgi/gjspy/imputil.py
> 
>   (revisions 1.10 and 1.11)
> 
> 
> I'm going to be out of town Saturday thru Tuesday. I'll probably do some
> more work on imputil before I leave. Not sure if I'll work on it while I'm
> gone (or on that LONG plane flight to Boston...)
> 
> Anyhow, there is still some more work queued up on it. I haven't made
> myself an exhaustive list yet, so I can't list that here. But I'll
> probably get that list done before leaving.

I just checked in rev 1.12 which includes the TODO/wish list.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From gstein@lyra.org  Sat Feb 19 13:26:00 2000
From: gstein@lyra.org (Greg Stein)
Date: Sat, 19 Feb 2000 05:26:00 -0800 (PST)
Subject: [Import-sig] import "domain" question (was: Long-awaited imputil comments)
In-Reply-To: <200002161727.MAA28273@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.10002190522070.8706-100000@nebula.lyra.org>

On Wed, 16 Feb 2000, Guido van Rossum wrote:
>...
> > I've been thinking of something along the lines of
> > _determine_import_context() returning a list of things to try. Default is
> > to return something like [current-context,] + sys.path (not exactly that,
> > but you get the idea). The _import_hook would then operate as a simple
> > scan over that list of places to attempt an import from. MAL could
> > override _determine_import_context() to return the walk-me-up, intervening
> > packages. Tim could just always return sys.path (and never bother trying
> > to determine the current context).
> 
> Yes.  In ni (remember ni?) we had this mechanism; it was called
> "domain" (not a great name for it).  The domain was a list of packages
> where relative imports were sought.  A package could set its domain by
> setting a variable __domain__.  The current policy (current package,
> then toplevel) corresponds to a 2-item domain: [<packagename>, ""]
> (where "" stands for the unnamed toplevel package).
> Walk-me-up-Scotty corresponds to a domain containing the current
> package, its parent, its grandparent, and so on, ending with "".  The
> "no relative imports" policy is represented by [""].
> 
> If we let __domain__ be initialized by the importer but overridden by
> the package, we can do everything we need.

How is this different from __path__ ??

Is it simply that __path__ refers to the filesystem, while __domain__
refers to the package namespace? If so, then that seems like duplicate
functionality. I can easily see constructing a __path__ using the __file__
attribute and "walking up the directory tree". Certainly a bit
nicer/faster than checking two "paths" inside the import system.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From guido@python.org  Mon Feb 21 18:32:39 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 21 Feb 2000 13:32:39 -0500
Subject: [Import-sig] Re: import "domain" question (was: Long-awaited imputil comments)
In-Reply-To: Your message of "Sat, 19 Feb 2000 05:26:00 PST."
 <Pine.LNX.4.10.10002190522070.8706-100000@nebula.lyra.org>
References: <Pine.LNX.4.10.10002190522070.8706-100000@nebula.lyra.org>
Message-ID: <200002211832.NAA03051@eric.cnri.reston.va.us>

[me]
> > Yes.  In ni (remember ni?) we had this mechanism; it was called
> > "domain" (not a great name for it).  The domain was a list of packages
> > where relative imports were sought.  A package could set its domain by
> > setting a variable __domain__.  The current policy (current package,
> > then toplevel) corresponds to a 2-item domain: [<packagename>, ""]
> > (where "" stands for the unnamed toplevel package).
> > Walk-me-up-Scotty corresponds to a domain containing the current
> > package, its parent, its grandparent, and so on, ending with "".  The
> > "no relative imports" policy is represented by [""].
> > 
> > If we let __domain__ be initialized by the importer but overridden by
> > the package, we can do everything we need.

[Greg]
> How is this different from __path__ ??
> 
> Is it simply that __path__ refers to the filesystem, while __domain__
> refers to the package namespace? If so, then that seems like duplicate
> functionality. I can easily see constructing a __path__ using the __file__
> attribute and "walking up the directory tree". Certainly a bit
> nicer/faster than checking two "paths" inside the import system.

No, no, no!

If you are looking for foo.py in sys.path, the full module name will
be "foo", no matter where you find it.  If you are looking for it in
various packages, the module name will be foo *prefixed with the name
of the package where you found it*!  This doesn't affect the importer
much (since they asked for it by foo anyway), but it greatly affects
the sys.modules administration, and it affects what happens when you
have hard name conflicts.
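
To make the distinction concrete, here is a small sketch. The function
names and the policy strings are hypothetical; only the three domain
shapes come from the description quoted above.

```python
def domain_for(package, policy="current"):
    """Return the list of package names searched for a relative import.

    "" stands for the unnamed toplevel package.
    """
    if policy == "none" or not package:
        return [""]                      # no relative imports
    if policy == "current":
        return [package, ""]             # current package, then toplevel
    if policy == "walk-up":
        parts = package.split(".")
        # current package, its parent, its grandparent, ..., then ""
        return [".".join(parts[:i]) for i in range(len(parts), 0, -1)] + [""]
    raise ValueError("unknown policy: %r" % (policy,))


def candidate_names(name, domain):
    """Fully qualified names under which `name` would land in sys.modules."""
    return [pkg + "." + name if pkg else name for pkg in domain]


# Searching a domain: the resulting module name depends on where it is found.
walk_up = candidate_names("foo", domain_for("a.b.c", "walk-up"))

# Searching sys.path-style directories: the name is "foo" whichever
# directory supplies the file, so there is only ever one candidate.
path_style = candidate_names("foo", [""])
```

Each entry in walk_up is a different key in sys.modules, which is why a
domain cannot be folded into a __path__-style list of directories.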

--Guido van Rossum (home page: http://www.python.org/~guido/)