From barry at python.org  Wed Jun 22 18:39:08 2011
From: barry at python.org (Barry Warsaw)
Date: Wed, 22 Jun 2011 12:39:08 -0400
Subject: [Import-SIG] PEP 382 sprint
Message-ID: <20110622123908.4b65c1d5@neurotica.colubris.lan>

Hi folks,

Yesterday, 6 Washington DC area Pythonistas met to work on PEP 382.  I wrote
up a summary based on my notes and blogged about it here:

http://www.wefearchange.org/2011/06/pep-382-sprint-summary.html

Hopefully, the other participants will correct my mistakes and fill in the
holes.   A few other things to mention:

 * I resurrected the import-sig in order to shepherd PEP 382 to landing in
   Python 3.3.  If you're at all interested in helping out, please join the
   mailing list:

   http://mail.python.org/mailman/listinfo/import-sig

 * We created a wiki page to track our results, questions, and plan of action:

   http://wiki.python.org/moin/Pep382Sprint

I want to thank my fellow sprint participants for coming out and really doing
a great job working on this.  And I especially want to thank the PSF for
sponsoring our sprint.  What a great way to encourage Python developers to
meet and work on improving Python!

Cheers,
-Barry

From brett at python.org  Wed Jun 22 22:53:55 2011
From: brett at python.org (Brett Cannon)
Date: Wed, 22 Jun 2011 13:53:55 -0700
Subject: [Import-SIG] Comments on the PEP 382 sprint blog post and wiki
Message-ID: <BANLkTi=W=bsCet+AYEB+r+qWSv+2Zp=xFWpGN6tanobcB84+jQ@mail.gmail.com>

I just finished reading Barry's blog post on the PEP 382 sprint and wanted
to make two comments.

One is about re-implementing zipimport in pure Python. I actually did a
simple implementation in my importers project:
http://code.google.com/p/importers/source/browse/importers/zip.py . It's not
optimized in any way (I have thought about this a couple of times and always
have trouble deciding how to cache the zipfile, or whether to cache it at
all), but it works (at least it did the last time I ran the tests =). So this
is definitely doable and a long-term goal for importlib as I want to replace
as much of the import-related code in the stdlib w/ cleaner, newer
implementations that live in the importlib namespace (e.g., zipimport,
py_compile, etc.).
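
To illustrate the idea (a minimal sketch only, and not the code from the
importers project linked above), a pure-Python loader can read a module's
source out of an archive with the stdlib zipfile module and execute it into
a fresh module object.  The helper name load_from_zip is hypothetical, and
the sketch reopens the archive on every call, which is exactly the caching
question left open above.

```python
import types
import zipfile


def load_from_zip(archive_path, fullname):
    """Load a plain module's source from a zip archive and execute it.

    Hypothetical helper for illustration: it bypasses sys.modules and the
    import hooks entirely, and it handles plain modules only (no packages).
    """
    member = fullname.replace(".", "/") + ".py"
    # No caching: the archive's directory is re-read on every call.
    with zipfile.ZipFile(archive_path) as archive:
        source = archive.read(member)
    module = types.ModuleType(fullname)
    module.__file__ = archive_path + "/" + member
    code = compile(source, module.__file__, "exec")
    exec(code, module.__dict__)
    return module
```

A real importer would also have to register the module in sys.modules and
decide how long to keep the archive's directory listing in memory.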

Two is about the bootstrapping of importlib so as to not have to deal with
as much C code in import.c. This is my next big Python project and I
definitely want to see it happen. I have thought about this off and on for
years and have gotten as far as to create a bootstrap_importlib branch for
my hg.python.org/sandbox/bcannon repo (which is currently just an
out-of-date clone of cpython/default). I figure that getting the code loaded
can be done through using the freeze tool on importlib._bootstrap to make it
always available to the interpreter. The real sticking point in
bootstrapping though, is that importlib relies on _io which itself imports
'os' (and everything that 'os' imports). Now my hope is that in the
bootstrap code during interpreter startup the 'os' import can be postponed
until importlib is up and running, and then do a post-bootstrap init step
for _io that then imports 'os' for its use (the _io module also uses the
locale module, but that is not pulled in by PyInit__io() ). Unfortunately I
have not had the time to see if postponing this import is feasible in terms
of continuing to allow _io to work for importlib; my hope is that
importlib's use of _io is simple enough it will be okay.

Anyway, if people want to help with the bootstrapping just let me know. I
will simply use this mailing list as the place to air ideas and issues that
I come across.

-Brett

From cool-rr at cool-rr.com  Thu Jun 23 20:39:29 2011
From: cool-rr at cool-rr.com (cool-RR)
Date: Thu, 23 Jun 2011 20:39:29 +0200
Subject: [Import-SIG] Anyone wants to try solve a zipimport bug in PyPy?
Message-ID: <BANLkTim2YD9rCsGfxNEsSp1BN84=ySw=nw@mail.gmail.com>

Hello everyone,

I'm hoping that there's someone in this list who understands Python's
zipimport mechanism and has some free time to help with a bug in
zipimporting in PyPy.

This is the bug:

https://bugs.pypy.org/issue725 (Sorry for the self-signed SSL cert.)

Armin and I made a failing test case but I don't know anything about
Python's zipimport mechanism, so this is a shot in the dark: Does anyone
feel like trying to solve this bug?


Thanks,
Ram.

From brett at python.org  Fri Jun 24 01:31:22 2011
From: brett at python.org (Brett Cannon)
Date: Thu, 23 Jun 2011 16:31:22 -0700
Subject: [Import-SIG] Anyone wants to try solve a zipimport bug in PyPy?
In-Reply-To: <BANLkTim2YD9rCsGfxNEsSp1BN84=ySw=nw@mail.gmail.com>
References: <BANLkTim2YD9rCsGfxNEsSp1BN84=ySw=nw@mail.gmail.com>
Message-ID: <BANLkTi=K_vaadd0jvPWX+32ZTbpJAHR0LFrBAoJP=teoSxYcHA@mail.gmail.com>

Just so people know, the failure (according to the bug report) is limited
to Windows.

On Thu, Jun 23, 2011 at 11:39, cool-RR <cool-rr at cool-rr.com> wrote:

> Hello everyone,
>
> I'm hoping that there's someone in this list who understands Python's
> zipimport mechanism and has some free time to help with a bug in
> zipimporting in PyPy.
>
> This is the bug:
>
> https://bugs.pypy.org/issue725 (Sorry for the self-signed SSL cert.)
>
> Armin and I made a failing test case but I don't know anything about
> Python's zipimport mechanism, so this is a shot in the dark: Does anyone
> feel like trying to solve this bug?
>
>
> Thanks,
> Ram.

From pje at telecommunity.com  Fri Jun 24 18:29:41 2011
From: pje at telecommunity.com (P.J. Eby)
Date: Fri, 24 Jun 2011 12:29:41 -0400
Subject: [Import-SIG] Do we really need to read .pth files in PEP 382?
Message-ID: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>

Do we really need to read .pth files in PEP 382?  If so, why?

In a common use case for PEP 382 (large namespace packages like 
zope.*), there will be a long list of .pth files present, each of 
which contains only a '*', but which still must be opened and read by 
the implementation, while adding no new information.

However, if we had instead .ns files or .nspkg files or something 
like that, their mere *existence* could be construed as implying 
namespace-ness, and require no actual opens or reads.

If we separate the "this is a namespace" functionality from "here are 
paths" functionality, ISTM that the "here are paths" functionality is 
already adequately met by the existing .pth machinery.  (As I'm not 
aware of any real-world use cases for pkgutil's .pkg files -- but 
perhaps someone can enlighten me on that?)

Another consequence of this change is that it would simplify the PEP 
302 extension: instead of asking importers or loaders for a path, one 
could simply ask the importer whether a namespace exists, e.g.:

    finder.namespace_exists(fullname)

Returning either a subpath to put in __path__, or None if the named 
package is not a namespace.
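
As a rough sketch of what such a method could look like on a filesystem
finder: the class name MarkerFinder and the '.ns' suffix are assumptions
made for illustration (the suffix is still open in this thread), and nothing
here is part of PEP 302 or PEP 382.  The point is that only a directory
listing is needed; no marker file is ever opened or read.

```python
import os


class MarkerFinder(object):
    """Hypothetical finder for a single path entry (illustration only)."""

    MARKER_SUFFIX = ".ns"  # assumed; the thread has not settled on a suffix

    def __init__(self, path_entry):
        self.path_entry = path_entry

    def namespace_exists(self, fullname):
        """Return the subpath for `fullname`, or None if not a namespace."""
        # For 'zope.app' searched on zope.__path__, only 'app' matters.
        subpath = os.path.join(self.path_entry, fullname.rpartition(".")[2])
        try:
            names = os.listdir(subpath)
        except OSError:
            return None  # missing, not a directory, or unreadable
        for name in names:
            # Existence alone is the signal; the file is never opened.
            if name.endswith(self.MARKER_SUFFIX):
                return subpath
        return None
```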

If a regular package is found before a namespace, the normal protocol 
operates.  If a namespace is found, walk all remaining finders, 
adding any non-None path entries returned by namespace_exists(), and 
also invoking the first loader returned by a namespace-supplying finder.

Something like:

     path_iter = iter(current_path) # sys.path or a package.__path__

     for path_entry in path_iter:
         finder = get_importer(path_entry)

         # This 'if' block is the only addition to the existing loop:
         if hasattr(finder, 'namespace_exists'):
             subpath = finder.namespace_exists(fullname)
             if subpath is not None:
                 break  # go handle the nspkg case

         loader = finder.find_module(fullname)
         if loader is not None:
             return loader.load_module(fullname)
     else:
         raise ImportError

     # Ok, we have a namespace package, so handle it:
     module = sys.modules[fullname] = new.module(fullname)
     sys.namespace_packages.add(fullname)
     module.__path__ = [subpath]
     loader = finder.find_module(fullname)
     if loader is not None:
         loader.load_module(fullname)

     for path_entry in path_iter:  # resume iteration
         finder = get_importer(path_entry)
         if hasattr(finder, 'namespace_exists'):
             subpath = finder.namespace_exists(fullname)
             if subpath is not None:
                 if subpath not in module.__path__:
                     module.__path__.append(subpath)
                 if loader is None:
                     loader = finder.find_module(fullname)
                     if loader is not None:
                         loader.load_module(fullname)

There are some variations possible in this algorithm; you could for 
example roll the two loops into one, by using 'loader' and 'module' 
as flags.  But the modifications needed to PEP 302 loaders are 
minimal, almost trivial.
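
One possible shape for the single-loop variation, as a sketch under stated
assumptions rather than a reference implementation: the function name
import_with_namespaces is hypothetical, get_importer is passed in as a
parameter, a module-level set stands in for the proposed
sys.namespace_packages, and types.ModuleType replaces the 2.x-only
new.module so the sketch also runs on 3.x.

```python
import sys
import types

namespace_packages = set()  # stand-in for the proposed sys.namespace_packages


def import_with_namespaces(fullname, path, get_importer):
    """Single-loop variant: 'module' and 'loader' double as state flags."""
    module = None  # set once the first namespace portion is found
    loader = None  # set once some portion has supplied a loader
    for path_entry in path:
        finder = get_importer(path_entry)
        if finder is None:
            continue
        subpath = None
        if hasattr(finder, "namespace_exists"):
            subpath = finder.namespace_exists(fullname)
        if subpath is not None:
            if module is None:
                module = sys.modules[fullname] = types.ModuleType(fullname)
                module.__path__ = []
                namespace_packages.add(fullname)
            if subpath not in module.__path__:
                module.__path__.append(subpath)
        elif module is not None:
            continue  # already a namespace: ignore non-namespace entries
        if loader is None:
            loader = finder.find_module(fullname)
            if loader is not None:
                result = loader.load_module(fullname)
                if module is None:
                    return result  # a regular package or module came first
    if module is None:
        raise ImportError(fullname)
    return module
```

As in the two-loop version, a regular package found before any namespace
portion wins outright, and only the first loader offered by a
namespace-supplying finder is invoked.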

By comparison, the current proposal seems a bit overweight, 
considering that PEP 382 does not provide any use-case rationale for 
supporting anything besides '*' in .pkg files.  In fact, there 
doesn't seem to be any reason to put the '*' in the __path__ -- 
sys.namespace_packages suffices to indicate namespace-ness.  Code 
that wishes to extend existing namespace packages (e.g. setuptools) 
can simply perform the equivalent of the second loop above on any new 
path entries, for all entries in sys.namespace_packages.  (Well, not 
*all* entries, but all those that recursively yield reachable 
namespace additions.)
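
A sketch of how such an extension step might look, again under assumptions:
extend_namespaces is a hypothetical helper, namespace_packages stands in for
the proposed sys.namespace_packages, and get_importer is passed in.  The
recursion handles the "reachable" part: a nested namespace is only checked
on subpaths that were just added to its parent.

```python
import sys


def extend_namespaces(path_entry, get_importer, namespace_packages,
                      parent="", modules=None):
    """Add `path_entry`'s portions to already-imported namespace packages.

    Hypothetical helper; `namespace_packages` stands in for the proposed
    sys.namespace_packages set.
    """
    if modules is None:
        modules = sys.modules
    finder = get_importer(path_entry)
    if finder is None or not hasattr(finder, "namespace_exists"):
        return
    for fullname in sorted(namespace_packages):
        # Only direct children of `parent` can live under this path entry.
        if fullname.rpartition(".")[0] != parent:
            continue
        module = modules.get(fullname)
        if module is None:
            continue  # registered but not imported yet
        subpath = finder.namespace_exists(fullname)
        if subpath is None or subpath in module.__path__:
            continue
        module.__path__.append(subpath)
        # The new subpath may in turn hold portions of nested namespaces.
        extend_namespaces(subpath, get_importer, namespace_packages,
                          parent=fullname, modules=modules)
```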

Thoughts, anyone?


From martin at v.loewis.de  Fri Jun 24 19:48:06 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 24 Jun 2011 19:48:06 +0200
Subject: [Import-SIG] Do we really need to read .pth files in PEP 382?
In-Reply-To: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>
References: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>
Message-ID: <4E04CDD6.90102@v.loewis.de>

Am 24.06.2011 18:29, schrieb P.J. Eby:
> Do we really need to read .pth files in PEP 382?  If so, why?

"Do we really need?" - of course not; life goes on even without
Python at all. "Are there use cases for it?" I think so.

I think the same question can be asked about .pth files in the
first place: "do we really need them"? Certainly not; people could
have done everything they do even without .pth files.

In any case, the motivation for using .pth files in PEP 382
was that I considered it a natural extension to the mechanism that
was already there.

The use case I could imagine is what I think was the original use
case for .pth files as well: allow users to contribute their own
stuff to a package without having to modify a central location. The
sysadmin could create a writable file /usr/lib/python.../zope/pje.pth,
and you could then add stuff to the zope package without having write
access to its package folder.

> If we separate the "this is a namespace" functionality from "here are
> paths" functionality, ISTM that the "here are paths" functionality is
> already adequately met by the existing .pth machinery.

I agree that it would simplify the PEP to not have to look into the
contents of a .pth file. Before discussing what the implementation
would look like exactly, I'd rather first establish whether there
is agreement to drop the feature.

So: anybody opposed to not being able to specify the path of a package
in a declarative manner?

Regards,
Martin

P.S. FWIW, my approach to a checkmark for namespace-ness of a directory
would be to have a directory name extension, say .ns or .py, indicating
that this directory is a namespaced Python package - that would drop
the need for special files at all (I actually think I'd prefer .py over
.ns, despite the risk for confusion with Python source files).

From pje at telecommunity.com  Fri Jun 24 20:35:22 2011
From: pje at telecommunity.com (P.J. Eby)
Date: Fri, 24 Jun 2011 14:35:22 -0400
Subject: [Import-SIG] Do we really need to read .pth files in PEP  382?
Message-ID: <20110624183534.A14683A4093@sparrow.telecommunity.com>

At 07:48 PM 6/24/2011 +0200, Martin v. Löwis wrote:
>The use case I could imagine is what I think was the original use
>case for .pth files as well: allow users to contribute their own
>stuff to a package without having to modify a central location. The
>sysadmin could create a writable file /usr/lib/python.../zope/pje.pth,
>and you could then add stuff to the zope package without having write
>access to its package folder.

But as long as zope is flagged as a namespace package, those 
additions can be anywhere on sys.path (using an appropriate 
subdirectory structure), so having .pth file contents doesn't really 
add anything to this use case.

(Except the ability to have more-obscure directory names and to make 
it harder to find out where a module is being imported from, of 
course.  But I'm not sure that's really a feature. ;-) )


>P.S. FWIW, my approach to a checkmark for namespace-ness of a directory
>would be to have a directory name extension, say .ns or .py, indicating
>that this directory is a namespaced Python package - that would drop
>the need for special files at all (I actually think I'd prefer .py over
>.ns, despite the risk for confusion with Python source files).

The main reason I favor files over a directory name change is 
backward compatibility and ease of extension/upgrade.  With flag 
files, existing tools need only add a flag file to be conformant, and 
if they contain an __init__.py, then they are still compatible with 
older Pythons.

(And even if we did use a directory name, I think .py would introduce 
problems in existing code that globs for *.py files and tries to read them.)
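
To make the globbing concern concrete, here is a small sketch (the helper
name python_source_files is made up for illustration): glob matches on names
only, so a directory called 'zope.py' is returned alongside real source
files, and code that then opens every match would fail on it unless it
filters for regular files first.

```python
import glob
import os


def python_source_files(directory):
    """Return the *.py entries in `directory` that are regular files."""
    pattern = os.path.join(directory, "*.py")
    # glob matches on names only, so a directory called 'zope.py' matches too.
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))
```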


From barry at python.org  Fri Jun 24 22:51:23 2011
From: barry at python.org (Barry Warsaw)
Date: Fri, 24 Jun 2011 16:51:23 -0400
Subject: [Import-SIG] Do we really need to read .pth files in PEP 382?
In-Reply-To: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>
References: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>
Message-ID: <20110624165123.7c73647f@neurotica.wooz.org>

On Jun 24, 2011, at 12:29 PM, P.J. Eby wrote:

>Do we really need to read .pth files in PEP 382?  If so, why?

It's an excellent question, Philip.  I know for the primary use case *I* care
about, the contents are really of no additional benefit.  And import.c does
carry a lot of extra complexity to read, parse, and combine them.  I suspect
that the vast majority of uses will have a single '*' in it and nothing more.

I also find it confusing that they're almost, but not quite, like site.py's
.pth files: they share a file suffix, but no common implementation in the
core.

>In a common use case for PEP 382 (large namespace packages like zope.*), there
>will be a long list of .pth files present, each of which contains only a '*',
>but which still must be opened and read by the implementation, while adding
>no new information.

And while we suspect that stat calls will dominate, opening, parsing, and
collating the file contents when it will almost always be just a '*' seems
wasteful.

Unlike traditional .pth files, lines starting with 'import' are not supported
in PEP 382.  That alone makes me uncomfortable with calling them `.pth`
files.  But like traditional .pth files, any line that the parser doesn't
recognize (i.e. that neither starts with a # nor is a single *) is taken as a
file system path, which itself must be stat'd before either being added or
discarded.  What's the point?
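
For illustration, the per-line handling described above can be sketched in a
few lines of Python.  This is not the import.c code; the function name
parse_pth_lines is hypothetical, and two details are assumptions: blank
lines are skipped, and relative paths are resolved against a base_dir
argument and kept only if they name an existing directory.

```python
import os


def parse_pth_lines(lines, base_dir="."):
    """Return (is_namespace, extra_paths) for PEP 382-style .pth content.

    Illustrative only.  Assumptions: blank lines are skipped, and relative
    paths are resolved against `base_dir`.
    """
    is_namespace = False
    extra_paths = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line == "*":
            is_namespace = True
            continue
        # Anything else is treated as a path and costs one stat() call.
        candidate = os.path.join(base_dir, line)
        if os.path.isdir(candidate):
            extra_paths.append(candidate)
    return is_namespace, extra_paths
```

Note that under this reading an 'import ...' line is not executed; it is
simply treated as a path, stat'd, and discarded.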

>However, if we had instead .ns files or .nspkg files or something like that,
>their mere *existence* could be construed as implying namespace-ness, and
>require no actual opens or reads.

+1.  Can we call YAGNI on anything else?  Certainly it would be easy enough to
extend the syntax of .ns files (or whatever color the bikeshed ends up getting
painted) should the need arise.  But if contents are meaningful from the start,
even if in practice it's never used, it'll be baggage we have to carry along
forever.

>If we separate the "this is a namespace" functionality from "here are paths"
>functionality, ISTM that the "here are paths" functionality is already
>adequately met by the existing .pth machinery.  (As I'm not aware of any
>real-world use cases for pkgutil's .pkg files -- but perhaps someone can
>enlighten me on that?)
>
>Another consequence of this change is that it would simplify the PEP 302
>extension: instead of asking importers or loaders for a path, one could
>simply ask the importer whether a namespace exists, e.g.:
>
>    finder.namespace_exists(fullname)
>
>Returning either a subpath to put in __path__, or None if the named package
>is not a namespace.

While I support a simpler (and frankly, optional) additional API requirement
on PEP 302, I'd expect a function such as `namespace_exists()` to return a
pure boolean.  If it must return a subpath, then I'd prefer some other name.
Or maybe the __path__ could be passed in, allowing the finder to muck with it
however it wants.  Too dangerous?

>By comparison, the current proposal seems a bit overweight, considering that
>PEP 382 does not provide any use-case rationale for supporting anything
>besides '*' in .pkg files.  In fact, there doesn't seem to be any reason to
>put the '*' in the __path__ -- sys.namespace_packages suffices to indicate
>namespace-ness.  Code that wishes to extend existing namespace packages
>(e.g. setuptools) can simply perform the equivalent of the second loop above
>on any new path entries, for all entries in sys.namespace_packages.  (Well,
>not *all* entries, but all those that recursively yield reachable namespace
>additions.)
>
>Thoughts, anyone?

+1 for throwing away anything in PEP 382 we don't absolutely need.

-Barry

From barry at python.org  Fri Jun 24 23:07:58 2011
From: barry at python.org (Barry Warsaw)
Date: Fri, 24 Jun 2011 17:07:58 -0400
Subject: [Import-SIG] Do we really need to read .pth files in PEP 382?
In-Reply-To: <4E04CDD6.90102@v.loewis.de>
References: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>
	<4E04CDD6.90102@v.loewis.de>
Message-ID: <20110624170758.1f04384c@neurotica.wooz.org>

On Jun 24, 2011, at 07:48 PM, Martin v. Löwis wrote:

>In any case, the motivation for using .pth files in PEP 382
>was that I considered it a natural extension to the mechanism that
>was already there.

Except, they're not really an extension, but more of a narrowing of the
current conventions (i.e. they don't support `import` lines).

>The use case I could imagine is what I think was the original use
>case for .pth files as well: allow users to contribute their own
>stuff to a package without having to modify a central location. The
>sysadmin could create a writable file /usr/lib/python.../zope/pje.pth,
>and you could then add stuff to the zope package without having write
>access to its package folder.
>
>> If we separate the "this is a namespace" functionality from "here are
>> paths" functionality, ISTM that the "here are paths" functionality is
>> already adequately met by the existing .pth machinery.
>
>I agree that it would simplify the PEP to not have to look into the
>contents of a .pth file. Before discussing what the implementation
>would look like exactly, I'd rather first establish whether there
>is agreement to drop the feature.
>
>So: anybody opposed to not being able to specify the path of a package
>in a declarative manner?

I'm not opposed to dropping this feature.

>P.S. FWIW, my approach to a checkmark for namespace-ness of a directory
>would be to have a directory name extension, say .ns or .py, indicating
>that this directory is a namespaced Python package - that would drop
>the need for special files at all (I actually think I'd prefer .py over
>.ns, despite the risk for confusion with Python source files).

I agree with PJE's follow up on favoring flag files over special directory
names.  I'd like for the flag files to be named something other than .pth
though.  .ns seems fine to me and doesn't have any obvious collisions:

http://en.wikipedia.org/wiki/List_of_file_formats_(alphabetical)#N

Cheers,
-Barry

From pje at telecommunity.com  Fri Jun 24 23:15:04 2011
From: pje at telecommunity.com (P.J. Eby)
Date: Fri, 24 Jun 2011 17:15:04 -0400
Subject: [Import-SIG] Do we really need to read .pth files in PEP  382?
In-Reply-To: <20110624165123.7c73647f@neurotica.wooz.org>
References: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>
	<20110624165123.7c73647f@neurotica.wooz.org>
Message-ID: <20110624211517.26D403A4093@sparrow.telecommunity.com>

At 04:51 PM 6/24/2011 -0400, Barry Warsaw wrote:
>While I support a simpler (and frankly, optional) additional API requirement
>on PEP 302, I'd expect a function such as `namespace_exists()` to return a
>pure boolean.  If it must return a subpath, then I'd prefer some other name.
>Or maybe the __path__ could be passed in, allowing the finder to muck with it
>however it wants.  Too dangerous?

I'm fine with another name. (namespace_subpath(), perhaps?)  I've 
coded up a sketch of a 2.x implementation (enabled via "from pep382 
import meta_importer; meta_importer.install()") using that basic API 
approach, and it's pretty clean.

(I don't think manipulating the __path__ directly would add anything, 
though, because it would only ever contain *previous* entries, not 
the entire __path__.)


From barry at python.org  Fri Jun 24 23:18:26 2011
From: barry at python.org (Barry Warsaw)
Date: Fri, 24 Jun 2011 17:18:26 -0400
Subject: [Import-SIG] Do we really need to read .pth files in PEP  382?
In-Reply-To: <20110624211517.26D403A4093@sparrow.telecommunity.com>
References: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>
	<20110624165123.7c73647f@neurotica.wooz.org>
	<20110624211517.26D403A4093@sparrow.telecommunity.com>
Message-ID: <20110624171826.572d6f0f@neurotica.wooz.org>

On Jun 24, 2011, at 05:15 PM, P.J. Eby wrote:

>I'm fine with another name. (namespace_subpath(), perhaps?)

Done!  Bikeshed painted! :)

>I've coded up a sketch of a 2.x implementation (enabled via "from pep382
>import meta_importer; meta_importer.install()") using that basic API
>approach, and it's pretty clean.

I would love to be able to back port this to Python 3.2.  Supporting clean
namespace packages in Python 2.{6,7} would be double win.

Do you want to push the code to Cheeseshop, or make a branch available
somewhere?

>(I don't think manipulating the __path__ directly would add anything, though,
>because it would only ever contain *previous* entries, not the entire
>__path__.)

Sounds good to me.
-Barry


From pje at telecommunity.com  Sat Jun 25 00:24:41 2011
From: pje at telecommunity.com (P.J. Eby)
Date: Fri, 24 Jun 2011 18:24:41 -0400
Subject: [Import-SIG] Do we really need to read .pth files in PEP 382?
In-Reply-To: <20110624171826.572d6f0f@neurotica.wooz.org>
References: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>
	<20110624165123.7c73647f@neurotica.wooz.org>
	<20110624211517.26D403A4093@sparrow.telecommunity.com>
	<20110624171826.572d6f0f@neurotica.wooz.org>
Message-ID: <20110624222454.4434C3A4093@sparrow.telecommunity.com>

At 05:18 PM 6/24/2011 -0400, Barry Warsaw wrote:
>On Jun 24, 2011, at 05:15 PM, P.J. Eby wrote:
> >I'm fine with another name. (namespace_subpath(), perhaps?)
>
>Done!  Bikeshed painted! :)
>
> >I've coded up a sketch of a 2.x implementation (enabled via "from pep382
> >import meta_importer; meta_importer.install()") using that basic API
> >approach, and it's pretty clean.
>
>I would love to be able to back port this to Python 3.2.

Well, my sketch is probably forward-portable, but my brain hasn't 
been ported to 3.x yet, so you'll have to translate it yourself.  ;-)


>Supporting clean namespace packages in Python 2.{6,7} would be double win.

I guess supporting it for 2.3, 2.4, and 2.5 would be triple win, then.  ;-)


>Do you want to push the code to Cheeseshop, or make a branch available
>somewhere?

When I say "sketch", I mean it's just some code typed into a file 
where I sketch bits of Python code - it's not been tested or even 
compiled, let alone run.  ;-)

I'll stick it in a pastebin for you, though; it's barely more than 150 lines:

    http://pastebin.com/uFQ9iwXQ

Some bits of the code are a little hairy, since they're basically 
dynamically adding namespace_subpath() to existing importer objects, 
or working around the fact that existing loaders expect that *they* 
will set the module __path__.  There's then some further hairiness 
created by the need to set __loader__ and __file__ correctly when the 
workaround for the preceding problem is applied.  ;-)

For Python 2.5+, this code can stand alone, but for 2.3 and 2.4 it 
uses pkg_resources from setuptools, as those versions of the Python 
stdlib don't include pkgutil.ImpImporter and pkgutil.get_importer.

Finally, note that the code is designed to be used with explicit 
registration by default: unless you call meta_importer.install(), 
namespace packages have to be registered with 
'pep382.register_namespace()'.  This is a compromise for import 
performance reasons, as the search for '.ns'-containing 
subdirectories adds extra stat() calls per sys.path entry per import, 
which can be hugely slow on a network.

Explicit registration avoids this overhead for anything that's not 
already known ahead of time to be a namespace package, and 
pkg_resources already uses explicit registration, so it's no big deal 
to hook it up there.  The plan would be for setuptools 0.7 to write 
.pth files that "import pep382; 
pep382.register_namespace('whatever')" in place of its existing 
__path__-hackery, as well as for setuptools' declare_namespace() et 
al to use pep382 code behind-the-scenes.

However, for non-setuptools uses, a simple "meta_importer.install()" 
suffices to enable the module in a fully PEP-compatible mode, at the 
cost of added stat() calls.  (The official PEP 382 implementation can 
avoid the extra stat() calls because it'll be directly modifying the 
stdlib importers, rather than working around them.)

Anyway... reminder: this code is an UNTESTED SKETCH with LIMITED 
DOCUMENTATION, and it might contain errors ranging from simple typos 
to subtle-yet-awful design flaws that won't surface for months or 
years.  You have been warned.  ;-)

(Hm, actually, I just thought of a possible bug/oversight -- this 
code neither checks nor ensures that parent packages of a 
namespace package are registered as namespaces.  That's okay for 
setuptools' purposes, but might lead to incompatibilities between 
third-party code registering packages and third-party code inspecting 
namespace packages and expecting the "parent of a namespace is also a 
namespace" invariant to apply.)


From ncoghlan at gmail.com  Sat Jun 25 01:36:14 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 25 Jun 2011 09:36:14 +1000
Subject: [Import-SIG] Do we really need to read .pth files in PEP 382?
In-Reply-To: <4E04CDD6.90102@v.loewis.de>
References: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>
	<4E04CDD6.90102@v.loewis.de>
Message-ID: <BANLkTinCvTFNk4kv9N3bV58hLYXswYXZtA@mail.gmail.com>

Just a comment on MvL's "user additions" use case: with the per-user site
packages dir, it should be easy for any user to extend any namespace
package. The security aspects should be OK, as I believe the installed
subpackages will take precedence.

--
Nick Coghlan (via Gmail on Android, so likely to be more terse than usual)

From barry at python.org  Sun Jun 26 13:46:35 2011
From: barry at python.org (Barry Warsaw)
Date: Sun, 26 Jun 2011 12:46:35 +0100
Subject: [Import-SIG] Do we really need to read .pth files in PEP 382?
In-Reply-To: <BANLkTinCvTFNk4kv9N3bV58hLYXswYXZtA@mail.gmail.com>
References: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>
	<4E04CDD6.90102@v.loewis.de>
	<BANLkTinCvTFNk4kv9N3bV58hLYXswYXZtA@mail.gmail.com>
Message-ID: <20110626124635.19fca00c@snowdog>

On Jun 25, 2011, at 09:36 AM, Nick Coghlan wrote:

>Just a comment on MvL's "user additions" use case: with the per-user site
>packages dir, it should be easy for any user to extend any namespace
>package. The security aspects should be OK, as I believe the installed
>subpackages will take precedence.

But does anybody actually do this?

-Barry

From ncoghlan at gmail.com  Sun Jun 26 13:51:10 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 26 Jun 2011 21:51:10 +1000
Subject: [Import-SIG] Do we really need to read .pth files in PEP 382?
In-Reply-To: <20110626124635.19fca00c@snowdog>
References: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>
	<4E04CDD6.90102@v.loewis.de>
	<BANLkTinCvTFNk4kv9N3bV58hLYXswYXZtA@mail.gmail.com>
	<20110626124635.19fca00c@snowdog>
Message-ID: <BANLkTi=f87jnG64Vr-QgUUHPK4u7CA7upg@mail.gmail.com>

On Sun, Jun 26, 2011 at 9:46 PM, Barry Warsaw <barry at python.org> wrote:
> On Jun 25, 2011, at 09:36 AM, Nick Coghlan wrote:
>
>>Just a comment on MvL's "user additions" use case: with the per-user site
>>packages dir, it should be easy for any user to extend any namespace
>>package. The security aspects should be OK, as I believe the installed
>>subpackages will take precedence.
>
> But does anybody actually do this?

I have no idea - I was merely pointing out that even that hypothetical
use case is still covered with the no-content approach.

The updated PEP should be explicit about whether or not the marker
file is required to be empty, or if the package is allowed to use it
to store arbitrary data, though.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From barry at python.org  Sun Jun 26 13:53:54 2011
From: barry at python.org (Barry Warsaw)
Date: Sun, 26 Jun 2011 12:53:54 +0100
Subject: [Import-SIG] Do we really need to read .pth files in PEP 382?
In-Reply-To: <BANLkTi=f87jnG64Vr-QgUUHPK4u7CA7upg@mail.gmail.com>
References: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>
	<4E04CDD6.90102@v.loewis.de>
	<BANLkTinCvTFNk4kv9N3bV58hLYXswYXZtA@mail.gmail.com>
	<20110626124635.19fca00c@snowdog>
	<BANLkTi=f87jnG64Vr-QgUUHPK4u7CA7upg@mail.gmail.com>
Message-ID: <20110626125354.2b61372f@snowdog>

On Jun 26, 2011, at 09:51 PM, Nick Coghlan wrote:

>The updated PEP should be explicit about whether or not the marker
>file is required to be empty, or if the package is allowed to use it
>to store arbitrary data, though.

My preference would be for the PEP to state that the contents are ignored.

-Barry

From pje at telecommunity.com  Mon Jun 27 03:51:00 2011
From: pje at telecommunity.com (P.J. Eby)
Date: Sun, 26 Jun 2011 21:51:00 -0400
Subject: [Import-SIG] Do we really need to read .pth files in PEP  382?
In-Reply-To: <20110626125354.2b61372f@snowdog>
References: <20110624163002.7B38E3A4093@sparrow.telecommunity.com>
	<4E04CDD6.90102@v.loewis.de>
	<BANLkTinCvTFNk4kv9N3bV58hLYXswYXZtA@mail.gmail.com>
	<20110626124635.19fca00c@snowdog>
	<BANLkTi=f87jnG64Vr-QgUUHPK4u7CA7upg@mail.gmail.com>
	<20110626125354.2b61372f@snowdog>
Message-ID: <20110627015059.75E8D3A403A@sparrow.telecommunity.com>

At 12:53 PM 6/26/2011 +0100, Barry Warsaw wrote:
>On Jun 26, 2011, at 09:51 PM, Nick Coghlan wrote:
> >The updated PEP should be explicit about whether or not the marker
> >file is required to be empty, or if the package is allowed to use it
> >to store arbitrary data, though.
>
>My preference would be for the PEP to state that the contents are ignored.

I'd be more inclined to say it's supposed to be an empty file, or 
contain only whitespace.  (That way, if we allow it to be something 
else in the future, tools trying to read it are starting from a known 
state.)
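
A minimal sketch of what the "empty or whitespace only" rule could look
like to a tool that reads marker files.  The function name
marker_is_valid is hypothetical and not part of any proposal in this
thread; it only illustrates the check.

```python
def marker_is_valid(path):
    """Return True if the marker file is empty or holds only whitespace.

    Hypothetical helper: a tool could refuse (or warn about) marker
    files that fail this check, so that future versions remain free to
    give the contents a meaning.
    """
    with open(path, "r", encoding="utf-8") as marker:
        return marker.read().strip() == ""
```

Under Barry's alternative (contents are ignored), no such check would be
needed at all; the sketch only applies to the stricter wording.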

Also, I'm at least somewhat inclined to mandate a naming convention 
for the files, based on distribution names, but I'd be okay with just 
making that a recommendation rather than a requirement.


From ericsnowcurrently at gmail.com  Mon Jun 27 04:52:52 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sun, 26 Jun 2011 20:52:52 -0600
Subject: [Import-SIG] PEP 382 as an import hook
Message-ID: <BANLkTinzZ1nnZasH5Dord=raLej+KxpaUA@mail.gmail.com>

Would a PEP 302 import hook to implement PEP 382 be too inefficient?
What about just for testing out the PEP?  Maybe I'm missing something,
but it seems like it should be doable and would enable PEP 382 for
2.x.

It would have to:

1. handle adding the find_path and load_module_with_path calls,
2. handle augmenting importlib's _DefaultPathFinder or an equivalent
to handle the .pth files.

I'll probably take a stab at writing one up in the next week or two,
regardless.  It's actually a pretty similar use case to another
project I am working on.  I'll admit that without importlib the second
part will require some messiness.  The import engine would be helpful
too.

Alternately you could replace sys.meta_path with a subclass of list
that effectively wraps each importer added to it with functionality
similar to the above hook.

-eric

From martin at v.loewis.de  Mon Jun 27 10:07:02 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 27 Jun 2011 10:07:02 +0200
Subject: [Import-SIG] PEP 382 as an import hook
In-Reply-To: <BANLkTinzZ1nnZasH5Dord=raLej+KxpaUA@mail.gmail.com>
References: <BANLkTinzZ1nnZasH5Dord=raLej+KxpaUA@mail.gmail.com>
Message-ID: <4E083A26.7070500@v.loewis.de>

On 27.06.2011 04:52, Eric Snow wrote:
> Would a PEP 302 import hook to implement PEP 382 be too inefficient?
> What about just for testing out the PEP? 

I think everybody agrees that it would be desirable to have it - it's
just that nobody has managed to implement it so far, as it is really,
really difficult.

Regards,
Martin

From pje at telecommunity.com  Mon Jun 27 20:35:12 2011
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 27 Jun 2011 14:35:12 -0400
Subject: [Import-SIG] PEP 382 as an import hook
In-Reply-To: <BANLkTinzZ1nnZasH5Dord=raLej+KxpaUA@mail.gmail.com>
References: <BANLkTinzZ1nnZasH5Dord=raLej+KxpaUA@mail.gmail.com>
Message-ID: <20110627183506.F1E793A4100@sparrow.telecommunity.com>

At 08:52 PM 6/26/2011 -0600, Eric Snow wrote:
>Would a PEP 302 import hook to implement PEP 382 be too inefficient?
>What about just for testing out the PEP?  Maybe I'm missing something,
>but it seems like it should be doable and would enable PEP 382 for
>2.x.
>
>It would have to:
>
>1. handle adding the find_path and load_module_with_path calls,
>2. handle augmenting importlib's _DefaultPathFinder or an equivalent
>to handle the .pth files.
>
>I'll probably take a stab at writing one up in the next week or two,
>regardless.

If you want it for Python 2.x, take a look at the sketch I linked here:

   http://mail.python.org/pipermail/import-sig/2011-June/000198.html

It'd probably port cleanly to 3.x, too, if that's what you want it for. 


From pje at telecommunity.com  Mon Jun 27 20:41:20 2011
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 27 Jun 2011 14:41:20 -0400
Subject: [Import-SIG] PEP 382 as an import hook
In-Reply-To: <4E083A26.7070500@v.loewis.de>
References: <BANLkTinzZ1nnZasH5Dord=raLej+KxpaUA@mail.gmail.com>
	<4E083A26.7070500@v.loewis.de>
Message-ID: <20110627184112.48BD13A4100@sparrow.telecommunity.com>

At 10:07 AM 6/27/2011 +0200, Martin v. Löwis wrote:
>On 27.06.2011 04:52, Eric Snow wrote:
> > Would a PEP 302 import hook to implement PEP 382 be too inefficient?
> > What about just for testing out the PEP?
>
>I think everybody agrees that it would be desirable to have it - it's
>just that nobody has managed to implement it so far, as it is really,
>really difficult.

It's not *that* difficult.  You just need to use a meta hook, and pay 
some performance overhead.

The overhead comes in two places: first, you have to do an extra 
isdir() (or trap an exception from listdir()), and second, there's 
extra overhead before builtin/frozen imports, or before an 
ImportError is raised for missing modules.

If you are willing to pay those overheads (and the necessary extra 
code to support namespace checking for the builtin importer types), a 
sys.meta_path hook should be sufficient.  (Per my proof-of-concept.)
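
A rough sketch of what such a meta hook might look like.  It is NOT the
proof-of-concept referred to above.  Assumptions: it is written against
the find_spec() finder protocol of current Python 3 (which postdates
this thread; PEP 302 itself specifies find_module()/load_module()), the
class name NsMarkerFinder is made up, and the ".ns" marker suffix is
assumed.

```python
import importlib.machinery
import os
import sys


class NsMarkerFinder:
    """Hypothetical sys.meta_path hook for marker-file namespace packages."""

    @staticmethod
    def _has_marker(directory):
        # The extra filesystem cost mentioned above: one listdir() per
        # path entry, trapping the error for missing or non-directories.
        try:
            entries = os.listdir(directory)
        except OSError:
            return False
        return any(entry.endswith(".ns") for entry in entries)

    def find_spec(self, fullname, path=None, target=None):
        # Top-level imports search sys.path; submodule imports search
        # the parent package's __path__, which the machinery passes in.
        search = sys.path if path is None else path
        tail = fullname.rpartition(".")[2]
        portions = []
        for entry in search:
            candidate = os.path.join(entry, tail)
            if self._has_marker(candidate):
                portions.append(candidate)
        if not portions:
            return None  # defer to the remaining finders
        # A spec with no loader but with search locations is treated as
        # a namespace package by the import machinery.
        spec = importlib.machinery.ModuleSpec(fullname, None, is_package=True)
        spec.submodule_search_locations = portions
        return spec
```

Installing it with sys.meta_path.insert(0, NsMarkerFinder()) puts it
ahead of the builtin and frozen importers, which is where the second
overhead comes from: every import pays for the directory scan before
anything else gets a chance to answer.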


From pje at telecommunity.com  Mon Jun 27 22:13:34 2011
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 27 Jun 2011 16:13:34 -0400
Subject: [Import-SIG] Explicit membership in namespace packages
Message-ID: <20110627201327.3F6A73A4100@sparrow.telecommunity.com>

While reviewing the PEP 382 tests, I realized that there's something 
I've been misreading about the spec as written, because I had a 
different assumption about how things would/should work.

As written, the spec allows a namespace package to be declared in one 
place, and then files in any other matching directory are 
automatically included in that namespace.

But, in order to operate in a standalone fashion, any partial 
namespace package MUST contain its own .ns file to indicate that it 
is part of that namespace.  Otherwise, it could only be used on a 
sys.path where such a declaration already existed.

For consistency, therefore, I propose that we explicitly *require* a 
directory to include an .ns file in order to participate in a 
namespace, regardless of whether it also contains an __init__.py.

In other words, if a namespace package directory appears first on 
sys.path, then all and ONLY those sys.path subdirectories marked as 
namespace package directories will contribute modules to that package.

As it happens, I actually already wrote my meta-importer this way, 
because I was assuming that people would always NEED to mark their 
namespace package directories anyway (because they could otherwise 
not stand alone).

Anyway, I think that this, in combination with the importer protocol 
simplification and the flag file approach, makes the resulting 
implementation and application much easier to understand.  Basically 
the rules are:

* Directories containing one or more .ns files are "namespace package 
directories".  In this version of the spec, the files must be empty 
or contain only whitespace characters; future versions may specify 
optional additional information.

* Each namespace package directory should contain only unique 
filenames for that namespace, such that combining every namespace 
package directory with a given name results in no filename 
collisions.  This implies that modules, data files, AND .ns files 
must be given unique names.  (And generally, .ns files should be 
given a name based on their project's distribution name, to identify 
the source of the files.)

* When a namespace package directory is the first match for a desired 
import on sys.path (or within a parent package __path__), that 
namespace package directory's contents are effectively merged with 
those of all subsequent namespace package directories within the 
path, to form the common package contents.  Normal package 
directories or modules with the same name are ignored.

* If a module or normal (non-namespace) package directory with an 
__init__.py (and no .ns file(s)) is encountered first, importing 
proceeds normally.

* PEP 302 importers wishing to support namespace package directories 
should implement a 'namespace_subpath(fullname)' method, that returns 
either a __path__ entry to be used for the named package, or None if 
the package does not have a namespace directory present.

* The import machinery calls namespace_subpath() on an importer prior 
to calling find_module(), and then handles creating a namespace 
package module and loading the first __init__ submodule into it, with 
__path__ pre-initialized.

* For implementation efficiency, an importer is allowed to cache 
information (such as whether a directory exists and whether an 
__init__ module is present in it) between the invocation of a 
namespace_subpath() call and an immediately-subsequent find_module() 
call for the same name.  It should, however, avoid retaining such 
cached information for any longer than the next method call, and 
verify that the request is in fact for the same module/package name.
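
The path-merging rules above can be sketched roughly as follows, for
plain filesystem directories only.  Assumptions: namespace_subpath() is
written here as a free function taking a path entry (in the proposal it
is a method on a PEP 302 importer), compute_namespace_path() and the
other helper names are made up for illustration, and only ".py" files
are recognized as modules.  The caching rule is not modelled.

```python
import os


def is_namespace_dir(directory):
    """True if the directory exists and holds at least one .ns file."""
    try:
        return any(name.endswith(".ns") for name in os.listdir(directory))
    except OSError:
        return False


def namespace_subpath(path_entry, fullname):
    """Return the __path__ entry for fullname under path_entry, or None."""
    candidate = os.path.join(path_entry, fullname.rpartition(".")[2])
    return candidate if is_namespace_dir(candidate) else None


def _is_regular_match(path_entry, name):
    # A plain module or a normal package with an __init__.py.
    candidate = os.path.join(path_entry, name)
    return (os.path.isfile(candidate + ".py")
            or os.path.isfile(os.path.join(candidate, "__init__.py")))


def compute_namespace_path(fullname, search_path):
    """Merged __path__ if the first match is a namespace dir, else None."""
    name = fullname.rpartition(".")[2]
    portions = []
    for entry in search_path:
        subpath = namespace_subpath(entry, fullname)
        if subpath is not None:
            portions.append(subpath)
        elif not portions and _is_regular_match(entry, name):
            # A module or normal package came first: import normally.
            return None
        # Regular matches after the first namespace dir are ignored.
    return portions or None
```

Note that a directory holding both an .ns file and an __init__.py is
treated as a namespace package directory here, since the marker check
runs first; that matches the "regardless of whether it also contains an
__init__.py" wording above.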

Thoughts?