From ncoghlan at gmail.com  Wed Nov  2 01:41:31 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 2 Nov 2011 10:41:31 +1000
Subject: [Import-SIG] PEP 382 update
In-Reply-To: <20111019221650.2f54058d@resist.wooz.org>
References: <4E9F34C8.6040308@v.loewis.de>
	<20111019221650.2f54058d@resist.wooz.org>
Message-ID: <CADiSq7fZrruGZTTZybWnfR4kiW4mFvVaMSs_cmC9VbpbSM4LSQ@mail.gmail.com>

On Thu, Oct 20, 2011 at 12:16 PM, Barry Warsaw <barry at python.org> wrote:
> I vaguely recall that something similar has been discussed on the mailing list
before, but that there were problems with directory name markers.  I could be
> misremembering, and will try to find details in my archives.

My recollection is similar, but I think the latest version may address
those objections by allowing the existing "package/__init__.py"
convention to still indicate a package directory *as well as* the new
"package.pyp" convention.

> Eric did remark last night that while PEP 402 is broader in scope, and *seems*
> useful, we really don't know what it will break.  Still, we need to get this
> feature moving again.

However, what if PEP 402 was *also* rewritten to only look at
directories named "package.pyp" rather than "package" when building
the virtual package paths? It still has a more coherent story for
handling namespace package initialisation than PEP 382, *without*
slowing down existing module and package imports.

The latest PEP 382 update proposes that finding a package/__init__.py
file should *not* stop the sys.path scan (the current text is actually
inconsistent on this point, but that behaviour is what the latest
additions describe). That means all package imports get slower, since
the whole of sys.path is always scanned in order to populate __path__.

I believe the PEP 402 approach is much cleaner: both foo.py and
foo/__init__.py would stop the sys.path scan *immediately* (thus
eliminating any performance impact on existing imports), but
subpackage imports ("import foo.bar" or "from foo import bar") will
attempt to either convert an existing "foo" module into a package, or
else create a new package, by scanning the whole of sys.path for
"foo.pyp" directories. The "foo/__init__.py" approach would create
self-contained packages that are explicitly closed to extension.
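
To make the lookup concrete, here is a minimal sketch of the kind of
scan I have in mind (the helper name is mine, not something from
either PEP's reference implementation):

    import os
    import sys

    def build_virtual_path(name):
        # Only reached for a subpackage import such as "import foo.bar";
        # a plain "import foo" that finds foo.py or foo/__init__.py
        # never triggers this scan.
        path = []
        for entry in sys.path:
            candidate = os.path.join(entry, name + ".pyp")
            if os.path.isdir(candidate):
                path.append(candidate)
        return path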

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From martin at v.loewis.de  Wed Nov  2 05:42:41 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 02 Nov 2011 05:42:41 +0100
Subject: [Import-SIG] PEP 382 update
In-Reply-To: <CADiSq7fZrruGZTTZybWnfR4kiW4mFvVaMSs_cmC9VbpbSM4LSQ@mail.gmail.com>
References: <4E9F34C8.6040308@v.loewis.de>	<20111019221650.2f54058d@resist.wooz.org>
	<CADiSq7fZrruGZTTZybWnfR4kiW4mFvVaMSs_cmC9VbpbSM4LSQ@mail.gmail.com>
Message-ID: <4EB0CA41.5060009@v.loewis.de>

> I believe the PEP 402 approach is much cleaner: both foo.py and
> foo/__init__.py would stop the sys.path scan *immediately* (thus
> eliminating any performance impact on existing imports), but
> subpackage imports ("import foo.bar" or "from foo import bar") will
> attempt to either convert an existing "foo" module into a package, or
> else create a new package, by scanning the whole of sys.path for
> "foo.pyp" directories. The "foo/__init__.py" approach would create
> self-contained packages that are explicitly closed to extension.

I think that's under-specified in PEP 402. It doesn't classify
"from foo import bar" as a subpackage import, since it states that
this magic only applies to imports involving dotted names. So exactly
which imports trigger this path scanning remains to be specified.

Performance-wise, I would expect that PEP 382 is more efficient
if the package has code in it, and not worse for "pure"
namespace packages. If there is code in the package, with PEP 402,
you would have to provide a P.py file, and multiple P.pyp directories.
On importing P, it searches the path finding P.py. On importing
a sub-package, it searches the path *again* to establish the package's
__path__. With PEP 382, there is only a single run over the path.
PEP 402 might be more efficient for P/__init__.py packages.

I'm skeptical that it matters much: the majority of stat calls
come from the many forms of module files, which neither PEP 382
nor PEP 402 would stat after the first hit. PEP 382 might be slightly
more efficient here, since a .pyp directory early on the path would
already cancel stat calls for module files (.py, .pyc, .pyd, module.pyd,
...). For a namespace package, PEP 402 would scan the
entire path for all kinds of modules, even if it eventually turns out
that it is only going to use the .pyp directories it has already seen.

Regards,
Martin

From martin at v.loewis.de  Wed Nov  2 10:24:24 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 02 Nov 2011 10:24:24 +0100
Subject: [Import-SIG] PEP 382 specification and implementation complete
Message-ID: <4EB10C48.6010202@v.loewis.de>

I have now written an implementation of PEP 382, and fixed some details
of the PEP in the process. The implementation is available at

http://hg.python.org/features/pep-382-2/

In the new form, the PEP was much easier to implement than in the first
version (plus I understand import.c better now).

This implementation now features .pyp directories, zipimporter support,
documentation and test cases.

As the next step, I'd like to ask for pronouncement on the PEP.

Regards,
Martin

From pje at telecommunity.com  Wed Nov  2 18:59:10 2011
From: pje at telecommunity.com (PJ Eby)
Date: Wed, 2 Nov 2011 13:59:10 -0400
Subject: [Import-SIG] PEP 382 update
In-Reply-To: <4E9F34C8.6040308@v.loewis.de>
References: <4E9F34C8.6040308@v.loewis.de>
Message-ID: <CALeMXf7BhEB+SGRV3q_9T=T9N48cV1UWTp466z5GPaMj45pxKA@mail.gmail.com>

On Wed, Oct 19, 2011 at 4:36 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:

> In comparison with PEP 402, after my PyCon DE presentation, people
> discussed that they prefer if Python packages require some kind of
> explicit declaration - even though Java seems to have done well with
> packages being just directories with the package name. In particular,
> a Jython guy observed that they would likely have issues with an
> approach where a directory P would already be part of a package P,
> since they often have directories in Jython that have the name of
> Python packages, but are not meant as such.
>

Unless those directories contain things which are importable, and someone
actually imports them, PEP 402 does not treat them as a package.  So, I
suspect some confusion may have occurred, especially if this was a first
exposure to the idea, rather than people actually reading the PEP.

From martin at v.loewis.de  Wed Nov  2 19:54:37 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 02 Nov 2011 19:54:37 +0100
Subject: [Import-SIG] PEP 382 update
In-Reply-To: <CALeMXf7BhEB+SGRV3q_9T=T9N48cV1UWTp466z5GPaMj45pxKA@mail.gmail.com>
References: <4E9F34C8.6040308@v.loewis.de>
	<CALeMXf7BhEB+SGRV3q_9T=T9N48cV1UWTp466z5GPaMj45pxKA@mail.gmail.com>
Message-ID: <4EB191ED.1090103@v.loewis.de>

On 02.11.2011 18:59, PJ Eby wrote:
> On Wed, Oct 19, 2011 at 4:36 PM, "Martin v. Löwis" <martin at v.loewis.de
> <mailto:martin at v.loewis.de>> wrote:
> 
>     In comparison with PEP 402, after my PyCon DE presentation, people
>     discussed that they prefer if Python packages require some kind of
>     explicit declaration - even though Java seems to have done well with
>     packages being just directories with the package name. In particular,
>     a Jython guy observed that they would likely have issues with an
>     approach where a directory P would already be part of a package P,
>     since they often have directories in Jython that have the name of
>     Python packages, but are not meant as such.
> 
> 
> Unless those directories contain things which are importable, and
> someone actually imports them, PEP 402 does not treat them as a package.
>  So, I suspect some confusion may have occurred, especially if this was
> a first exposure to the idea, rather than people actually reading the PEP.

For those people, it doesn't really matter whether the directory would
actually be considered as belonging to the package or not. It's more
that their sense of properness is violated by not having to declare a
package directory explicitly.

In the specific case of Jython, it may be that Jython is willing to
treat Java class files as Python modules ("extension" modules); if
PEP 402 is accepted and Jython implements it, they might indeed have
an issue with directories unexpectedly containing things which are
importable. I'm not sure whether that actually is the issue, since I
didn't talk to the Jython guy further.

Regards,
Martin


From martin at v.loewis.de  Wed Nov  9 10:35:42 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 09 Nov 2011 10:35:42 +0100
Subject: [Import-SIG] PEP 402: specification questions
Message-ID: <4EBA496E.2090702@v.loewis.de>

I'm trying to understand PEP 402, and am having difficulty figuring
out exactly what it says. I presume that the section "Specification"
is intended to give the complete syntax and semantics of the
proposed change to Python.

A. In "Virtual Paths", it talks about obtaining importer objects
   for each path item, and then calling get_subpath on it. In the
   current implementation, not all sys.path entries correspond to
   an importer object: so what's the impact (if any) on old-style
   sys.path entries (i.e. regular directories)?

   Or, if some "builtin" importer is implied: what is its semantics
   of get_subpath for the builtin importer?

B. "Specification" starts with "importing names containing at least
   one .". That seems clear enough, however, I wonder whether

   from zope import interface

   is supported by the PEP (i.e. where zope.interface is a nested
   package, yet the names in the import don't contain a dot at all).

   I presume that the case is meant to be supported, but then I
   wonder how precisely the mechanism described in the PEP is
   triggered.

Regards,
Martin

From pje at telecommunity.com  Wed Nov  9 18:24:26 2011
From: pje at telecommunity.com (PJ Eby)
Date: Wed, 9 Nov 2011 12:24:26 -0500
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <4EBA496E.2090702@v.loewis.de>
References: <4EBA496E.2090702@v.loewis.de>
Message-ID: <CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>

On Wed, Nov 9, 2011 at 4:35 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:

> I'm trying to understand PEP 402, and have difficulties figuring
> out what exactly it says. I presume that the section "Specification"
> is intended to give the complete syntax and semantics of the
> proposed change to Python.
>
> A. In "Virtual Paths", it talks about obtaining importer objects
>   for each path item, and then calling get_subpath on it. In the
>   current implementation, not all sys.path entries correspond to
>   an importer object: so what's the impact (if any) on old-style
>   sys.path entries (i.e. regular directories)?
>

Sorry - that's meant to be the importer returned by pkgutil.get_importer();
it should probably be made clearer.  (IIUC, under the importlib version,
there is *always* an importer object, whether you obtain it via pkgutil or
some other means.)


  Or, if some "builtin" importer is implied: what is its semantics
>   of get_subpath for the builtin importer?
>

Those described in the rest of the PEP: i.e., if the subpath exists, return
it.  For a directory, that's os.path.isdir(os.path.join(base_path,
name_suffix)).
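
In other words, something along these lines (a sketch only; the real
importlib-based finder will look different, and I'm assuming here that
get_subpath receives the full dotted module name):

    import os

    class DirectoryFinderSketch:
        def __init__(self, base_path):
            self.base_path = base_path

        def get_subpath(self, fullname):
            # Use only the last component of the dotted name, and return
            # the matching subdirectory if it exists, else None.
            name_suffix = fullname.rpartition('.')[2]
            candidate = os.path.join(self.base_path, name_suffix)
            return candidate if os.path.isdir(candidate) else None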



> B. "Specification" starts with "importing names containing at least
>   one .". That seems clear enough, however, I wonder whether
>
>   from zope import interface
>
>   is supported by the PEP (i.e. where zope.interface is a nested
>   package, yet the names in the import don't contain a dot at all).
>
>   I presume that the case is meant to be supported, but then I
>   wonder how precisely the mechanism described in the PEP is
>   triggered.
>

IIRC, "from zope import interface" does an import zope.interface
internally.  It does occur to me, however, that if it first imports zope
and tries to get an attribute of it, then that import would fail.  That
case should probably be addressed explicitly in the PEP.

I was under the impression, though, that you were wanting to do a revised
PEP 382 instead?

From eric at trueblade.com  Wed Nov  9 19:12:44 2011
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 09 Nov 2011 13:12:44 -0500
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>
References: <4EBA496E.2090702@v.loewis.de>
	<CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>
Message-ID: <4EBAC29C.4000502@trueblade.com>

On 11/09/2011 12:24 PM, PJ Eby wrote:
> I was under the impression, though, that you were wanting to do a
> revised PEP 382 instead?

I think the point is to understand PEP 402 well enough so that we can
choose between them.

Eric.


From martin at v.loewis.de  Wed Nov  9 21:49:27 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 09 Nov 2011 21:49:27 +0100
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>
References: <4EBA496E.2090702@v.loewis.de>
	<CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>
Message-ID: <4EBAE757.8030907@v.loewis.de>

> I was under the impression, though, that you were wanting to do a
> revised PEP 382 instead?

Responding to this first (I need to study your technical answers in
detail later):

I'm not quite sure how to proceed from here. I could well imagine
merging the two PEPs somehow (and would prefer that they merge into
PEP 382, as I'm more familiar with it). If that sounds reasonable
to you, feel free to propose any changes that you think should be
made to PEP 382.

ISTM that the two PEPs give opposing answers to some questions,
which ultimately requires somebody to make a decision. I'm not sure
which of these differences you consider fundamental, and which
arbitrary. To give some examples:
- what constitutes a package on disk?
- what's the impact of this new feature on existing P/__init__.py
  packages?
- what's the impact on existing modules P.py?
- when exactly is the path scan performed?

It might also be that you worked on PEP 402 only because PEP 382
appeared stalled (which it was for some time). If you are happy
with PEP 382 in its current form, you might want to withdraw PEP 402.

Regards,
Martin



From pje at telecommunity.com  Wed Nov  9 23:13:32 2011
From: pje at telecommunity.com (PJ Eby)
Date: Wed, 9 Nov 2011 17:13:32 -0500
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <4EBAE757.8030907@v.loewis.de>
References: <4EBA496E.2090702@v.loewis.de>
	<CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>
	<4EBAE757.8030907@v.loewis.de>
Message-ID: <CALeMXf7NMUG+hxYPUi-b2czz85GndL-XURMU8vMPf8H-JwxzPA@mail.gmail.com>

On Wed, Nov 9, 2011 at 3:49 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:

> ISTM that the two PEPs give opposing answers to some questions,
> which ultimately requires somebody to make a decision. I'm not sure
> which of these differences you consider fundamental, and which
> arbitrary. To give some examples:
> - what constitutes a package on disk?
> - what's the impact of this new feature on existing P/__init__.py
>  packages?
> - what's the impact on existing modules P.py?
> - when exactly is the path scan performed?
>
> It might also be that you worked on PEP 402 only because PEP 382
> appeared stalled (which it was for some time).


If you review the Import-SIG traffic from that time period, you'll notice
that I first attempted to revise PEP 382 to address various issues --
mostly having to do with clarity and ease of backported implementations for
2.x.  As the work went on, it eventually became clear that the reason the
terminology was complicated and the spec difficult to clarify (not just for
me but for other import-sig participants) was that Python's fundamental
notion of packages was flawed, and that what Guido previously tried to do
with getting rid of the need for __init__.py (see the references in PEP
402) was a more Pythonic approach (as well as being more familiar to users
of other languages).

So, the goal for 402 was to make __init__.py ("self-contained" packages)
the special case, rather than namespace packages, and to achieve a more
natural fit and greater ease of use overall.

The use of .pyp extensions doesn't really fit well with that approach,
though.  It means, for example, that you have to use ugly paths (e.g.
zope.pyp/interface.pyp/foo.py), and you have a less orthogonal path for
switching between package types.

That is, under 402, you can make a module a package just by adding a
directory.  And you can make a self-contained package into an open package
(or vice versa) by adding or deleting packagename/__init__.py or moving it
to packagename.py.

In other words, the intention of PEP 402 is to have a uniform and simple
way to evolve packages that as a side-effect allows both traditional and
"namespace" packages to work.  It implements namespace packages by
*removing* something (i.e., getting rid of __init__.py) rather than by
adding something new (e.g. .pyp extensions).  For that reason, I think it's
better for the future of the language.



> If you are happy
> with PEP 382 in its current form, you might want to withdraw PEP 402.
>

Not really.  I think that PEP 402 is approximately how Python packages
should have worked all along, and that this is a good opportunity to
rectify the current situation.  While some projects may run into issues
with files becoming importable that previously weren't, any code that was
trying to import those modules is already broken.

From barry at python.org  Wed Nov  9 23:57:55 2011
From: barry at python.org (Barry Warsaw)
Date: Wed, 9 Nov 2011 17:57:55 -0500
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <CALeMXf7NMUG+hxYPUi-b2czz85GndL-XURMU8vMPf8H-JwxzPA@mail.gmail.com>
References: <4EBA496E.2090702@v.loewis.de>
	<CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>
	<4EBAE757.8030907@v.loewis.de>
	<CALeMXf7NMUG+hxYPUi-b2czz85GndL-XURMU8vMPf8H-JwxzPA@mail.gmail.com>
Message-ID: <20111109175755.639f811e@resist.wooz.org>

On Nov 09, 2011, at 05:13 PM, PJ Eby wrote:

>In other words, the intention of PEP 402 is to have a uniform and simple
>way to evolve packages that as a side-effect allows both traditional and
>"namespace" packages to work.  It implements namespace packages by
>*removing* something (i.e., getting rid of __init__.py) rather than by
>adding something new (e.g. .pyp extensions).  For that reason, I think it's
>better for the future of the language.

That's one thing that appeals to me as a distro packager about PEP 402.  Under
PEP 402, it seems like it would be less work to modify a set of upstream
packages to eliminate the collisions on __init__.py.

-Barry

From ncoghlan at gmail.com  Thu Nov 10 00:36:26 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 10 Nov 2011 09:36:26 +1000
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <20111109175755.639f811e@resist.wooz.org>
References: <4EBA496E.2090702@v.loewis.de>
	<CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>
	<4EBAE757.8030907@v.loewis.de>
	<CALeMXf7NMUG+hxYPUi-b2czz85GndL-XURMU8vMPf8H-JwxzPA@mail.gmail.com>
	<20111109175755.639f811e@resist.wooz.org>
Message-ID: <CADiSq7fy8=EBeGt7ccank-pecKY8=TFBchM3R_QH0SXskvaHGg@mail.gmail.com>

On Thu, Nov 10, 2011 at 8:57 AM, Barry Warsaw <barry at python.org> wrote:
> On Nov 09, 2011, at 05:13 PM, PJ Eby wrote:
>
>>In other words, the intention of PEP 402 is to have a uniform and simple
>>way to evolve packages that as a side-effect allows both traditional and
>>"namespace" packages to work. ?It implements namespace packages by
>>*removing* something (i.e., getting rid of __init__.py) rather than by
>>adding something new (e.g. .pyp extensions). ?For that reason, I think it's
>>better for the future of the language.
>
> That's one thing that appeals to me as a distro packager about PEP 402. ?Under
> PEP 402, it seems like it would be less work to modify a set of upstream
> packages to eliminate the collisions on __init__.py.

Indeed, I don't see PEP 382 reducing the number of "Why doesn't my
'foo' package work?" questions from beginners either, since it just
replaces "add an __init__.py" with "change your directory name to
'foo.pyp'". PEP 402, by contrast, should *just work* in the most
natural way possible. Similarly, "fixing" packaging conflicts just
becomes a matter of making sure that *none* of the distro packages
involved install an __init__.py file. By contrast, PEP 382 requires
that *all* of the distro packages be updated to install to "foo.pyp"
directories instead of "foo" directories.

On the other hand, the Zen does say "Explicit is better than implicit"
and if we don't allow arbitrary files without an extension as modules,
why should we allow arbitrary directories as packages*? From that
point of view, PEP 382 is actually just bringing packages into the
same extension-based regime that we already use for distinguishing
other module types.

*This is a deliberate mischaracterisation of PEP 402, but it seems to
be a common misperception that is distorting people's reactions to the
proposal - 'marker files' actually still exist in that PEP, it's just
that their definition is "any valid Python module file or a relevant
subdirectory containing such files". If this causes problems for
Jython, then they should be able to fix it the same way CPython fixed
the DLL naming conflict problem on Windows: by *not* accepting
standard Java extensions like ".jar" and ".java" as Jython modules,
and instead requiring a Jython specific extension (e.g. ".pyj",
similar to the ".pyd" CPython uses for Windows DLLs).

While there's no reference implementation for PEP 402 that updates the
standard import machinery as yet, it's worth taking a look at Greg
Slodkowic's importlib-based implementation that came out of GSoC this
year: https://bitbucket.org/jergosh/pep-402

So yeah, I still think PEP 402 is the right answer and am -1 on PEP
382 as a result - while I think PEP 382 *is* an improvement over the
status quo, I also think it represents an unnecessary detour relative
to where I'd like to see the import system going.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From martin at v.loewis.de  Thu Nov 10 05:54:57 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 10 Nov 2011 05:54:57 +0100
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <20111109175755.639f811e@resist.wooz.org>
References: <4EBA496E.2090702@v.loewis.de>	<CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>	<4EBAE757.8030907@v.loewis.de>	<CALeMXf7NMUG+hxYPUi-b2czz85GndL-XURMU8vMPf8H-JwxzPA@mail.gmail.com>
	<20111109175755.639f811e@resist.wooz.org>
Message-ID: <4EBB5921.3020208@v.loewis.de>

On 09.11.2011 23:57, Barry Warsaw wrote:
> On Nov 09, 2011, at 05:13 PM, PJ Eby wrote:
> 
>> In other words, the intention of PEP 402 is to have a uniform and simple
>> way to evolve packages that as a side-effect allows both traditional and
>> "namespace" packages to work.  It implements namespace packages by
>> *removing* something (i.e., getting rid of __init__.py) rather than by
>> adding something new (e.g. .pyp extensions).  For that reason, I think it's
>> better for the future of the language.
> 
> That's one thing that appeals to me as a distro packager about PEP 402.  Under
> PEP 402, it seems like it would be less work to modify a set of upstream
> packages to eliminate the collisions on __init__.py.

I think this impression is incorrect. Assuming we are talking about
existing packages here that use the existing setuptools namespace
mechanism and which have been ported to Python 3 already, then no
change to the package should be necessary at all to support PEP 382.

Instead, setuptools/distribute should implement the namespace_packages
parameter of setup.py in such a way that it
a) drops the __init__.py from the sources, since it should contain
   nothing more than something like
   __import__('pkg_resources').declare_namespace(__name__)
b) copies the files into a P.pyp folder rather than a P folder
   on build_py.
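
A rough sketch of step b) for a single top-level namespace package P
(a hypothetical helper, not anything setuptools/distribute does today):

    import os
    import shutil

    def install_as_pyp_portion(build_lib, package):
        src = os.path.join(build_lib, package)
        init = os.path.join(src, '__init__.py')
        if os.path.exists(init):
            os.remove(init)             # a) drop the declare_namespace() stub
        shutil.move(src, src + '.pyp')  # b) ship P.pyp instead of P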

Such a change would be necessary/possible with either PEP 382 or PEP
402, so it seems to make no difference.

Regards,
Martin


From martin at v.loewis.de  Thu Nov 10 06:32:59 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 10 Nov 2011 06:32:59 +0100
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <CADiSq7fy8=EBeGt7ccank-pecKY8=TFBchM3R_QH0SXskvaHGg@mail.gmail.com>
References: <4EBA496E.2090702@v.loewis.de>	<CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>	<4EBAE757.8030907@v.loewis.de>	<CALeMXf7NMUG+hxYPUi-b2czz85GndL-XURMU8vMPf8H-JwxzPA@mail.gmail.com>	<20111109175755.639f811e@resist.wooz.org>
	<CADiSq7fy8=EBeGt7ccank-pecKY8=TFBchM3R_QH0SXskvaHGg@mail.gmail.com>
Message-ID: <4EBB620B.4030301@v.loewis.de>

> Indeed, I don't see PEP 382 reducing the number of "Why doesn't my
> 'foo' package work?" questions from beginners either

Do beginners really have that question (i.e. can you kindly point
me to archived examples of that question)? I'd expect beginners to
do whatever tutorials and examples tell them to do. If these refer
to P.pyp folders, beginners will just take that as given, and copy
it.

> since it just replaces "add an __init__.py" with "change your
> directory name to 'foo.pyp'".

Why should users have to *change* the directory name? When they
create a package, they would create the .pyp directory to begin
with, so no need to change it.

> PEP 402, by contrast, should *just work* in the most
> natural way possible. Similarly, "fixing" packaging conflicts just
> becomes a matter of making sure that *none* of the distro packages
> involved install an __init__.py file. By contrast, PEP 382 requires
> that *all* of the distro packages be updated to install to "foo.pyp"
> directories instead of "foo" directories.

See my message to Barry: setuptools/distribute could do that, with
no change to the source tree.

When people are ready to give up pre-3.3 support, they would likely
have to modify *all* distro packages either way: with PEP 402, they
would need to drop the __init__.py from the sources, and with PEP 382,
they would additionally need to rename the directory.

However, with PEP 382, they don't *have* to do that: some portions
of a namespace package may keep the __init__.py, others may drop it,
and it would still form a single namespace. OTOH, with PEP 402,
*all* portions of the namespace would have to agree simultaneously
to use the PEP 402 mechanism, since that mechanism will be ineffective
if there is an __init__.py.

> While there's no reference implementation for PEP 402 that updates the
> standard import machinery as yet, it's worth taking a look at Greg
> Slodkowic's importlib-based implementation that came out of GSoC this
> year: https://bitbucket.org/jergosh/pep-402

Not sure whether that's the right way to use it, but I tried
setting builtins.__import__ = importlib.__import__. Then,
with "foo/bar.py" on disk, "import foo.bar" works fine. Interestingly,
"import foo" fails afterwards, even though "foo" is in sys.path.
Consequently, "from foo import bar" also fails, just as Phillip
predicted. I presume that will have to be fixed in the PEP and
the implementation.
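
For reference, what I did was roughly the following, using the importlib
from that repository in place of the stock one, and with a plain
foo/bar.py sitting under a sys.path directory:

    import builtins, importlib
    builtins.__import__ = importlib.__import__

    import foo.bar         # works: foo is created as a virtual package
    import foo             # fails afterwards
    from foo import bar    # also fails, as Phillip predicted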

Regards,
Martin


From ncoghlan at gmail.com  Thu Nov 10 07:31:08 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 10 Nov 2011 16:31:08 +1000
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <4EBB620B.4030301@v.loewis.de>
References: <4EBA496E.2090702@v.loewis.de>
	<CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>
	<4EBAE757.8030907@v.loewis.de>
	<CALeMXf7NMUG+hxYPUi-b2czz85GndL-XURMU8vMPf8H-JwxzPA@mail.gmail.com>
	<20111109175755.639f811e@resist.wooz.org>
	<CADiSq7fy8=EBeGt7ccank-pecKY8=TFBchM3R_QH0SXskvaHGg@mail.gmail.com>
	<4EBB620B.4030301@v.loewis.de>
Message-ID: <CADiSq7cELur2DssDVs8AQnArHiyVfgxmSL0pnJb513+qV2wpeQ@mail.gmail.com>

On Thu, Nov 10, 2011 at 3:32 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> Indeed, I don't see PEP 382 reducing the number of "Why doesn't my
>> 'foo' package work?" questions from beginners either
>
> Do beginners really have that question (i.e. can you kindly point
> me to archived examples of that question)? I'd expect beginners to
> do whatever tutorials and examples tell them to do. If these refer
> to P.pyp folders, beginners will just take that as given, and copy
> it.

Yep, beginners do ask that question, particularly if they have
experience with other languages that don't require explicit markers
for package directories (hence the tone of PEP 402). I mostly saw this
when I was following the Stack Overflow python RSS feed - here's one
such example: http://stackoverflow.com/questions/456481/cant-get-python-to-import-from-a-different-folder

>> since it just replaces "add an __init__.py" with "change your
>> directory name to 'foo.pyp'".
>
> Why should users have to *change* the directory name? When they
> create a package, they would create the .pyp directory to begin
> with, so no need to change it.

Because they start by doing the wrong thing, and then go to places
like SO to ask why it doesn't work and are told how to fix it. With
PEP 402, they wouldn't have to be told how to fix it because it would
just work the way they expected.

>> PEP 402, by contrast, should *just work* in the most
>> natural way possible. Similarly, "fixing" packaging conflicts just
>> becomes a matter of making sure that *none* of the distro packages
>> involved install an __init__.py file. By contrast, PEP 382 requires
>> that *all* of the distro packages be updated to install to "foo.pyp"
>> directories instead of "foo" directories.
>
> See my message to Barry: setuptools/distribute could do that, with
> no change to the source tree.

It still rings alarm bells for me - there's a non-trivial transform
going on between what's in the source tree and what's expected on
deployment with that approach, and that's going to break a lot of
things. (e.g. symlinking source checkouts into place in order to
pretend that plugins are installed)

> However, with PEP 382, they don't *have* to do that: some portions
> of a namespace package may keep the __init__.py, others may drop it,
> and it would still form a single namespace. OTOH, with PEP 402,
> *all* portions of the namespace would have simultaneously to agree
> to use the PEP 402 mechanism, since that mechanism will be ineffective
> if there is an __init__.py.

That's a pretty good argument in favour of dropping the
"self-contained package" concept from PEP 402, but aside from that
aspect, it doesn't help decide between the two.

>> While there's no reference implementation for PEP 402 that updates the
>> standard import machinery as yet, it's worth taking a look at Greg
>> Slodkowic's importlib-based implementation that came out of GSoC this
>> year: https://bitbucket.org/jergosh/pep-402
>
> Not sure whether that's the right way to use it, but I tried
> setting builtins.__import__ = importlib.__import__. Then,
> with "foo/bar.py" on disk, "import foo.bar" works fine. Interestingly,
> "import foo" fails afterwards, even though "foo" is in sys.path.
> Consequently, "from foo import bar" also fails, just as Phillip
> predicted. I presume that will have to be fixed in the PEP and
> the implementation.

Yeah, I haven't actually had a chance to try it out yet. It sounds
like it's just an implementation bug in the "import foo" part, though,
since PJE is correct in his recollection that the "from x import y"
algorithm is along the lines of:

    import x
    if hasattr(x, 'y'):
        return x.y
    else:
        import x.y
        return x.y

(It doesn't *quite* work that way, but that's the gist of it:
http://hg.python.org/cpython/file/default/Python/import.c#l3171)

Cheers,
Nick.


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Thu Nov 10 07:53:21 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 10 Nov 2011 16:53:21 +1000
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <CADiSq7cELur2DssDVs8AQnArHiyVfgxmSL0pnJb513+qV2wpeQ@mail.gmail.com>
References: <4EBA496E.2090702@v.loewis.de>
	<CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>
	<4EBAE757.8030907@v.loewis.de>
	<CALeMXf7NMUG+hxYPUi-b2czz85GndL-XURMU8vMPf8H-JwxzPA@mail.gmail.com>
	<20111109175755.639f811e@resist.wooz.org>
	<CADiSq7fy8=EBeGt7ccank-pecKY8=TFBchM3R_QH0SXskvaHGg@mail.gmail.com>
	<4EBB620B.4030301@v.loewis.de>
	<CADiSq7cELur2DssDVs8AQnArHiyVfgxmSL0pnJb513+qV2wpeQ@mail.gmail.com>
Message-ID: <CADiSq7d5qwM5QhbxrHtpiDQvkevOKKzKDpYtMVreWcwKq8pjSg@mail.gmail.com>

On Thu, Nov 10, 2011 at 4:31 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Thu, Nov 10, 2011 at 3:32 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> However, with PEP 382, they don't *have* to do that: some portions
>> of a namespace package may keep the __init__.py, others may drop it,
>> and it would still form a single namespace. OTOH, with PEP 402,
>> *all* portions of the namespace would have simultaneously to agree
>> to use the PEP 402 mechanism, since that mechanism will be ineffective
>> if there is an __init__.py.
>
> That's a pretty good argument in favour of dropping the
> "self-contained package" concept from PEP 402, but aside from that
> aspect, it doesn't help decide between the two.

Actually, scratch that part of my response. *Existing* namespace
packages that work properly already have a single owner - the one that
creates the __init__.py file and sets up the namespace extension
mechanisms.  They're forced to work that way due to the file collision
problem.

With PEP 402, those owning packages are the only ones that would have
to change. With PEP 382, all the *other* distro packages have to
change as well (either directly, or via the packaging utilities
modifying path names on installation - in which case, good luck
running any affected code from an uninstalled source tree).

It seems I now remember at least some of the reasons why we didn't
like the "directory extension" idea the first time it was suggested :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From martin at v.loewis.de  Thu Nov 10 19:03:13 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 10 Nov 2011 19:03:13 +0100
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <CADiSq7d5qwM5QhbxrHtpiDQvkevOKKzKDpYtMVreWcwKq8pjSg@mail.gmail.com>
References: <4EBA496E.2090702@v.loewis.de>	<CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>	<4EBAE757.8030907@v.loewis.de>	<CALeMXf7NMUG+hxYPUi-b2czz85GndL-XURMU8vMPf8H-JwxzPA@mail.gmail.com>	<20111109175755.639f811e@resist.wooz.org>	<CADiSq7fy8=EBeGt7ccank-pecKY8=TFBchM3R_QH0SXskvaHGg@mail.gmail.com>	<4EBB620B.4030301@v.loewis.de>	<CADiSq7cELur2DssDVs8AQnArHiyVfgxmSL0pnJb513+qV2wpeQ@mail.gmail.com>
	<CADiSq7d5qwM5QhbxrHtpiDQvkevOKKzKDpYtMVreWcwKq8pjSg@mail.gmail.com>
Message-ID: <4EBC11E1.80409@v.loewis.de>

> Actually, scratch that part of my response. *Existing* namespace
> packages that work properly already have a single owner

How so? The zope package certainly doesn't have a single owner. Instead,
it's spread over a large number of subpackages.

> With PEP 402, those owning packages are the only ones that would have
> to change.

No. In setuptools namespace packages, each portion of the namespace
(i.e. each distribution) will have its own __init__.py; which of them
actually gets used is arbitrary but also irrelevant since they all look
gets actually used is arbitrary but also irrelevant since they all look
the same.

So "only those" is actually "all of them".

> With PEP 382, all the *other* distro packages have to
> change as well

What's a "distro package"? Which are the other ones?

They don't need to change at all. The existing setuptools namespace
mechanism will continue to work, and you can add PEP 382 package
portions freely.

> It seems I now remember at least some of the reasons why we didn't
> like the "directory extension" idea the first time it was suggested :)

Please elaborate - I missed your point.

Regards,
Martin

From martin at v.loewis.de  Thu Nov 10 19:14:51 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 10 Nov 2011 19:14:51 +0100
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <CADiSq7cELur2DssDVs8AQnArHiyVfgxmSL0pnJb513+qV2wpeQ@mail.gmail.com>
References: <4EBA496E.2090702@v.loewis.de>	<CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>	<4EBAE757.8030907@v.loewis.de>	<CALeMXf7NMUG+hxYPUi-b2czz85GndL-XURMU8vMPf8H-JwxzPA@mail.gmail.com>	<20111109175755.639f811e@resist.wooz.org>	<CADiSq7fy8=EBeGt7ccank-pecKY8=TFBchM3R_QH0SXskvaHGg@mail.gmail.com>	<4EBB620B.4030301@v.loewis.de>
	<CADiSq7cELur2DssDVs8AQnArHiyVfgxmSL0pnJb513+qV2wpeQ@mail.gmail.com>
Message-ID: <4EBC149B.8090405@v.loewis.de>

> Yeah, I haven't actually had a chance to try it out yet. It sounds
> like it's just an implementation bug in the "import foo" part, though,
> since PJE is correct in his recollection that the "from x import y"
> algorithm is along the lines of:
> 
>     import x

That already fails with PEP 402: you can't import the virtual package,
only the subpackages. It then just stops when that import fails, and
doesn't even try to look for subpackages. So specification and
implementation really match here - it's not just an implementation
bug.

Regards,
Martin

From pje at telecommunity.com  Thu Nov 10 19:25:39 2011
From: pje at telecommunity.com (PJ Eby)
Date: Thu, 10 Nov 2011 13:25:39 -0500
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <4EBC11E1.80409@v.loewis.de>
References: <4EBA496E.2090702@v.loewis.de>
	<CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>
	<4EBAE757.8030907@v.loewis.de>
	<CALeMXf7NMUG+hxYPUi-b2czz85GndL-XURMU8vMPf8H-JwxzPA@mail.gmail.com>
	<20111109175755.639f811e@resist.wooz.org>
	<CADiSq7fy8=EBeGt7ccank-pecKY8=TFBchM3R_QH0SXskvaHGg@mail.gmail.com>
	<4EBB620B.4030301@v.loewis.de>
	<CADiSq7cELur2DssDVs8AQnArHiyVfgxmSL0pnJb513+qV2wpeQ@mail.gmail.com>
	<CADiSq7d5qwM5QhbxrHtpiDQvkevOKKzKDpYtMVreWcwKq8pjSg@mail.gmail.com>
	<4EBC11E1.80409@v.loewis.de>
Message-ID: <CALeMXf6RQESy=udt92ze8=f5rwyB4SwdWvjmXTuT8ZvF2JXSQQ@mail.gmail.com>

On Thu, Nov 10, 2011 at 1:03 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:

> > Actually, scratch that part of my response. *Existing* namespace
> > packages that work properly already have a single owner
>
> How so? The zope package certainly doesn't have a single owner. Instead,
> it's spread over a large number of subpackages.
>

In distro packages (i.e. "system packages") there may be a
namespace-defining package that provides an __init__.py.  For example, I
believe Debian (system) packages peak.util this way, even though there are
many separately distributed peak.util.* (python) packages.


> > With PEP 402, those owning packages are the only ones that would have
> > to change.
>
> No. In setuptools namespace packages, each portion of the namespace
> (i.e. each distribution) will have it's own __init__.py; which of them
> gets actually used is arbitrary but also irrelevant since they all look
> the same.
>
> So "only those" is actually "all of them".
>

Nick is speaking again about system packages released by OS distributors.
A system package of a namespace package naively built with setuptools will
not contain an __init__.py, but only a .nspkg.pth file that makes the
__init__.py unnecessary.

(In this sense, the existing setuptools namespace package implementation
 for system-installed packages is actually a primitive partial
implementation of PEP 402.)

In summary: some system packages are built with an owning package, some
aren't.  Those with an owning package will need to drop the __init__.py
(from that one package), while the others won't, because they don't have an
__init__.py.  In either case, PEP 402 leaves the directory layout alone.  A
version of setuptools intended for PEP 402 support would drop the nspkg.pth
inclusion, and a version of "packaging" intended for PEP 402 would simply
not add one.

From pje at telecommunity.com  Thu Nov 10 19:28:22 2011
From: pje at telecommunity.com (PJ Eby)
Date: Thu, 10 Nov 2011 13:28:22 -0500
Subject: [Import-SIG] PEP 402: specification questions
In-Reply-To: <4EBC149B.8090405@v.loewis.de>
References: <4EBA496E.2090702@v.loewis.de>
	<CALeMXf40v3oK6qqyzPWyd8e3U_9-KxU3tr-hgZDNn5=xxmwmDw@mail.gmail.com>
	<4EBAE757.8030907@v.loewis.de>
	<CALeMXf7NMUG+hxYPUi-b2czz85GndL-XURMU8vMPf8H-JwxzPA@mail.gmail.com>
	<20111109175755.639f811e@resist.wooz.org>
	<CADiSq7fy8=EBeGt7ccank-pecKY8=TFBchM3R_QH0SXskvaHGg@mail.gmail.com>
	<4EBB620B.4030301@v.loewis.de>
	<CADiSq7cELur2DssDVs8AQnArHiyVfgxmSL0pnJb513+qV2wpeQ@mail.gmail.com>
	<4EBC149B.8090405@v.loewis.de>
Message-ID: <CALeMXf7marmeJepJCT9vuTpewuBjnHkhopxb2ChGbf0a6nBHQg@mail.gmail.com>

On Thu, Nov 10, 2011 at 1:14 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:

> That already fails with PEP 402: you can't import the virtual package,
> only the subpackages. It then just stops when that import fails, and
> doesn't even try to look for subpackages. So specification and
> implementation really match here - it's not just an implementation
> bug.
>

Right - you found a bug in the spec, in that it should either explicitly
amend the from-import algorithm to treat the first import the same as an
AttributeError, or else say, "use 'import zope.interface as interface'
instead".

From ncoghlan at gmail.com  Sun Nov 13 06:12:31 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 13 Nov 2011 15:12:31 +1000
Subject: [Import-SIG] Import engine PEP up on python.org
Message-ID: <CADiSq7cGkVds8Z_ikw6UKFiq6kt_Z=Q3qiv1ShY2NpoQLv33yw@mail.gmail.com>

I finally got around to updating the import engine draft PEP and
publishing it on python.org: http://www.python.org/dev/peps/pep-0406/

I think this is a direction we want to move in eventually, but I'm not in
any great hurry (in particular, I don't believe any effort should be
expended on an import.c based version, so bootstrapping importlib as
the standard import mechanism is a blocking dependency*). If it
doesn't make 3.3 (and there's a fair chance it won't, since I have a
few other things to work on that I think will benefit more people in
the near term), then 3.4 isn't that far away.

*Brett: do you have a public Hg repo for working on the importlib
bootstrapping effort?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From martin at v.loewis.de  Sun Nov 13 09:28:22 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 13 Nov 2011 09:28:22 +0100
Subject: [Import-SIG] Import engine PEP up on python.org
In-Reply-To: <CADiSq7cGkVds8Z_ikw6UKFiq6kt_Z=Q3qiv1ShY2NpoQLv33yw@mail.gmail.com>
References: <CADiSq7cGkVds8Z_ikw6UKFiq6kt_Z=Q3qiv1ShY2NpoQLv33yw@mail.gmail.com>
Message-ID: <4EBF7FA6.3000007@v.loewis.de>

On 13.11.2011 06:12, Nick Coghlan wrote:
> I finally got around to updating the import engine draft PEP and
> publishing it on python.org: http://www.python.org/dev/peps/pep-0406/

I think the rationale section needs to be improved. In fact, I still
don't understand what the objective of this API is (I do understand what
it achieves, but it's unclear why having that is desirable, and for what
applications).

I notice that there is overlap both with multiple subinterpreters,
and with restricted execution. It hints at providing both, but actually
provides neither.

I think the long-term solution really should be proper support for
subinterpreters, where there is no global state in C at all. Extension
modules already can achieve this through the PEP 3121 API (even though
few modules actually do).

If the objective is to have more of the import machinery implemented in
Python, then making importlib the import machinery might be best.

If the objective is to allow hooks into the import procedure, it would
be best to just provide the hooks. OTOH, PEP 302 already defined hooks,
and it seems that people are happy with these.

Regards,
Martin

From ncoghlan at gmail.com  Sun Nov 13 11:21:09 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 13 Nov 2011 20:21:09 +1000
Subject: [Import-SIG] Import engine PEP up on python.org
In-Reply-To: <4EBF7FA6.3000007@v.loewis.de>
References: <CADiSq7cGkVds8Z_ikw6UKFiq6kt_Z=Q3qiv1ShY2NpoQLv33yw@mail.gmail.com>
	<4EBF7FA6.3000007@v.loewis.de>
Message-ID: <CADiSq7cMY_GCkGnBZ9CSpiSyNmneNQFB-m15jQXoOyX_L4UDFw@mail.gmail.com>

On Sun, Nov 13, 2011 at 6:28 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> On 13.11.2011 06:12, Nick Coghlan wrote:
>> I finally got around to updating the import engine draft PEP and
>> publishing it on python.org: http://www.python.org/dev/peps/pep-0406/
>
> I think the rationale section needs to be improved. In fact, I still
> don't understand what the objective of this API is (I do understand what
> it achieves, but it's unclear why having that is desirable, and for what
> applications).

It's desirable for the same reason *any* form of object-oriented
encapsulation is desirable: because it makes it easier to *think*
about the problem and manage interdependencies between state updates.
I didn't realise the merits of OO designs needed to be justified - I
figured the list of 6 pieces of interdependent process global state
spoke for itself.

> I notice that there is overlap both with multiple subinterpreters,
> and with restricted execution. It hints at providing both, but actually
> provides neither.

It doesn't claim to provide either - its sole aim is to provide a
relatively lightweight mechanism to selectively adjust elements of the
import system (e.g. adding to sys.path when importing plugins, but
leaving it alone otherwise).

But having the import state better encapsulated would make it easier
to improve the isolation of subinterpreters so that they aren't
sharing Python modules, even if they still share extension modules
(you could put a single pointer to the import engine on the
interpreter state rather than storing it in sys the way we do now).

> I think the long-term solution really should be proper support for
> subinterpreters, where there is no global state in C at all. Extension
> modules already can achieve this through the PEP 3121 API (even though
> few modules actually do).
>
> If the objective is to have more of the import machinery implemented in
> Python, then making importlib the import machinery might be best.

Guido already approved (in principle) that change - this PEP would
actually depend on that being done first (because I think trying to
build this API directly on top of import.c would be a complete waste
of time).

> If the objective is to allow hooks into the import procedure, it would
> be best to just provide the hooks. OTOH, PEP 302 already defined hooks,
> and it seems that people are happy with these.

No, they're not. Yes, the hooks are *usable*, but they're damn hard to
comprehend. When even the *experts* hate messing with a subsystem,
it's a hint that there's something wrong with the way it is set up. In
this case, I firmly believe a big part of the problem is that the
import system is a complex, interdependent mechanism, but there's no
coherence to the architecture. It's as if the whole thing was written
in C from an architectural point of view, but without even bothering
to create a dedicated structure to hold the state.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From martin at v.loewis.de  Sun Nov 13 12:36:52 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 13 Nov 2011 12:36:52 +0100
Subject: [Import-SIG] Import engine PEP up on python.org
In-Reply-To: <CADiSq7cMY_GCkGnBZ9CSpiSyNmneNQFB-m15jQXoOyX_L4UDFw@mail.gmail.com>
References: <CADiSq7cGkVds8Z_ikw6UKFiq6kt_Z=Q3qiv1ShY2NpoQLv33yw@mail.gmail.com>	<4EBF7FA6.3000007@v.loewis.de>
	<CADiSq7cMY_GCkGnBZ9CSpiSyNmneNQFB-m15jQXoOyX_L4UDFw@mail.gmail.com>
Message-ID: <4EBFABD4.1030207@v.loewis.de>

> It's desirable for the same reason *any* form of object-oriented
> encapsulation is desirable: because it makes it easier to *think*
> about the problem and manage interdependencies between state updates.

I guess I'm -1 on that PEP then. If it introduces complications just
for the sake of some presumed simplification, it's not worth it.

> I didn't realise the merits of OO designs needed to be justified - I
> figured the list of 6 pieces of interdependent process global state
> spoke for itself.

Perhaps I'm challenging the specific choice of classes then: I would
find it completely reasonable to move all of this into the interpreter
state, as I think it's fine that this "global" state is unique to the
interpreter. There is only a single __import__ builtin, and the
objective of the import statement is to make a change to the "global"
state (scoped to the interpreter).

>> I notice that there is overlap both with multiple subinterpreters,
>> and with restricted execution. It hints at providing both, but actually
>> provides neither.
> 
> It doesn't claim to provide either - it's sole aim is to provide a
> relatively lightweight mechanism to selectively adjust elements of the
> import system (e.g. adding to sys.path when importing plugins, but
> leaving it alone otherwise).

Ok - that might be a use case. However, I'm skeptical that this PEP is
good at achieving that objective - as you notice, there is the challenge
of recursive imports.

I would rather prefer to make such variables per-thread, or, rather
"per context". Something like

with sys.extended_path(directory):
  load_plugin()

This would extend the path for all code run within the context of the
with statement, but not elsewhere. As an implementation strategy, the
thread state would be able to override the global variables, in a
stacked (nested) manner. The exact list of variables that can be
overridden needs to be carefully considered - for example, I would
still view a single sys.modules as important in that use case.
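
A very rough sketch of the basic shape (sys.extended_path doesn't exist;
this version just mutates sys.path for the duration of the block and has
none of the per-thread, stacked overriding described above):

    import sys
    from contextlib import contextmanager

    @contextmanager
    def extended_path(directory):
        sys.path.insert(0, directory)
        try:
            yield
        finally:
            sys.path.remove(directory)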


> But having the import state better encapsulated would make it easier
> to improve the isolation of subinterpreters so that they aren't
> sharing Python modules, even if they still share extension modules

That already works, no? Subinterpreters don't share Python modules
(that's about the only feature of subinterpreters that actually
works).

> No, they're not. Yes, the hooks are *usable*, but they're damn hard to
> comprehend. When even the *experts* hate messing with a subsystem,
> it's a hint that there's something wrong with the way it is set up. In
> this case, I firmly believe a big part of the problem is that the
> import system is a complex, interdependent mechanism, but there's no
> coherence to the architecture. It's as if the whole thing was written
> in C from an architectural point of view, but without even bothering
> to create a dedicated structure to hold the state.

I agree that the import system is difficult to understand, and the
PEP 302 hooks in particular. I mightily disagree that the cause of
these difficulties is the global state used in the implementation.
It's rather the order in which things are called, and how they interact,
which makes it difficult to understand. Adding an optional keyword
argument to some of the functions is surely no simplification.

Regards,
Martin

From ericsnowcurrently at gmail.com  Sun Nov 13 19:44:51 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sun, 13 Nov 2011 11:44:51 -0700
Subject: [Import-SIG] Import engine PEP up on python.org
In-Reply-To: <CADiSq7cMY_GCkGnBZ9CSpiSyNmneNQFB-m15jQXoOyX_L4UDFw@mail.gmail.com>
References: <CADiSq7cGkVds8Z_ikw6UKFiq6kt_Z=Q3qiv1ShY2NpoQLv33yw@mail.gmail.com>
	<4EBF7FA6.3000007@v.loewis.de>
	<CADiSq7cMY_GCkGnBZ9CSpiSyNmneNQFB-m15jQXoOyX_L4UDFw@mail.gmail.com>
Message-ID: <CALFfu7DTw_sKKqCmdrMQ2WmiGpOoWzPis1XTn16wKDvOjGYcBQ@mail.gmail.com>

On Nov 13, 2011 3:21 AM, "Nick Coghlan" <ncoghlan at gmail.com> wrote:
>
On Sun, Nov 13, 2011 at 6:28 PM, "Martin v. Löwis" <martin at v.loewis.de>
wrote:
> > On 13.11.2011 06:12, Nick Coghlan wrote:
> >> I finally got around to updating the import engine draft PEP and
> >> publishing it on python.org: http://www.python.org/dev/peps/pep-0406/
> >
> > I think the rationale section needs to be improved. In fact, I still
> > don't understand what the objective of this API is (I do understand what
> > it achieves, but it's unclear why having that is desirable, and for what
> > applications).
>
> It's desirable for the same reason *any* form of object-oriented
> encapsulation is desirable: because it makes it easier to *think*
> about the problem and manage interdependencies between state updates.
> I didn't realise the merits of OO designs needed to be justified - I
> figured the list of 6 pieces of interdependent process global state
> spoke for itself.
>
> > I notice that there is overlap both with multiple subinterpreters,
> > and with restricted execution. It hints at providing both, but actually
> > provides neither.
>
> It doesn't claim to provide either - it's sole aim is to provide a
> relatively lightweight mechanism to selectively adjust elements of the
> import system (e.g. adding to sys.path when importing plugins, but
> leaving it alone otherwise).
>
> But having the import state better encapsulated would make it easier
> to improve the isolation of subinterpreters so that they aren't
> sharing Python modules, even if they still share extension modules
> (you could put a single pointer to the import engine on the
> interpreter state rather than storing it in sys the way we do now).
>
> > I think the long-term solution really should be proper support for
> > subinterpreters, where there is no global state in C at all. Extension
> > modules already can achieve this through the PEP 3121 API (even though
> > few modules actually do).
> >
> > If the objective is to have more of the import machinery implemented in
> > Python, then making importlib the import machinery might be best.
>
> Guido already approved (in principle) that change - this PEP would
> actually depend on that being done first (because I think trying to
> build this API directly on top of import.c would be a complete waste
> of time).
>
> > If the objective is to allow hooks into the import procedure, it would
> > be best to just provide the hooks. OTOH, PEP 302 already defined hooks,
> > and it seems that people are happy with these.
>
> No, they're not. Yes, the hooks are *usable*, but they're damn hard to
> comprehend. When even the *experts* hate messing with a subsystem,
> it's a hint that there's something wrong with the way it is set up.

This is the big motivator for my talk proposal at the next PyCon, on
getting the most out of Python imports.  They're woefully under-utilized
IMHO, exactly because the import machinery is generally poorly understood.

-eric

> In
> this case, I firmly believe a big part of the problem is that the
> import system is a complex, interdependent mechanism, but there's no
> coherence to the architecture. It's as if the whole thing was written
> in C from an architectural point of view, but without even bothering
> to create a dedicated structure to hold the state.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> _______________________________________________
> Import-SIG mailing list
> Import-SIG at python.org
> http://mail.python.org/mailman/listinfo/import-sig

From ncoghlan at gmail.com  Mon Nov 14 02:27:11 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 14 Nov 2011 11:27:11 +1000
Subject: [Import-SIG] Import engine PEP up on python.org
In-Reply-To: <4EBFABD4.1030207@v.loewis.de>
References: <CADiSq7cGkVds8Z_ikw6UKFiq6kt_Z=Q3qiv1ShY2NpoQLv33yw@mail.gmail.com>
	<4EBF7FA6.3000007@v.loewis.de>
	<CADiSq7cMY_GCkGnBZ9CSpiSyNmneNQFB-m15jQXoOyX_L4UDFw@mail.gmail.com>
	<4EBFABD4.1030207@v.loewis.de>
Message-ID: <CADiSq7fL3vNH7buz8n9DtgNaDou1E2nkzKN-cF0ofTVjuDenrA@mail.gmail.com>

On Sun, Nov 13, 2011 at 9:36 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> It's desirable for the same reason *any* form of object-oriented
>> encapsulation is desirable: because it makes it easier to *think*
>> about the problem and manage interdependencies between state updates.
>
> I guess I'm -1 on that PEP then. If it introduces complications just
> for the sake of some presumed simplification, it's not worth it.

I think you're right on the PEP *as it stands* - I don't think it's an
improvement on the status quo yet. However, I think it's a useful
starting point for using the tools we have available (i.e. classes and
context managers) to make further progress in bringing the complexity
under control and making the import system as a whole less
intimidating and magical.

>> I didn't realise the merits of OO designs needed to be justified - I
>> figured the list of 6 pieces of interdependent process global state
>> spoke for itself.
>
> Perhaps I'm challenging the specific choice of classes then: I would
> find it completely reasonable to move all of this into the interpreter
> state, as I think it's fine that this "global" state is unique to the
> interpreter. There is only a single __import__ builtin, and the
> objective of the import statement is to make a change to the "global"
> state (scoped with the interpreter).

Yes, I think that's a reasonable way of looking at it. I believe
there's merit in partitioning off the import state from everything
else, but the basic idea would also be served just by moving
everything to the interpreter state without adding a new class level
to the mix (in fact, as it turns out, 'sys' as a whole is effectively
part of the interpreter state, so this is already true to some
degree).

The analogy that occurred to me after reading your reply was the past
migration from the separate "last_traceback", "last_type",
"last_value" attributes in the sys module to the consolidated triple
returned by sys.exc_info().

>>> I notice that there is overlap both with multiple subinterpreters,
>>> and with restricted execution. It hints at providing both, but actually
>>> provides neither.
>>
>> It doesn't claim to provide either - it's sole aim is to provide a
>> relatively lightweight mechanism to selectively adjust elements of the
>> import system (e.g. adding to sys.path when importing plugins, but
>> leaving it alone otherwise).
>
> Ok - that might be a use case. However, I'm skeptical that this PEP is
> good at achieving that objective - as you notice, there is the challenge
> of recursive imports.
>
> I would rather prefer to make such variables per-thread, or, rather
> "per context". Something like
>
> with sys.extended_path(directory):
>     load_plugin()
>
> This would extend the path for all code run within the context of the
> with statement, but not elsewhere. As an implementation strategy, the
> thread state would be able to override the global variables, in a
> stacked (nested) manner. The exact list of variables that can be
> overridden needs to be carefully considered - for example, I would
> still view a single sys.modules as important in that use case.

One advantage of the OO model is that it allows such decisions to be
made on a case-by-case basis - as the PEP shows, you can use
properties to control which attributes are isolated from the process
global state and which still perform global modifications.
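
To make that concrete, here's a rough sketch of the property idea (the
class and attribute choices are purely illustrative placeholders, not
anything from the PEP):

    import sys

    class EngineSketch:
        """Hypothetical engine: isolates 'path', keeps 'modules' shared."""

        def __init__(self, path=None):
            # Isolated, per-engine search path
            self.path = list(path) if path is not None else list(sys.path)

        @property
        def modules(self):
            # Deliberately *not* isolated: reads and writes go straight
            # through to the process-global module cache
            return sys.modules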

>> But having the import state better encapsulated would make it easier
>> to improve the isolation of subinterpreters so that they aren't
>> sharing Python modules, even if they still share extension modules
>
> That already works, no? Subinterpreters don't share Python modules
> (that's about the only feature about subinterpreters that actually
> works).

You're quite right - I forgot that the subinterpreter initialisation
already takes care to ensure that the subinterpreter gets a new copy
of the sys module, and then reinitialises the import state for that
new copy. So I guess this proposal can be seen as an intermediate
level of isolation that is accessible from Python code, without
requiring actually swapping out the interpreter state the way
Py_NewInterpreter() does.

>> No, they're not. Yes, the hooks are *usable*, but they're damn hard to
>> comprehend. When even the *experts* hate messing with a subsystem,
>> it's a hint that there's something wrong with the way it is set up. In
>> this case, I firmly believe a big part of the problem is that the
>> import system is a complex, interdependent mechanism, but there's no
>> coherence to the architecture. It's as if the whole thing was written
>> in C from an architectural point of view, but without even bothering
>> to create a dedicated structure to hold the state.
>
> I agree that the import system is difficult to understand, and the
> PEP 302 hooks in particular. I mightily disagree that the cause of
> these difficulties is the global state used in the implementation.
> It's rather the order in which things are called, and how they interact,
> which makes it difficult to understand.

I don't think there's any one thing that makes it so hard to
understand - I think it's a lot of smaller things stacking on top of
each other. Off the top of my head:
- 6 pieces of interdependent global state in 'sys' and 'imp'
- distributed state in package '__path__' attributes
- lack of builtin PEP 302 support for the standard filesystem import
mechanism (hence the undocumented emulation inside pkgutil)
- difficulty of tweaking pieces of the import algorithm while
preserving the rest without copying a lot of code
- scattered APIs (across imp, importlib and pkgutil) for correctly
handling import state updates and data driven imports

importlib is a big step forward on several of those fronts - if Brett
wants/needs help bootstrapping it in as the main import
implementation, then that's a far more important task than adding a
top-level object-oriented API.

However, a top level OO API will still be beneficial in at least a
couple of respects:
  - it becomes significantly easier to replace *elements* of the
import mechanism, beyond the hooks provided by PEP 302. Specifically,
you can *subclass* the engine implementation and only replace the
parts you want to change (see the sketch after this list).
  - you can provide convenience functions that handle multi-stage
updates to the import state in a consistent fashion (cf. many of the
details in PEP 402 regarding correctly updating package paths as
sys.path changes).
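
As a rough sketch of the subclassing point above (BaseEngine and
PluginEngine are purely hypothetical names, not the API from the PEP):

    import sys

    class BaseEngine:
        """Stand-in for an import engine wrapping state and algorithm."""

        def __init__(self, path=None):
            self.path = list(path) if path is not None else list(sys.path)

        def find_module(self, fullname):
            # Stand-in for the engine's default search step
            return None

    class PluginEngine(BaseEngine):
        """Replaces only the search step; everything else is inherited."""

        def __init__(self, plugin_dirs):
            super().__init__(path=list(plugin_dirs) + sys.path)

        def find_module(self, fullname):
            # e.g. consult a plugin registry first, then fall back
            return super().find_module(fullname)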

> Adding an optional keyword
> argument to some of the function is surely no simplification.

Yeah, that's by far the weakest part of the idea so far - figuring out
how to integrate it with the existing PEP 302 mechanisms. As an
initial step, I'm now thinking we may be able to do something based on
context management and the import lock that is even simpler than going
to thread-local storage: offer a context manager as part of the engine
API that acquires the import lock, swaps out all of the import-related
state in sys for the engine's own state, then reverses the process at
the end. Something like:

    # (assumes: import imp, sys; from contextlib import contextmanager)
    IMPORT_STATE_ATTRS = ("path", "modules", "path_importer_cache",
                          "meta_path", "path_hooks")

    @contextmanager
    def import_context(self):
        # Hold the import lock for the duration of the state swap
        imp.acquire_lock()
        try:
            orig_state = tuple(getattr(sys, attr)
                               for attr in IMPORT_STATE_ATTRS)
            new_state = tuple(getattr(self, attr)
                              for attr in IMPORT_STATE_ATTRS)
            restore_state = []
            try:
                # Install the engine's state into sys, remembering the
                # original values so they can be put back afterwards
                for attr, new_value, orig_value in zip(IMPORT_STATE_ATTRS,
                                                       new_state, orig_state):
                    setattr(sys, attr, new_value)
                    restore_state.append((attr, orig_value))
                yield self
            finally:
                for attr, orig_value in restore_state:
                    setattr(sys, attr, orig_value)
        finally:
            imp.release_lock()

We would have to go through the interpreter and eliminate all of the
current locations where 'sys' gets bypassed to make that work, though
(e.g. most direct access to interp->modules from C code would need to
be updated to go through 'sys' instead).
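
With that in place, a per-engine import could be a minimal wrapper
along these lines (engine_import is just a sketch name, and it assumes
an engine object exposing the import_context() manager above):

    import importlib

    def engine_import(engine, name):
        # The engine's state is visible in sys only for the duration
        # of this import
        with engine.import_context():
            return importlib.import_module(name)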

The bar for the PEP really needs to be set at "existing importers and
loaders work without modification" (so long as they're not caching sys
attributes when they really shouldn't be).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon Nov 14 02:46:14 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 14 Nov 2011 11:46:14 +1000
Subject: [Import-SIG] Import engine PEP up on python.org
In-Reply-To: <CADiSq7fL3vNH7buz8n9DtgNaDou1E2nkzKN-cF0ofTVjuDenrA@mail.gmail.com>
References: <CADiSq7cGkVds8Z_ikw6UKFiq6kt_Z=Q3qiv1ShY2NpoQLv33yw@mail.gmail.com>
	<4EBF7FA6.3000007@v.loewis.de>
	<CADiSq7cMY_GCkGnBZ9CSpiSyNmneNQFB-m15jQXoOyX_L4UDFw@mail.gmail.com>
	<4EBFABD4.1030207@v.loewis.de>
	<CADiSq7fL3vNH7buz8n9DtgNaDou1E2nkzKN-cF0ofTVjuDenrA@mail.gmail.com>
Message-ID: <CADiSq7dKnMPg7CtX8xo1XZ767+dPauV6y7Z=M_F1igPK81_FMQ@mail.gmail.com>

On Mon, Nov 14, 2011 at 11:27 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> We would have to go through the interpreter and eliminate all of the
> current locations where 'sys' gets bypassed to make that work, though
> (e.g. most direct access to interp->modules from C code would need to
> be updated to go through 'sys' instead).

Alternatively, we could decide to skip supporting isolation of
sys.modules altogether in the initial incarnation - that would also
deal with the builtins and extension module problem.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From brett at python.org  Mon Nov 14 17:25:17 2011
From: brett at python.org (Brett Cannon)
Date: Mon, 14 Nov 2011 11:25:17 -0500
Subject: [Import-SIG] Import engine PEP up on python.org
In-Reply-To: <CADiSq7cGkVds8Z_ikw6UKFiq6kt_Z=Q3qiv1ShY2NpoQLv33yw@mail.gmail.com>
References: <CADiSq7cGkVds8Z_ikw6UKFiq6kt_Z=Q3qiv1ShY2NpoQLv33yw@mail.gmail.com>
Message-ID: <CAP1=2W48uYUYKiwqbFniL+FgvkRa=2aZzbc7PYTkXV6SQcstpQ@mail.gmail.com>

On Sun, Nov 13, 2011 at 00:12, Nick Coghlan <ncoghlan at gmail.com> wrote:
[SNIP]
>
> *Brett: do you have a public Hg repo for working on the importlib
> bootstrapping effort?
>

Yep. If you look at http://hg.python.org/sandbox/bcannon/ you will
find a bootstrap_importlib branch. It's been about two months since I
had a chance to work on it, so it needs to be merged with default.

In the branch you will find a FAILING file which contains the test
names of tests that are failing and a comment as to why (one I have
not fully dived into and the test_pydoc failure can be fixed once
ImportError has an attribute specifying what module it couldn't
import). You can pass the file to regrtest to easily test the known
failures.

Otherwise comments on what is left to be done can be found in
Python/pythonrun.c (although the comment about zipimport is
out-of-date; I fixed that in rev 72162:b4edd0d9fce6). At this point
all that is left (I think) is dealing with:

1. _io wanting to import os at module initialization time (I suspect
it can just be a post-importlib call in pythonrun.c to setup)
2. exposing the APIs that are added in importlib.__init__ (case_ok
from import.c and reading/writing longs from marshal; need to add a
comment to pythonrun.c about this)
3. adding a build rule to freeze importlib for importation (thinking
it might not be best to do it automatically to make it easier to fix
bugs using a known, good version of importlib, but that's still up in
the air)

IOW nothing crazy or insurmountable. I'm still hoping to be damn close
by PyCon, but who knows. I have just moved to Toronto so at least my
life should be stabilizing, but I am starting on a new team so I don't
know what kind of ramp-up that will entail. I have absolutely no
issues with receiving help on this; importlib if people want to help
(and the C code sans build stuff to get the freezing working shouldn't
be nuts, so people like Eric can help if they want =).

>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> _______________________________________________
> Import-SIG mailing list
> Import-SIG at python.org
> http://mail.python.org/mailman/listinfo/import-sig

From ericsnowcurrently at gmail.com  Mon Nov 14 18:51:56 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 14 Nov 2011 10:51:56 -0700
Subject: [Import-SIG] Import engine PEP up on python.org
In-Reply-To: <CAP1=2W48uYUYKiwqbFniL+FgvkRa=2aZzbc7PYTkXV6SQcstpQ@mail.gmail.com>
References: <CADiSq7cGkVds8Z_ikw6UKFiq6kt_Z=Q3qiv1ShY2NpoQLv33yw@mail.gmail.com>
	<CAP1=2W48uYUYKiwqbFniL+FgvkRa=2aZzbc7PYTkXV6SQcstpQ@mail.gmail.com>
Message-ID: <CALFfu7CxYU3R9HpTDWPrO1HbTxFX+mSKPuuCndOwic-Va5yDXQ@mail.gmail.com>

On Mon, Nov 14, 2011 at 9:25 AM, Brett Cannon <brett at python.org> wrote:
> IOW nothing crazy or insurmountable. I'm still hoping to be damn close
> by PyCon, but who knows. I have just moved to Toronto so at least my
> life should be stabilizing, but I am starting on a new team so I don't
> know what kind of ramp-up that will entail. I have absolutely no
> issues with receiving help on this; importlib if people want to help
> (and the C code sans build stuff to get the freezing working shouldn't
> be nuts, so people like Eric can help if they want =).

I'd be glad to. :)  I've cloned the repo[1] so I can work on it.  Let
me know if that's a problem.  Any work I do I'll track on the existing
ticket[2].

Between the PyCon program committee and the talk proposals I have in,
I won't have time to work on this for a little while; but I agree that
it'd be good to get this done sooner, rather than later.

-eric

[1] https://bitbucket.org/ericsnowcurrently/bcannon_sandbox
[2] http://bugs.python.org/issue2377

From brett at python.org  Mon Nov 14 19:02:48 2011
From: brett at python.org (Brett Cannon)
Date: Mon, 14 Nov 2011 13:02:48 -0500
Subject: [Import-SIG] Import engine PEP up on python.org
In-Reply-To: <CALFfu7CxYU3R9HpTDWPrO1HbTxFX+mSKPuuCndOwic-Va5yDXQ@mail.gmail.com>
References: <CADiSq7cGkVds8Z_ikw6UKFiq6kt_Z=Q3qiv1ShY2NpoQLv33yw@mail.gmail.com>
	<CAP1=2W48uYUYKiwqbFniL+FgvkRa=2aZzbc7PYTkXV6SQcstpQ@mail.gmail.com>
	<CALFfu7CxYU3R9HpTDWPrO1HbTxFX+mSKPuuCndOwic-Va5yDXQ@mail.gmail.com>
Message-ID: <CAP1=2W7M-vojW0EDT9-4Tk_bvdVjkRZSF4TDfeauoA4c5fspVQ@mail.gmail.com>

On Mon, Nov 14, 2011 at 12:51, Eric Snow <ericsnowcurrently at gmail.com> wrote:

> On Mon, Nov 14, 2011 at 9:25 AM, Brett Cannon <brett at python.org> wrote:
> > IOW nothing crazy or insurmountable. I'm still hoping to be damn close
> > by PyCon, but who knows. I have just moved to Toronto so at least my
> > life should be stabilizing, but I am starting on a new team so I don't
> > know what kind of ramp-up that will entail. I have absolutely no
> > issues with receiving help on this; importlib if people want to help
> > (and the C code sans build stuff to get the freezing working shouldn't
> > be nuts, so people like Eric can help if they want =).
>
> I'd be glad to. :)  I've cloned the repo[1] so I can work on it.  Let
> me know if that's a problem.


Not at all! Reason we moved to a DVCS was so people could do what you are
doing easily.


>  Any work I do I'll track on the existing
> ticket[2].
>

Wow, that ticket will be 4 years old come PyCon 2012. =P


>
> Between the PyCon program committee and the talk proposals I have in,
> I won't have time to work on this for a little while; but I agree that
> it'd be good to get this done sooner, rather than later.
>

No rush. =) I have had no time to do a single review for PyCon and the PC
comes first since that has a hard deadline.


>
> -eric
>
> [1] https://bitbucket.org/ericsnowcurrently/bcannon_sandbox
> [2] http://bugs.python.org/issue2377
>

From ncoghlan at gmail.com  Wed Nov 16 07:29:52 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 Nov 2011 16:29:52 +1000
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
Message-ID: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>

One of the fixes PEP 395 (module aliasing) proposes is to make running
modules inside packages by filename work correctly (i.e. without
breaking relative imports and without getting the directory where the
module lives directly on sys.path which can lead to unexpected name
clashes). The PEP currently states [1] that this can be made to work
with both PEP 382 and PEP 402.

In current Python, fixing this just involves checking for a colocated
__init__.py file. If we find one, then we work our way up the
directory hierarchy until we find a directory without an __init__.py
file, put *that* on sys.path, then (effectively) rewrite the command
line as if the -m switch had been used.

The extension to the current version of PEP 382 is clear - we just
accept both an __init__.py file and a .pyp extension as indicating
"this is part of a Python package", but otherwise the walk back up the
filesystem hierarchy to decide which directory to add to sys.path
remains unchanged.

However, I'm no longer convinced that this concept can actually be
made to work in the context of PEP 402:

1. We can't use sys.path, since we're trying to figure out which
directory we want to *add* to sys.path
2. We can't use "contains a Python module", since PEP 402 allows
directories inside packages that only contain subpackages (only the
leaf directories are required to contain valid Python modules), so we
don't know the significance of an empty directory without already
knowing what is on sys.path!

So, without a clear answer to the question of "from module X, inside
package (or package portion) Y, find the nearest parent directory that
should be placed on sys.path" in a PEP 402 based world, I'm switching
to supporting PEP 382 as my preferred approach to namespace packages.
In this case, I think "explicit is better than implicit" means, "given
only a filesystem hierarchy, you should be able to figure out the
Python package hierarchy it contains". Only explicit markers (either
files or extensions) let you do that - with PEP 402, the filesystem
doesn't contain enough information to figure it out, you need to also
know the contents of sys.path.

Regards,
Nick.

[1] http://www.python.org/dev/peps/pep-0395/#fixing-direct-execution-inside-packages
-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ericsnowcurrently at gmail.com  Wed Nov 16 09:15:03 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 16 Nov 2011 01:15:03 -0700
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
Message-ID: <CALFfu7BbQyyGpc94Bb2YUtBj3B3ayqBOMK9iryRy1k43sCoLug@mail.gmail.com>

On Tue, Nov 15, 2011 at 11:29 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> One of the fixes PEP 395 (module aliasing) proposes is to make running
> modules inside packages by filename work correctly (i.e. without
> breaking relative imports and without getting the directory where the
> module lives directly on sys.path which can lead to unexpected name
> clashes). The PEP currently states [1] that this can be made to work
> with both PEP 382 and PEP 402
>
> In current Python, fixing this just involves checking for a colocated
> __init__.py file. If we find one, then we work our way up the
> directory hierarchy until we find a directory without an __init__.py
> file, put *that* on sys.path, then (effectively) rewrite the command
> line as if the -m switch had been used.
>
> The extension to the current version of PEP 382 is clear - we just
> accept both an __init__.py file and a .pyp extension as indicating
> "this is part of a Python package", but otherwise the walk back up the
> filesystem hierarchy to decide which directory to add to sys.path
> remains unchanged.
>
> However, I'm no longer convinced that this concept can actually be
> made to work in the context of PEP 402:
>
> 1. We can't use sys.path, since we're trying to figure out which
> directory we want to *add* to sys.path
> 2. We can't use "contains a Python module", since PEP 402 allows
> directories inside packages that only contain subpackages (only the
> leaf directories are required to contain valid Python modules), so we
> don't know the significance of an empty directory without already
> knowing what is on sys.path!
>
> So, without a clear answer to the question of "from module X, inside
> package (or package portion) Y, find the nearest parent directory that
> should be placed on sys.path" in a PEP 402 based world, I'm switching
> to supporting PEP 382 as my preferred approach to namespace packages.
> In this case, I think "explicit is better than implicit" means, "given
> only a filesystem hierarchy, you should be able to figure out the
> Python package hierarchy it contains". Only explicit markers (either
> files or extensions) let you do that - with PEP 402, the filesystem
> doesn't contain enough information to figure it out, you need to also
> know the contents of sys.path.

Ouch.  What about the following options?

Indicator for the top-level package?  No
Leverage __pycache__?  No

Merge in the idea from PEP 382 of special directory names?  To borrow
an example from PEP 3147:

alpha.pyp/
    one.py
    two.py
    beta.py
    beta.pyp/
        three.py
        four.py

So package directories are explicitly marked but PEP 402 otherwise
continues as-is.  I'll have to double-check, but I don't think we
tried this angle already.

-eric


>
> Regards,
> Nick.
>
> [1] http://www.python.org/dev/peps/pep-0395/#fixing-direct-execution-inside-packages
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> _______________________________________________
> Import-SIG mailing list
> Import-SIG at python.org
> http://mail.python.org/mailman/listinfo/import-sig
>

From pje at telecommunity.com  Wed Nov 16 16:08:56 2011
From: pje at telecommunity.com (PJ Eby)
Date: Wed, 16 Nov 2011 10:08:56 -0500
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
Message-ID: <CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>

On Wed, Nov 16, 2011 at 1:29 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> So, without a clear answer to the question of "from module X, inside
> package (or package portion) Y, find the nearest parent directory that
> should be placed on sys.path" in a PEP 402 based world, I'm switching
> to supporting PEP 382 as my preferred approach to namespace packages.
> In this case, I think "explicit is better than implicit" means, "given
> only a filesystem hierarchy, you should be able to figure out the
> Python package hierarchy it contains". Only explicit markers (either
> files or extensions) let you do that - with PEP 402, the filesystem
> doesn't contain enough information to figure it out, you need to also
> know the contents of sys.path.
>

After spending an hour or so reading through PEP 395 and trying to grok
what it's doing, I actually come to the opposite conclusion: that PEP 395
is violating the ZofP by both guessing, and not encouraging One Obvious Way
of invoking scripts-as-modules.

For example, if somebody adds an __init__.py to their project directory,
suddenly scripts that worked before will behave differently under PEP 395,
creating a strange bit of "spooky action at a distance".  (And yes, people
add __init__.py files to their projects in odd places -- being setuptools
maintainer, you get to see a LOT of weird looking project layouts.)

While I think the __qname__ idea is fine, and it'd be good to have a way to
avoid aliasing main (suggestion for how included below), I think that
relative imports failing from inside a main module should offer an error
message suggesting you use "-m" if you're running a script that's within a
package, since that's the One Obvious Way of running a script that's also a
module.  (Albeit not obvious unless you're Dutch.  ;-) )

For the import aliasing case, AFAICT it's only about cases where __name__
== '__main__', no?  Why not just save the file/importer used for __main__,
and then have the import machinery check whether a module being imported is
about to alias __main__?  For that, you don't need to know in *advance*
what the qualified name of __main__ is - you just spot it the first time
somebody re-imports it.

I think removing qname-guessing from PEP 395 (and replacing it with
instructive/google-able error messages) would be an unqualified
improvement, independent of what happens to PEPs 382 and 402.

From ericsnowcurrently at gmail.com  Wed Nov 16 19:21:22 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 16 Nov 2011 11:21:22 -0700
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
	<CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
Message-ID: <CALFfu7Bf4G4bDBDO7OweUY4eHAP49_tXCn1-6B1TmGonQjcXkQ@mail.gmail.com>

On Wed, Nov 16, 2011 at 8:08 AM, PJ Eby <pje at telecommunity.com> wrote:
> On Wed, Nov 16, 2011 at 1:29 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> So, without a clear answer to the question of "from module X, inside
>> package (or package portion) Y, find the nearest parent directory that
>> should be placed on sys.path" in a PEP 402 based world, I'm switching
>> to supporting PEP 382 as my preferred approach to namespace packages.
>> In this case, I think "explicit is better than implicit" means, "given
>> only a filesystem hierarchy, you should be able to figure out the
>> Python package hierarchy it contains". Only explicit markers (either
>> files or extensions) let you do that - with PEP 402, the filesystem
>> doesn't contain enough information to figure it out, you need to also
>> know the contents of sys.path.
>
> After spending an hour or so reading through PEP 395 and trying to grok what
> it's doing, I actually come to the opposite conclusion: that PEP 395 is
> violating the ZofP by both guessing, and not encouraging One Obvious Way of
> invoking scripts-as-modules.
> For example, if somebody adds an __init__.py to their project directory,
> suddenly scripts that worked before will behave differently under PEP 395,
> creating a strange bit of "spooky action at a distance".  (And yes, people
> add __init__.py files to their projects in odd places -- being setuptools
> maintainer, you get to see a LOT of weird looking project layouts.)
> While I think the __qname__ idea is fine, and it'd be good to have a way to
> avoid aliasing main (suggestion for how included below), I think that
> relative imports failing from inside a main module should offer an error
> message suggesting you use "-m" if you're running a script that's within a
> package, since that's the One Obvious Way of running a script that's also a
> module.  (Albeit not obvious unless you're Dutch.  ;-) )
> For the import aliasing case, AFAICT it's only about cases where __name__ ==
> '__main__', no?  Why not just save the file/importer used for __main__, and
> then have the import machinery check whether a module being imported is
> about to alias __main__?  For that, you don't need to know in *advance* what
> the qualified name of __main__ is - you just spot it the first time somebody
> re-imports it.
> I think removing qname-guessing from PEP 395 (and replacing it with
> instructive/google-able error messages) would be an unqualified improvement,
> independent of what happens to PEPs 382 and 402.

But which is more astonishing (POLA and all that): running your module
in Python, it behaves differently than when you import it (especially
__name__); or you add an __init__.py to a directory and your *scripts*
there start to behave differently?

When I was learning Python, it took quite a while before I realized
that modules are imported and scripts are passed at the commandline;
and to understand the whole __main__ thing.  It has always been a
pain, particularly when I wanted to just check a module really quickly
for errors.

However, lately I've actually taken to the idea that it's better to
write a test script that imports the module and running that, rather
than running the module itself.  But that came with the understanding
that the module doesn't go through the import machinery when you *run*
it, which I don't think is obvious, particularly to beginners.  So
Nick's solution, to me, is an appropriate concession to the reality
that most folks will expect Python to treat their modules like modules
and their scripts like scripts.

Still, this actually got me wishing there were a way to customize
script-running the same way you can customize import with __import__
and import hooks.

-eric


>
> _______________________________________________
> Import-SIG mailing list
> Import-SIG at python.org
> http://mail.python.org/mailman/listinfo/import-sig
>
>

From pje at telecommunity.com  Wed Nov 16 21:06:51 2011
From: pje at telecommunity.com (PJ Eby)
Date: Wed, 16 Nov 2011 15:06:51 -0500
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CALFfu7Bf4G4bDBDO7OweUY4eHAP49_tXCn1-6B1TmGonQjcXkQ@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
	<CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
	<CALFfu7Bf4G4bDBDO7OweUY4eHAP49_tXCn1-6B1TmGonQjcXkQ@mail.gmail.com>
Message-ID: <CALeMXf7gmyFuTJefWsmg0hS2k74QPDpRHLr-pP_oYck0Ai07nQ@mail.gmail.com>

On Wed, Nov 16, 2011 at 1:21 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:

> But which is more astonishing (POLA and all that): running your module
> in Python, it behaves differently than when you import it (especially
> __name__); or you add an __init__.py to a directory and your *scripts*
> there start to behave differently?
>

To me it seems that the latter is more astonishing because there's less
connection between your action and the result.  If you're running something
differently, it makes more sense that it acts differently, because you've
changed what you're *doing*.  In the scripts case, you haven't changed how
you run the scripts, and you haven't changed the scripts, so the change in
behavior seems to appear out of nowhere.


When I was learning Python, it took quite a while before I realized
> that modules are imported and scripts are passed at the commandline;
> and to understand the whole __main__ thing.


It doesn't seem to me that PEP 395 fixes this problem.  In order to
*actually* fix it, we'd need to have some sort of "package" statement like
in other languages - then you'd declare right there in the code what
package it's supposed to be part of.



>  It has always been a pain, particularly when I wanted to
>  just check a module really quickly for errors.
>

What, specifically, was a pain?  That information might be of more use in
determining a solution.

If you mean that you had other modules importing the module that was also
__main__, then I agree that having a solution for __main__-aliasing is a
good idea.  I just think it might be more cleanly fixed by checking whether
the __file__ of a to-be-imported module is going to end up matching
__main__.__file__, and if so, alias __main__ instead.



> However, lately I've actually taken to the idea that it's better to
> write a test script that imports the module and running that, rather
> than running the module itself.  But that came with the understanding
> that the module doesn't go through the import machinery when you *run*
> it, which I don't think is obvious, particularly to beginners.  So
> Nick's solution, to me, is an appropriate concession to the reality
> that most folks will expect Python to treat their modules like modules
> and their scripts like scripts.
>

You lost me there: if most people don't understand the difference, then why
are they expecting a difference?

From ncoghlan at gmail.com  Wed Nov 16 23:41:08 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 Nov 2011 08:41:08 +1000
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
	<CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
Message-ID: <CADiSq7cx7UEQ_OiHZEUKuJioAkeGLpNoCiMPiJOQxB64CNVvBA@mail.gmail.com>

On Thu, Nov 17, 2011 at 1:08 AM, PJ Eby <pje at telecommunity.com> wrote:
> On Wed, Nov 16, 2011 at 1:29 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> So, without a clear answer to the question of "from module X, inside
>> package (or package portion) Y, find the nearest parent directory that
>> should be placed on sys.path" in a PEP 402 based world, I'm switching
>> to supporting PEP 382 as my preferred approach to namespace packages.
>> In this case, I think "explicit is better than implicit" means, "given
>> only a filesystem hierarchy, you should be able to figure out the
>> Python package hierarchy it contains". Only explicit markers (either
>> files or extensions) let you do that - with PEP 402, the filesystem
>> doesn't contain enough information to figure it out, you need to also
>> know the contents of sys.path.
>
> After spending an hour or so reading through PEP 395 and trying to grok what
> it's doing, I actually come to the opposite conclusion: that PEP 395 is
> violating the ZofP by both guessing, and not encouraging One Obvious Way of
> invoking scripts-as-modules.
> For example, if somebody adds an __init__.py to their project directory,
> suddenly scripts that worked before will behave differently under PEP 395,
> creating a strange bit of "spooky action at a distance".  (And yes, people
> add __init__.py files to their projects in odd places -- being setuptools
> maintainer, you get to see a LOT of weird looking project layouts.)
>
> While I think the __qname__ idea is fine, and it'd be good to have a way to
> avoid aliasing main (suggestion for how included below), I think that
> relative imports failing from inside a main module should offer an error
> message suggesting you use "-m" if you're running a script that's within a
> package, since that's the One Obvious Way of running a script that's also a
> module.  (Albeit not obvious unless you're Dutch.  ;-) )

The -m switch is not always an adequate replacement for direct
execution, because it relies on the current working directory being
set correctly (or else the module to be executed being accessible via
sys.path, and there being nothing in the current directory that will
shadow modules that you want to import). Direct execution will always
have the advantage of allowing you more explicit control over all of
sys.path[0], sys.argv[0] and __main__.__file__. The -m switch, on the
other hand, will always set sys.path[0] to the empty string, which may
not be what you really want.

If the package directory markers are explicit (as they are now and as
they are in PEP 382), then PEP 395 isn't guessing - the mapping from
the filesystem layout to the Python module namespace is completely
unambiguous, since the directory added as sys.path[0] will always be
the first parent directory that isn't marked as a package directory:

    # Current rule
    sys.path[0] = os.path.abspath(os.path.dirname(__main__.__file__))

    # PEP 395 rule
    path0 = os.path.abspath(os.path.dirname(__main__.__file__))
    while is_package_dir(path0):
        path0 = os.path.dirname(path0)
    sys.path[0] = path0
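
For illustration, is_package_dir could be defined along these lines
(my own sketch, assuming the convention of accepting either an
__init__ marker or a .pyp suffix as the explicit package indicator):

    import os

    def is_package_dir(path):
        # Explicitly marked package directories only: either the
        # directory uses the .pyp suffix, or it contains an __init__ file
        if path.endswith(".pyp"):
            return True
        return any(os.path.exists(os.path.join(path, "__init__" + ext))
                   for ext in (".py", ".pyc", ".pyo"))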

In fact, both today and under PEP 382, we could fairly easily provide
a "runpy.split_path_module()" function that converts an arbitrary
filesystem path to the corresponding python module name and sys.path
entry:

    def _splitmodname(fspath):
        path_entry, fname = os.path.split(fspath)
        modname = os.path.splitext(fname)[0]
        return path_entry, modname

    # Given appropriate definitions for "is_module_or_package"
    # and "has_init_file"...
    def split_path_module(fspath):
        if not is_module_or_package(fspath):
            raise ValueError(
                "{!r} is not recognized as a Python module".format(fspath))
        path_entry, modname = _splitmodname(fspath)
        while path_entry.endswith(".pyp") or has_init_file(path_entry):
            path_entry, pkg_name = _splitmodname(path_entry)
            modname = pkg_name + '.' + modname
        return modname, path_entry
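
To illustrate the intended mapping (the layout below is hypothetical,
and assumes has_init_file() does what its name suggests):

    # Given a layout like:
    #
    #   /projects/
    #       pkg.pyp/
    #           sub/            (contains an __init__.py)
    #               mod.py
    #
    # the conversion would be:
    #
    #   split_path_module("/projects/pkg.pyp/sub/mod.py")
    #   -> ("pkg.sub.mod", "/projects")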

As far as the "one obvious way" criticism goes, I think the obvious
way (given PEP 395) is clear:

1. Do you have a filename? Just run it and Python will figure out
where it lives in the module namespace
2. Do you have a module name? Run it with the -m switch and Python
will figure out where it lives on the filesystem

runpy.run_path() corresponds directly to 1, runpy.run_module()
corresponds directly to 2.
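
In runpy terms that's roughly the following ("pkg/mod.py" and "pkg.mod"
are just placeholder names here):

    import runpy

    # 1. Filename in hand: run it as if it were the main script
    runpy.run_path("pkg/mod.py", run_name="__main__")

    # 2. Module name in hand: the equivalent of "python -m pkg.mod"
    runpy.run_module("pkg.mod", run_name="__main__", alter_sys=True)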

Currently, if you have a filename, just running it is sometimes the
*wrong thing to do*, because it may point inside a package directory.
But you have no easy way to tell if that is the case. Under PEP 402,
you simply *can't* tell, as the filesystem no longer contains enough
information to provide an unambiguous mapping to the Python module
namespace - instead, the intended mapping depends not only on the
filesystem contents, but also on the runtime configuration of
sys.path.

> For the import aliasing case, AFAICT it's only about cases where __name__ ==
> '__main__', no?  Why not just save the file/importer used for __main__, and
> then have the import machinery check whether a module being imported is
> about to alias __main__?  For that, you don't need to know in *advance* what
> the qualified name of __main__ is - you just spot it the first time somebody
> re-imports it.

Oh, I like that idea - once __main__.__qname__ is available, you could
just have a metapath hook along the lines of the following:

  # (assumes "import sys" at module level)
  class MainImporter:
      # Redirect any attempt to re-import the main module back to the
      # existing __main__ entry in sys.modules
      def __init__(self):
          main = sys.modules.get("__main__", None)
          self.main_qname = getattr(main, "__qname__", None)

      def find_module(self, fullname, path=None):
          if fullname == self.main_qname:
              return self
          return None

      def load_module(self, fullname):
          return sys.modules["__main__"]
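
Hooking it in would then just be a matter of something like:

    import sys

    # Install early, before anything gets a chance to re-import __main__
    sys.meta_path.insert(0, MainImporter())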

> I think removing qname-guessing from PEP 395 (and replacing it with
> instructive/google-able error messages) would be an unqualified improvement,
> independent of what happens to PEPs 382 and 402.

Even if the "just do what I mean" part of the proposal in PEP 395 is
replaced by a "Did you mean?" error message, PEP 382 still offers the
superior user experience, since we could use runpy.split_path_module()
to state the *exact* argument to -m that should have been used. Of
course, that still wouldn't get sys.path[0] set correctly, so it isn't
a given that it would really help.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ericsnowcurrently at gmail.com  Wed Nov 16 23:41:34 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 16 Nov 2011 15:41:34 -0700
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CALeMXf7gmyFuTJefWsmg0hS2k74QPDpRHLr-pP_oYck0Ai07nQ@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
	<CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
	<CALFfu7Bf4G4bDBDO7OweUY4eHAP49_tXCn1-6B1TmGonQjcXkQ@mail.gmail.com>
	<CALeMXf7gmyFuTJefWsmg0hS2k74QPDpRHLr-pP_oYck0Ai07nQ@mail.gmail.com>
Message-ID: <CALFfu7A9XD4Yh_NJfwkOYyEnYkNSziksw_bMsnQeagEzg7er3w@mail.gmail.com>

On Wed, Nov 16, 2011 at 1:06 PM, PJ Eby <pje at telecommunity.com> wrote:
> On Wed, Nov 16, 2011 at 1:21 PM, Eric Snow <ericsnowcurrently at gmail.com>
> wrote:
>>
>> But which is more astonishing (POLA and all that): running your module
>> in Python, it behaves differently than when you import it (especially
>> __name__); or you add an __init__.py to a directory and your *scripts*
>> there start to behave differently?
>
> To me it seems that the latter is more astonishing because there's less
> connection between your action and the result.  If you're running something
> differently, it makes more sense that it acts differently, because you've
> changed what you're *doing*.  In the scripts case, you haven't changed how
> you run the scripts, and you haven't changed the scripts, so the change in
> behavior seems to appear out of nowhere.

Well, then I suppose both are astonishing and, for me at least, the
module-as-script side of it has bitten me more.  Regardless, both are a
consequence of the script vs. module situation.

>
>>
>> When I was learning Python, it took quite a while before I realized
>> that modules are imported and scripts are passed at the commandline;
>> and to understand the whole __main__ thing.
>
> It doesn't seem to me that PEP 395 fixes this problem.  In order to
> *actually* fix it, we'd need to have some sort of "package" statement like
> in other languages - then you'd declare right there in the code what package
> it's supposed to be part of.

Certainly an effective indicator that a file's a module and not a
script.  Still, I'd rather we find a way to maintain the
filesystem-based package approach we have now.  It's nice not having
to look in each file to figure out the package it belongs to or if
it's a script or not.

The consequence is that a package that's spread across multiple
directories is likewise addressed through the filesystem, hence PEPs
382 and 402.  However, the namespace package issue is a separate one
from script-vs-module.

>
>>
>>  It has always been a pain, particularly when I wanted to
>>  just check a module really quickly for errors.
>
> What, specifically, was a pain?  That information might be of more use in
> determining a solution.
>
> If you mean that you had other modules importing the module that was also
> __main__, then I agree that having a solution for __main__-aliasing is a
> good idea.

PEP 395 spells out several pretty well.  Additionally, running a
module as a script can cause trouble if your module otherwise relies
on the value of __name__.  Finally, sometimes I rely on a module
triggering an import hook, though that is likely a problem just for
me.

>  I just think it might be more cleanly fixed by checking whether
> the __file__ of a to-be-imported module is going to end up matching
> __main__.__file__, and if so, alias __main__ instead.

Currently the only promise regarding __file__ is that it will be set
on the module object once the module has been loaded but before the
implicit binding for the import statement.  So, unless I'm mistaken,
that would have to change to allow for import hooks.  Otherwise, sure.

>
>>
>> However, lately I've actually taken to the idea that it's better to
>> write a test script that imports the module and running that, rather
>> than running the module itself.  But that came with the understanding
>> that the module doesn't go through the import machinery when you *run*
>> it, which I don't think is obvious, particularly to beginners.  So
>> Nick's solution, to me, is an appropriate concession to the reality
>> that most folks will expect Python to treat their modules like modules
>> and their scripts like scripts.
>
> You lost me there: if most people don't understand the difference, then why
> are they expecting a difference?
>

Yeah, that wasn't clear.  :)

When someone learns Python, they probably are not going to recognize
the difference between running their module and importing it.  They'll
expect their module to work identically if run as a script or
imported.  They won't even think about the distinction.  Or maybe I'm
really out of touch (quite possible :).

It'll finally bite them when they implicitly or explicitly rely on the
module state set by the import machinery (__name__, __file__, etc.),
or on customization of that machinery (a la import hooks).

Educating developers on the distinction between scripts and modules is
good, but it seems like PEP 395 is trying to bring the behavior more
in line with the intuitive behavior, which sounds good to me.

Regarding the PEP 402 conflict, if using .pyp on directory names
addresses Nick's concern, would you be opposed to that solution?

-eric

p.s. where should I bring up general discussion on PEP 395?

From ncoghlan at gmail.com  Wed Nov 16 23:44:07 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 Nov 2011 08:44:07 +1000
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CALFfu7A9XD4Yh_NJfwkOYyEnYkNSziksw_bMsnQeagEzg7er3w@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
	<CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
	<CALFfu7Bf4G4bDBDO7OweUY4eHAP49_tXCn1-6B1TmGonQjcXkQ@mail.gmail.com>
	<CALeMXf7gmyFuTJefWsmg0hS2k74QPDpRHLr-pP_oYck0Ai07nQ@mail.gmail.com>
	<CALFfu7A9XD4Yh_NJfwkOYyEnYkNSziksw_bMsnQeagEzg7er3w@mail.gmail.com>
Message-ID: <CADiSq7fvbjA1nY3qYSSvbr81sF9tmTmEEXrGT5yWuCY2ksHTFw@mail.gmail.com>

On Thu, Nov 17, 2011 at 8:41 AM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> p.s. where should I bring up general discussion on PEP 395?

import-sig for now - it needs more thought before I take it back to python-dev.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From pje at telecommunity.com  Thu Nov 17 01:10:06 2011
From: pje at telecommunity.com (PJ Eby)
Date: Wed, 16 Nov 2011 19:10:06 -0500
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CADiSq7cx7UEQ_OiHZEUKuJioAkeGLpNoCiMPiJOQxB64CNVvBA@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
	<CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
	<CADiSq7cx7UEQ_OiHZEUKuJioAkeGLpNoCiMPiJOQxB64CNVvBA@mail.gmail.com>
Message-ID: <CALeMXf4tCa5uvJQM3x=s3D13L3vTAfdrvuWeVkok1Lq8Gaqu1A@mail.gmail.com>

On Wed, Nov 16, 2011 at 5:41 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> If the package directory markers are explicit (as they are now and as
> they are in PEP 382), then PEP 395 isn't guessing - the mapping from
> the filesystem layout to the Python module namespace is completely
> unambiguous, since the directory added as sys.path[0] will always be
> the first parent directory that isn't marked as a package directory:
>

Sorry, but that's *still guessing*.  Random extraneous __init__.py and
subdirectories on sys.path can screw you over.  For example, if I have a
stray __init__.py in site-packages, does that mean that every module there
is a submodule of a package called 'site-packages'?

Sure, you could fix that problem by ignoring names with a '-', but that's
just an illustration.  The __init__.py idea was a very good attempt at
solving the problem, but even in today's Python, it's still ambiguous and
we should refuse to guess.  (Because it will result in weird behavior
that's *much* harder to debug.)

Import aliasing detection and relative import errors, on the other hand,
don't rely on guessing.


Even if the "just do what I mean" part of the proposal in PEP 395 is
> replaced by a "Did you mean?" error message, PEP 382 still offers the
> superior user experience, since we could use runpy.split_path_module()
> to state the *exact* argument to -m that should have been used.


No, what you get is just a *guess* as to the correct directory.  (And you
can make similar guesses under PEP 402, if a parent directory of the script
is already on sys.path.)



> Of
> course, that still wouldn't get sys.path[0] set correctly, so it isn't
> a given that it would really help.
>

Right; and if you already *have* a correct sys.path, then you can make just
as good a guess under PEP 402.

Don't get me wrong - I'm all in favor of further confusion-reduction (which
is what PEP 402's about, after all).  I'm just concerned that PEP 395 isn't
really clear about the tradeoffs, in the same way that PEP 382 was unclear
back when I started doing all those proposed revisions leading up to PEP
402.

That is, like early PEP 382, ISTM that it's an initial implementation
attempt to solve a problem by patching over it, rather than an attempt to
think through "how things are" and "how they ought to be".  I think some of
that sort of thinking ought to be done, to see if perhaps there's a better
tradeoff to be had.

For one thing, I wonder about the whole scripts-as-modules thing.  In other
scripting languages AFAICT it's not very common to have a script as a
module; there's a pretty clear delineation between the two, because
Python's about the only language with the name==main paradigm.  In
languages that have some sort of "main" paradigm, it's usually a specially
named function or class method (Java) or whatever.

So, I'm wondering a bit about the detailed use cases people have about
using modules as scripts and vice versa.  Are they writing scripts, then
turning them into modules?  Trying to run somebody else's modules?  Copying
example code from somewhere?

(The part that confuses me is, if you *know* there's a difference between a
script and a module, then presumably you either know about __name__, OR you
wouldn't have any reason to run your module as a script.  Conversely, if
you don't know about __name__, then how would you conceive of making your
script into a module?  ISTM that in order to even have this problem you
have to at least be knowledgeable enough to realize there's *some*
difference between moduleness and scriptness.)

Anyway, understanding the *details* of this process (of how people end up
making the sort of errors PEP 395 aims to address) seems important to me
for pinning down precisely what problem to solve and how.

From ncoghlan at gmail.com  Thu Nov 17 02:47:46 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 Nov 2011 11:47:46 +1000
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CALeMXf4tCa5uvJQM3x=s3D13L3vTAfdrvuWeVkok1Lq8Gaqu1A@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
	<CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
	<CADiSq7cx7UEQ_OiHZEUKuJioAkeGLpNoCiMPiJOQxB64CNVvBA@mail.gmail.com>
	<CALeMXf4tCa5uvJQM3x=s3D13L3vTAfdrvuWeVkok1Lq8Gaqu1A@mail.gmail.com>
Message-ID: <CADiSq7cW_G6N+q_dpameC=i7gNJ3=YKoLxQW1uXnkAyK_0=O6Q@mail.gmail.com>

On Thu, Nov 17, 2011 at 10:10 AM, PJ Eby <pje at telecommunity.com> wrote:
> So, I'm wondering a bit about the detailed use cases people have about using
> modules as scripts and vice versa.  Are they writing scripts, then turning
> them into modules?  Trying to run somebody else's modules?  Copying example
> code from somewhere?
> (The part that confuses me is, if you *know* there's a difference between a
> script and a module, then presumably you either know about __name__, OR you
> wouldn't have any reason to run your module as a script.  Conversely, if you
> don't know about __name__, then how would you conceive of making your script
> into a module?  ISTM that in order to even have this problem you have to at
> least be knowledgeable enough to realize there's *some* difference between
> moduleness and scriptness.)
> Anyway, understanding the *details* of this process (of how people end up
> making the sort of errors PEP 395 aims to address) seems important to me for
> pinning down precisely what problem to solve and how.

The module->script process comes from wanting to expose useful command
line functionality from a Python module in a cross-platform way
without any additional packaging effort (as exposing system-native
scripts is a decidedly *non* trivial task, and also doesn't work from
a source checkout).

The genesis was actually the timeit module - "python -m timeit" is now
the easiest way to run short benchmarking snippets.

A variety of other standard library modules also offer useful "-m"
functionality - "-m site" will dump diagnostic info regarding your
path setup, "-m smtpd" will run up a local SMTP server, "-m unittest"
and "-m doctest" can be used to run tests, "-m pdb" can be used to
invoke the debugger, "-m pydoc" will run pydoc as usual. (A more
comprehensive list is below, but it's also worth caveating this list
with Raymond's comments on http://bugs.python.org/issue11260)

Third party wise, I've mostly seen "-m" support used for "scripts that
run scripts" - tools like pychecker, coverage and so forth are
naturally Python version specific, and running them via -m rather than
directly automatically deals with those scoping issues.

It's also fairly common for test definition modules to support
execution via "-m" (by invoking unittest.main() from an "if __name__"
guarded suite).
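
i.e. the usual shape of such a test module (a minimal illustrative example):

    import unittest

    class TestSomething(unittest.TestCase):
        def test_nothing(self):
            self.assertTrue(True)

    if __name__ == "__main__":
        unittest.main()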

Cheers,
Nick.

====================
Top level stdlib modules with meaningful "if __name__ == '__main__':" blocks:

base64.py - CLI for base64 encoding/decoding
calendar.py - CLI to display text calendars
cgi.py - displays some example CGI output
code.py - code-based interactive interpreter
compileall.py - CLI for bytecode file generation
cProfile.py - profile a script with cProfile
dis.py - CLI for file disassembly
doctest.py - CLI for doctest execution
filecmp.py - CLI to compare directory contents
fileinput.py - line numbered file display
formatter.py - reformats text and prints to stdout
ftplib.py - very basic CLI for FTP
gzip.py - basic CLI for creation of gzip files
imaplib.py - basic IMAP client (localhost only)
imghdr.py - scan a directory looking for valid image headers
mailcap.py - display system mailcap config info
mimetypes.py - CLI for querying mimetypes (but appears broken)
modulefinder.py - dump list of all modules referenced (directly or
indirectly) from a Python file
netrc.py - dump netrc config (I think)
nntplib.py - basic CLI for nntp
pdb.py - debug a script
pickle.py - dumps the content of a pickle file
pickletools.py - prettier dump of pickle file contents
platform.py - display platform info (e.g.
Linux-3.1.1-1.fc16.x86_64-x86_64-with-fedora-16-Verne)
profile.py - profile a script with profile
pstats.py - CLI to browse profile stats
pydoc.py - same as the installed pydoc script
quopri.py - CLI for quoted printable encoding/decoding
runpy.py - Essentially an indirect way to do what -m itself already does
shlex.py - runs the lexer over the specified file
site.py - dumps path config information
smtpd.py - local SMTP server
sndhdr.py - scan a directory looking for valid audio headers
sysconfig.py - dumps system configuration details
tabnanny.py - CLI to scan files
telnetlib.py - very basic telnet CLI
timeit.py - CLI to time snippets of code
tokenize.py - CLI to tokenize files
turtle.py - runs turtle demo (appears to be broken in trunk, though)
uu.py - CLI for UUencode encoding/decoding
webbrowser.py - CLI to launch a web browser
zipfile.py - basic CLI for zipfile creation and inspection

Not sure (no help text, no clear purpose without looking at the code):
aifc.py - dump info about AIFF files?
codecs.py
decimal.py
difflib.py
getopt.py - manual sanity check?
heapq.py
inspect.py
keyword.py - only valid in source checkout
macurl2path.py - manual sanity check?
poplib.py - simple POP3 client?
pprint.py
pyclbr.py - dump classes defined in files?
py_compile.py
random.py - manual sanity check?
smtplib.py
sre_constants.py - broken on Py3k!
symbol.py - only valid in source checkout, broken on Py3k
symtable.py - manual sanity check?
textwrap.py - manual sanity check?
token.py - only valid in source checkout

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Thu Nov 17 02:52:32 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 Nov 2011 11:52:32 +1000
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CALeMXf4tCa5uvJQM3x=s3D13L3vTAfdrvuWeVkok1Lq8Gaqu1A@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
	<CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
	<CADiSq7cx7UEQ_OiHZEUKuJioAkeGLpNoCiMPiJOQxB64CNVvBA@mail.gmail.com>
	<CALeMXf4tCa5uvJQM3x=s3D13L3vTAfdrvuWeVkok1Lq8Gaqu1A@mail.gmail.com>
Message-ID: <CADiSq7dLK7xjFOJubGgvJaXGxxY-5z8Cobchf7jNqNKrSo1zfA@mail.gmail.com>

On Thu, Nov 17, 2011 at 10:10 AM, PJ Eby <pje at telecommunity.com> wrote:
> On Wed, Nov 16, 2011 at 5:41 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> If the package directory markers are explicit (as they are now and as
>> they are in PEP 382), then PEP 395 isn't guessing - the mapping from
>> the filesystem layout to the Python module namespace is completely
>> unambiguous, since the directory added as sys.path[0] will always be
>> the first parent directory that isn't marked as a package directory:
>
> Sorry, but that's *still guessing*.  Random extraneous __init__.py and
> subdirectories on sys.path can screw you over.  For example, if I have a
> stray __init__.py in site-packages, does that mean that every module there
> is a submodule of a package called 'site-packages'?

Yes (although in that case, you'd error out, since the package name
isn't valid).

Errors should never pass silently - ignoring such a screw-up in their
filesystem layout is letting an error pass silently and will most
likely cause obscure problems further down the road.

> Sure, you could fix that problem by ignoring names with a '-', but that's
> just an illustration.  The __init__.py idea was a very good attempt at
> solving the problem, but even in today's Python, it's still ambiguous and we
> should refuse to guess.  (Because it will result in weird behavior that's
> *much* harder to debug.)
> Import aliasing detection and relative import errors, on the other hand,
> don't rely on guessing.

Umm, if people screw up their filesystem layouts and *lie* to the
interpreter about whether or not something is a package, how is that
our fault? "Oh, they told me something, but they might not mean it, so
I'll choose to ignore the information they've given me" is the part
that sounds like guessing to me.

If we error *immediately*, telling them what's wrong with their
filesystem, that's the *opposite* of guessing.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From pje at telecommunity.com  Thu Nov 17 03:00:20 2011
From: pje at telecommunity.com (PJ Eby)
Date: Wed, 16 Nov 2011 21:00:20 -0500
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CADiSq7cW_G6N+q_dpameC=i7gNJ3=YKoLxQW1uXnkAyK_0=O6Q@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
	<CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
	<CADiSq7cx7UEQ_OiHZEUKuJioAkeGLpNoCiMPiJOQxB64CNVvBA@mail.gmail.com>
	<CALeMXf4tCa5uvJQM3x=s3D13L3vTAfdrvuWeVkok1Lq8Gaqu1A@mail.gmail.com>
	<CADiSq7cW_G6N+q_dpameC=i7gNJ3=YKoLxQW1uXnkAyK_0=O6Q@mail.gmail.com>
Message-ID: <CALeMXf7A9m_Bd3cksyrPyaaXiHPdwcekNRqwpgmR0_Jp76R7EQ@mail.gmail.com>

On Wed, Nov 16, 2011 at 8:47 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On Thu, Nov 17, 2011 at 10:10 AM, PJ Eby <pje at telecommunity.com> wrote:
> > So, I'm wondering a bit about the detailed use cases people have about
> > using modules as scripts and vice versa.  Are they writing scripts, then
> > turning them into modules?  Trying to run somebody else's modules?
> > Copying example code from somewhere?
> > (The part that confuses me is, if you *know* there's a difference between
> > a script and a module, then presumably you either know about __name__, OR
> > you wouldn't have any reason to run your module as a script.  Conversely,
> > if you don't know about __name__, then how would you conceive of making
> > your script into a module?  ISTM that in order to even have this problem
> > you have to at least be knowledgeable enough to realize there's *some*
> > difference between moduleness and scriptness.)
> > Anyway, understanding the *details* of this process (of how people end up
> > making the sort of errors PEP 395 aims to address) seems important to me
> > for pinning down precisely what problem to solve and how.
>
> The module->script process comes from wanting to expose useful command
> line functionality from a Python module in a cross-platform way
> without any additional packaging effort (as exposing system-native
> scripts is a decidedly *non* trivial task, and also doesn't work from
> a source checkout).
>

No, I mean how do the people who PEP 395 is supposed to be helping, find
out that they even want to run a script as a module?

Or are you saying that the central use case the PEP is aimed at is running
stdlib modules?  ;-)



> It's also fairly common for test definition modules to support
> execution via "-m" (by invoking unittest.main() from an "if __name__"
> guarded suite).
>

Right...  so are these modules not *documented* as being run by -m?  Are
people running them as scripts by mistake?

I'm still not seeing how people end up making their own scripts into
modules or vice versa, *without* some explicit documentation about the
process.  I mean, how do you even know that a file can be both, without
realizing that there's a difference between the two?

The most common confusion I've seen among newbies is the ones who don't
grok that module != file.  That is, they don't understand why you replace
directory separators with '.' (which is how they think of it) or they want
to use exec/runfile instead of import, or they expect import to run the
code, or similar confusions of "file" and "module".

However, I don't grok how people with *that* confusion would end up writing
code that has a problem when run as a combination script/module, because
they already think scripts and modules are the same thing and are rather
unlikely to create a package in the first place.

So who *is* PEP 395's target audience, and what is their mental model?
That's the question I'd like to come to grips with before proposing a full
solution.

From ncoghlan at gmail.com  Thu Nov 17 04:50:34 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 Nov 2011 13:50:34 +1000
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CALeMXf7A9m_Bd3cksyrPyaaXiHPdwcekNRqwpgmR0_Jp76R7EQ@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
	<CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
	<CADiSq7cx7UEQ_OiHZEUKuJioAkeGLpNoCiMPiJOQxB64CNVvBA@mail.gmail.com>
	<CALeMXf4tCa5uvJQM3x=s3D13L3vTAfdrvuWeVkok1Lq8Gaqu1A@mail.gmail.com>
	<CADiSq7cW_G6N+q_dpameC=i7gNJ3=YKoLxQW1uXnkAyK_0=O6Q@mail.gmail.com>
	<CALeMXf7A9m_Bd3cksyrPyaaXiHPdwcekNRqwpgmR0_Jp76R7EQ@mail.gmail.com>
Message-ID: <CADiSq7f2SDvcU3OfVphe8VD-NQOCVM+8qE5J8=50q4V+JKzpCw@mail.gmail.com>

On Thu, Nov 17, 2011 at 12:00 PM, PJ Eby <pje at telecommunity.com> wrote:
> So who *is* PEP 395's target audience, and what is their mental model?
> That's the question I'd like to come to grips with before proposing a full
> solution.

OK, I realised that the problem I want to solve with this part of the
PEP isn't limited to direct execution of scripts - it's a general
problem with figuring out an appropriate value for sys.path[0] that
also affects the interactive interpreter and the -m switch.

The "mission statement" for this part of PEP 395 is then clearly
stated as: the Python interpreter should *never* automatically place a
Python package directory on sys.path.

Adding package directories to sys.path creates undesirable aliasing
that may lead to multiple imports of the same module under different
names, unexpected shadowing of standard library (and other) modules
and packages, and frequently confusing errors where a module works
when imported but not when executed directly and vice-versa. Letting
the import system get into that state without even a warning is
letting an error pass silently and we shouldn't do it.

However, it's also true that, in many cases, this slight error in the
import state is actually harmless, so *always* failing in this
situation would be an unacceptable breach of backwards compatibility.
While we could issue a warning and demand that the user fix it
themselves (by invoking Python differently), there's no succinct way
to explain what has gone wrong - it depends on a fairly detailed
understanding of how the import system gets initialised. And, as noted,
there isn't currently an easy mechanism for users to fix it
themselves in the general case - using the -m switch also means you
have to get the current working directory right, losing out on one of
the main benefits of direct execution. And such a warning is assuredly
useless if you actually ran the script by double-clicking it in a file
browser...

Accordingly, PEP 395 proposes that, when such a situation is
encountered, Python should just use the nearest containing
*non*-package directory as sys.path[0] rather than naively blundering
ahead and corrupting the import system state, regardless of how the
proposed value for sys.path[0] was determined (i.e. the current
working directory or the location of a specific Python file). Any
module that currently worked correctly in this situation should
continue to work, and many others that previously failed (because they
were inside packages) will start to work. The only new failures will
be early detection of invalid filesystem layouts, such as
"__init__.py" files in directories that are not valid Python package
names, and scripts stored inside package directories that *only* work
as scripts (effectively relying on the implicit relative imports that
occur due to __name__ being set to "__main__").

This problem most often arises during development (*not* after
deployment), when developers either start Python to perform some
experiments, or place quick tests or sanity checks in "if __name__ ==
'__main__':" blocks at the end of their modules (this is a common
practice, described in a number of Python tutorials, and our own docs
also recommend it for test modules:
http://docs.python.org/library/unittest#basic-example).

The classic example from Stack Overflow looked like this:

    project/
        package/
            __init__.py
            foo.py
            tests/
                __init__.py
                test_foo.py

Currently, the *only* correct way to invoke test_foo is with "project"
as the current working directory and the command "python -m
package.tests.test_foo". Anything else (such as "python
package/tests/test_foo.py", ./package/tests/test_foo.py", clicking the
file in a file browser or, while in the tests directory, invoking
"python test_foo.py", "./test_foo.py" or "python -m test_foo") will
still *try* to run test_foo, but fail in a completely confusing
manner.

If test_foo uses absolute imports, then the error will generally be
"ImportError: No module named package", if it uses explicit relative
imports, then the error will be "ValueError: Attempted relative import
in non-package". Neither of these is going to make any sense to a
novice Python developer, but there isn't any obvious way to make those
messages self-explanatory (they're completely accurate, they just
involve a lot of assumed knowledge regarding how the import system
works and sys.path gets initialised).

If foo.py is set up to invoke its own test suite:

    if __name__ == "__main__":
        import unittest
        from .tests import test_foo
        unittest.main(test_foo.__name__)

Then you can get similarly confusing errors when attempting to run foo itself.

However, those errors are comparatively obvious compared to the
AttributeErrors (and ImportErrors) that can arise if you get
unexpected name shadowing. For example, suppose you have a helper
module called "package.json" for dealing with JSON serialisation in
your library, and you start an interactive session while in the
package directory, or attempt to invoke 'foo.py' directly in order
to run its test suite (as described above). Now "import json" is
giving you the version from your package, even though that version is
*supposed* to be safely hidden away inside your package namespace. By
silently allowing a package directory onto sys.path, we're doing our
users a grave disservice.
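
A quick way to see that shadowing in action (a diagnostic sketch, assuming
the layout above):

    import json
    # If this prints a path inside your own project, the stdlib json
    # module is being shadowed by package/json.py
    print(json.__file__)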

So my perspective is this: we're currently doing something by default
that's almost guaranteed to be the wrong thing to do. There's a
reasonably simple alternative that's almost always the *right* thing
to do. So let's switch the default behaviour to get the common case
right, and leave the confusing errors for the situations where
something is actually *broken* (i.e. misplaced __init__.py files and
scripts in package directories that are relying on implicit relative
imports).

And if that means requiring that package directories always be marked
explicitly (either by an __init__.py file or by a ".pyp" extension)
and forever abandoning the concepts in PEP 402, so be it.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From pje at telecommunity.com  Thu Nov 17 06:48:52 2011
From: pje at telecommunity.com (PJ Eby)
Date: Thu, 17 Nov 2011 00:48:52 -0500
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CADiSq7dLK7xjFOJubGgvJaXGxxY-5z8Cobchf7jNqNKrSo1zfA@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
	<CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
	<CADiSq7cx7UEQ_OiHZEUKuJioAkeGLpNoCiMPiJOQxB64CNVvBA@mail.gmail.com>
	<CALeMXf4tCa5uvJQM3x=s3D13L3vTAfdrvuWeVkok1Lq8Gaqu1A@mail.gmail.com>
	<CADiSq7dLK7xjFOJubGgvJaXGxxY-5z8Cobchf7jNqNKrSo1zfA@mail.gmail.com>
Message-ID: <CALeMXf6vup1BYk5XK8nPJ7LnM8n0Mu+X909JpnsxxLfxgeBpAw@mail.gmail.com>

On Wed, Nov 16, 2011 at 8:52 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> Umm, if people screw up their filesystem layouts and *lie* to the
> interpreter about whether or not something is a package, how is that
> our fault? "Oh, they told me something, but they might not mean it, so
> I'll choose to ignore the information they've given me" is the part
> that sounds like guessing to me.
>

Er, what?

They're not lying, they just made a mistake -- a mistake that could've
occurred at any point during a project's development, which would then only
surface later.

As I said, I've seen projects where people had unnecessary __init__.py
files floating around -- mainly because at some point they were trying
anything and everything to get package imports to work correctly, and
somewhere along the line decided to just put __init__.py files everywhere
just to be "sure" that things would work.  (i.e. the sort of behavior
PEP 402 is supposed to make unnecessary.)


If we error *immediately*, telling them what's wrong with their
> filesystem, that's the *opposite* of guessing.
>

I'm all in favor of warning or erroring out on aliasing __main__ or
relative imports from __main__.  It's silently *succeeding* in doing
something that might not have been intended on the basis of coincidental
__init__.py placement that I have an issue with.

There exist projects that *intentionally* alias their modules as both a
package and non-package (*cough* PIL *cough*), to name just *one* kind of
*intentionally* weird sys.path setup, not counting unintentional ones like
I mentioned.  The simple fact is that you cannot unambiguously determine
the intended meaning of a given script, and you certainly can't do it
*before* the script executes (because it might already be doing some
sys.path munging of its own).

Saying that people who made one kind of mistake or intentional change are
lying, while a different set of people making mistakes deserve to have
their mistake silently corrected doesn't seem to make much sense to me.
 But even if I granted that people with extra __init__.py's floating around
should be punished for this (and I don't), this *still* wouldn't magically
remove the existing ambiguity-of-intention in today's Python projects.
 Without some way for people to explicitly declare their intention (e.g.
explicitly setting __qname__), you really have no way to definitely
establish what the user's *intention* is.  (Especially since the user who
wrote the code and the user trying to use it might be different people....
 and sys.path might've been set up by yet another party.)

IOW, it's ambiguous already, today, with or without 382, 402, or any other
new PEP.  (Heck, it was ambiguous before PEP 302 came around!)

From ncoghlan at gmail.com  Thu Nov 17 08:00:00 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 Nov 2011 17:00:00 +1000
Subject: [Import-SIG] PEP 395 (Module aliasing) and the namespace PEPs
In-Reply-To: <CALeMXf6vup1BYk5XK8nPJ7LnM8n0Mu+X909JpnsxxLfxgeBpAw@mail.gmail.com>
References: <CADiSq7f+eiPv58_9wg41-Fx7GU2x1gvesOrZnbZLPqj1dFgc0A@mail.gmail.com>
	<CALeMXf56ypVUEy0FFj1WsXXfDhq0Y+0sp0gCP-_QQ3WNvh3xqw@mail.gmail.com>
	<CADiSq7cx7UEQ_OiHZEUKuJioAkeGLpNoCiMPiJOQxB64CNVvBA@mail.gmail.com>
	<CALeMXf4tCa5uvJQM3x=s3D13L3vTAfdrvuWeVkok1Lq8Gaqu1A@mail.gmail.com>
	<CADiSq7dLK7xjFOJubGgvJaXGxxY-5z8Cobchf7jNqNKrSo1zfA@mail.gmail.com>
	<CALeMXf6vup1BYk5XK8nPJ7LnM8n0Mu+X909JpnsxxLfxgeBpAw@mail.gmail.com>
Message-ID: <CADiSq7e=AFSmz_SJSJAJzEWF9PQQXKerj6WnTMU8Z_nUz+-Eew@mail.gmail.com>

On Thu, Nov 17, 2011 at 3:48 PM, PJ Eby <pje at telecommunity.com> wrote:
> I'm all in favor of warning or erroring out on aliasing __main__ or relative
> imports from __main__.  It's silently *succeeding* in doing something that
> might not have been intended on the basis of coincidental __init__.py
> placement that I have an issue with.

This is the part I don't get - you say potentially unintentional
success is bad, but you're ok with silently succeeding by *ignoring*
the presence of an __init__.py file and hence performing implicit
relative imports, exactly the behaviour that PEP 328 set out to
eliminate.

Currently, by default, a *correct* package layout breaks under direct
execution. I am proposing that we make it work by preventing implicit
relative imports from __main__, just as we do from any other module.

As a consequence, scripts that already support direct execution from
inside a package would need to be updated to use explicit relative
imports in Python 3.3+, since their implicit relative imports will
break, just as they already do when you attempt to import such a
module. I'm happy to fix things for novices and put the burden of a
workaround on the people that know what they're doing.

The workaround:

    import sys

    if __name__ == "__main__" and sys.version_info < (3, 3):
        import peer # Implicit relative import
    else:
        from . import peer # explicit relative import

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sat Nov 19 13:59:24 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 19 Nov 2011 22:59:24 +1000
Subject: [Import-SIG] Updated PEP 395 ("Qualified Names for Modules" aka
 "Implicit Relative Imports Must Die!")
Message-ID: <CADiSq7f0j+Da+P-YdeG9qf5r2BPHKrBG8UFprTnQkp84VWc3_g@mail.gmail.com>

The updated version is included below and has also been updated on
python.org if you prefer a nicely formatted version:
http://www.python.org/dev/peps/pep-0395/

The recent discussion regarding imports from main really crystallised
for me what I think is currently wrong with imports from main modules
- I was cheering when the Django folks updated their default site
template to avoid putting a package directory on sys.path (due to all
the problems it causes), but that thread made me realise how easy we
make it for beginners to do that by accident, with no real payoff of
any kind to justify it.

So the PEP now spends a lot of time talking about the fact that our
current system for initialising sys.path[0] is almost always just
plain wrong as soon as packages are involved, but the explicit markers
on package directories make it possible for us to do the right thing
instead of being dumb about it.

Cheers,
Nick.

-----------------------------------------------
PEP: 395
Title: Qualified Names for Modules
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-Mar-2011
Python-Version: 3.3
Post-History: 5-Mar-2011, 19-Nov-2011


Abstract
========

This PEP proposes new mechanisms that eliminate some longstanding traps for
the unwary when dealing with Python's import system, as well as serialisation
and introspection of functions and classes.

It builds on the "Qualified Name" concept defined in PEP 3155.


Relationship with Other PEPs
----------------------------

This PEP builds on the "qualified name" concept introduced by PEP 3155, and
also shares in that PEP's aim of fixing some ugly corner cases when dealing
with serialisation of arbitrary functions and classes.

It also builds on PEP 366, which took initial tentative steps towards making
explicit relative imports from the main module work correctly in at least
*some* circumstances.

This PEP is also affected by the two competing "namespace package" PEPs
(PEP 382 and PEP 402). This PEP would require some minor adjustments to
accommodate PEP 382, but has some critical incompatibilities with respect to
the implicit namespace package mechanism proposed in PEP 402.

Finally, PEP 328 eliminated implicit relative imports from imported modules.
This PEP proposes that implicit relative imports from main modules also be
eliminated.


What's in a ``__name__``?
=========================

Over time, a module's ``__name__`` attribute has come to be used to handle a
number of different tasks.

The key use cases identified for this module attribute are:

1. Flagging the main module in a program, using the ``if __name__ ==
   "__main__":`` convention.
2. As the starting point for relative imports
3. To identify the location of function and class definitions within the
   running application
4. To identify the location of classes for serialisation into pickle objects
   which may be shared with other interpreter instances


Traps for the Unwary
====================

The overloading of the semantics of ``__name__`` has resulted in several
traps for the unwary. These traps can be quite annoying in practice, as
they are highly unobvious and can cause quite confusing behaviour. A lot of
the time, you won't even notice them, which just makes them all the more
surprising when they do come up.


Why are my imports broken?
--------------------------

There's a general principle that applies when modifying ``sys.path``: *never*
put a package directory directly on ``sys.path``. The reason this is
problematic is that every module in that directory is now potentially
accessible under two different names: as a top level module (since the
package directory is on ``sys.path``) and as a submodule of the package (if
the higher level directory containing the package itself is also on
``sys.path``).

As an example, Django (up to and including version 1.3) is guilty of setting
up exactly this situation for site-specific applications - the application
ends up being accessible as both ``app`` and ``site.app`` in the module
namespace, and these are actually two *different* copies of the module. This
is a recipe for confusion if there is any meaningful mutable module level
state, so this behaviour is being eliminated from the default site set up in
version 1.4 (site-specific apps will always be fully qualified with the site
name).

However, it's hard to blame Django for this, when the same part of Python
responsible for setting ``__name__ = "__main__"`` in the main module commits
the exact same error when determining the value for ``sys.path[0]``.

The impact of this can be seen relatively frequently if you follow the
"python" and "import" tags on Stack Overflow. When I had the time to follow
it myself, I regularly encountered people struggling to understand the
behaviour of straightforward package layouts like the following::

    project/
        setup.py
        package/
            __init__.py
            foo.py
            tests/
                __init__.py
                test_foo.py

I would actually often see it without the ``__init__.py`` files first, but
that's a trivial fix to explain. What's hard to explain is that all of the
following ways to invoke ``test_foo.py`` *probably won't work* due to broken
imports (either failing to find ``package`` for absolute imports, complaining
about relative imports in a non-package for explicit relative imports, or
issuing even more obscure errors if some other submodule happens to shadow
the name of a top-level module, such as a ``package.json`` module that
handled serialisation or a ``package.tests.unittest`` test runner)::

    # working directory: project/package/tests
    ./test_foo.py
    python test_foo.py
    python -m test_foo
    python -c "from test_foo import main; main()"

    # working directory: project/package
    tests/test_foo.py
    python tests/test_foo.py
    python -m tests.test_foo
    python -c "from tests.test_foo import main; main()"

    # working directory: project
    package/tests/test_foo.py
    python package/tests/test_foo.py

    # working directory: project/..
    project/package/tests/test_foo.py
    python project/package/tests/test_foo.py
    # The -m and -c approaches don't work from here either, but the failure
    # to find 'package' correctly is pretty easy to explain in this case

That's right, that long list is of all the methods of invocation that will
almost certainly *break* if you try them, and the error messages won't make
any sense if you're not already intimately familiar not only with the way
Python's import system works, but also with how it gets initialised.

For a long time, the only way to get ``sys.path`` right with that kind of
setup was to either set it manually in ``test_foo.py`` itself (hardly
something a novice, or even many veteran, Python programmers are going to
know how to do) or else to make sure to import the module instead of
executing it directly::

    # working directory: project
    python -c "from package.tests.test_foo import main; main()"

Since the implementation of PEP 366 (which defined a mechanism that allows
relative imports to work correctly when a module inside a package is executed
via the ``-m`` switch), the following also works properly::

    # working directory: project
    python -m package.tests.test_foo

The fact that most methods of invoking Python code from the command line
break when that code is inside a package, and that the two that do work are
highly sensitive to the current working directory, is thoroughly confusing
for a beginner. I personally believe it is one of the key factors leading
to the perception that Python packages are complicated and hard to get right.

This problem isn't even limited to the command line - if ``test_foo.py`` is
open in Idle and you attempt to run it by pressing F5, then it will fail in
just the same way it would if run directly from the command line.

There's a reason the general ``sys.path`` guideline mentioned above exists,
and the fact that the interpreter itself doesn't follow it when determining
``sys.path[0]`` is the root cause of all sorts of grief.


Importing the main module twice
-------------------------------

Another venerable trap is the issue of importing ``__main__`` twice. This
occurs when the main module is also imported under its real name, effectively
creating two instances of the same module under different names.

If the state stored in ``__main__`` is significant to the correct operation
of the program, or if there is top-level code in the main module that has
non-idempotent side effects, then this duplication can cause obscure and
surprising errors.


In a bit of a pickle
--------------------

Something many users may not realise is that the ``pickle`` module sometimes
relies on the ``__module__`` attribute when serialising instances of arbitrary
classes. So instances of classes defined in ``__main__`` are pickled that way,
and won't be unpickled correctly by another Python instance that only imported
that module instead of running it directly. This behaviour is the underlying
reason for the advice from many Python veterans to do as little as possible
in the ``__main__`` module in any application that involves any form of
object serialisation and persistence.
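
As a quick illustration (not part of this PEP's proposal), running the
following as a script shows the ``__main__`` name being baked into the
serialised data::

    import pickle, pickletools

    class Example:
        pass

    # The emitted opcodes reference "__main__ Example", so only a process
    # whose __main__ module also defines Example can unpickle the result.
    pickletools.dis(pickle.dumps(Example()))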

Similarly, when creating a pseudo-module (see next paragraph), pickles rely
on the name of the module where a class is actually defined, rather than the
officially documented location for that class in the module hierarchy.

For the purposes of this PEP, a "pseudo-module" is a package designed like
the Python 3.2 ``unittest`` and ``concurrent.futures`` packages. These
packages are documented as if they were single modules, but are in fact
internally implemented as a package. This is *supposed* to be an
implementation detail that users and other implementations don't need to
worry about, but, thanks to ``pickle`` (and serialisation in general),
the details are often exposed and can effectively become part of the public
API.

While this PEP focuses specifically on ``pickle`` as the principal
serialisation scheme in the standard library, this issue may also affect
other mechanisms that support serialisation of arbitrary class instances
and rely on ``__module__`` attributes to determine how to handle
deserialisation.


Where's the source?
-------------------

Some sophisticated users of the pseudo-module technique described
above recognise the problem with implementation details leaking out via the
``pickle`` module, and choose to address it by altering ``__name__`` to refer
to the public location for the module before defining any functions or classes
(or else by modifying the ``__module__`` attributes of those objects after
they have been defined).

This approach is effective at eliminating the leakage of information via
pickling, but comes at the cost of breaking introspection for functions and
classes (as their ``__module__`` attribute now points to the wrong place).


Forkless Windows
----------------

To get around the lack of ``os.fork`` on Windows, the ``multiprocessing``
module attempts to re-execute Python with the same main module, but skipping
over any code guarded by ``if __name__ == "__main__":`` checks. It does the
best it can with the information it has, but is forced to make assumptions
that simply aren't valid whenever the main module isn't an ordinary directly
executed script or top-level module. Packages and non-top-level modules
executed via the ``-m`` switch, as well as directly executed zipfiles or
directories, are likely to make multiprocessing on Windows do the wrong thing
(either quietly or noisily, depending on application details) when spawning a
new process.

While this issue currently only affects Windows directly, it also impacts
any proposals to provide Windows-style "clean process" invocation via the
multiprocessing module on other platforms.


Qualified Names for Modules
===========================

To make it feasible to fix these problems once and for all, it is proposed
to add a new module level attribute: ``__qualname__``. This abbreviation of
"qualified name" is taken from PEP 3155, where it is used to store the naming
path to a nested class or function definition relative to the top level
module.

For modules, ``__qualname__`` will normally be the same as ``__name__``, just
as it is for top-level functions and classes in PEP 3155. However, it will
differ in some situations so that the above problems can be addressed.

Specifically, whenever ``__name__`` is modified for some other purpose (such
as to denote the main module), then ``__qualname__`` will remain unchanged,
allowing code that needs it to access the original unmodified value.

If a module loader does not initialise ``__qualname__`` itself, then the
import system will add it automatically (setting it to the same value as
``__name__``).
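
To illustrate the intended relationship between the two attributes (a sketch
of the proposed behaviour, not of any existing implementation)::

    # Proposed behaviour for "python -m package.tests.test_foo":
    #   __main__.__name__     == "__main__"
    #   __main__.__qualname__ == "package.tests.test_foo"
    # For an ordinary import, the two values match:
    #   package.foo.__name__     == "package.foo"
    #   package.foo.__qualname__ == "package.foo"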


Eliminating the Traps
=====================

The following changes are interrelated and make the most sense when
considered together. They collectively either completely eliminate the traps
for the unwary noted above, or else provide straightforward mechanisms for
dealing with them.

A rough draft of some of the concepts presented here was first posted on the
python-ideas list [1]_, but they have evolved considerably since first being
discussed in that thread. Further discussion has subsequently taken place on
the import-sig mailing list [2]_.


Fixing main module imports inside packages
------------------------------------------

To eliminate this trap, it is proposed that an additional filesystem check be
performed when determining a suitable value for ``sys.path[0]``. This check
will look for Python's explicit package directory markers and use them to find
the appropriate directory to add to ``sys.path``.

The current algorithm for setting ``sys.path[0]`` in relevant cases is roughly
as follows::

    # Interactive prompt, -m switch, -c switch
    sys.path.insert(0, '')

::

    # Valid sys.path entry execution (i.e. directory and zip execution)
    sys.path.insert(0, sys.argv[0])

::

    # Direct script execution
    sys.path.insert(0, os.path.dirname(sys.argv[0]))

It is proposed that this initialisation process be modified to take
package details stored on the filesystem into account::

    # Interactive prompt, -c switch
    in_package, path_entry, modname = split_path_module(os.getcwd(), '')
    if in_package:
        sys.path.insert(0, path_entry)
    else:
        sys.path.insert(0, '')
    # Start interactive prompt or run -c command as usual
    # __main__.__qualname__ is set to "__main__"

::

    # -m switch
    modname = <<argument to -m switch>>
    in_package, path_entry, modname = split_path_module(os.getcwd(), modname)
    if in_package:
        sys.path.insert(0, path_entry)
    else:
        sys.path.insert(0, '')
    # modname (possibly adjusted) is passed to ``runpy._run_module_as_main()``
    # __main__.__qualname__ is set to modname

::

    # Valid sys.path entry execution (i.e. directory and zip execution)
    modname = "__main__"
    in_package, path_entry, modname = split_path_module(sys.argv[0], modname)
    sys.path.insert(0, path_entry)
    # modname (possibly adjusted) is passed to ``runpy._run_module_as_main()``
    # __main__.__qualname__ is set to modname

::

    # Direct script execution
    in_package, path_entry, modname = split_path_module(sys.argv[0])
    sys.path.insert(0, path_entry)
    if in_package:
        # Pass modname to ``runpy._run_module_as_main()``
    else:
        # Run script directly
    # __main__.__qualname__ is set to modname

The ``split_path_module()`` supporting function used in the above pseudo-code
would have the following semantics::

    def _splitmodname(fspath):
        path_entry, fname = os.path.split(fspath)
        modname = os.path.splitext(fname)[0]
        return path_entry, modname

    def _is_package_dir(fspath):
        return any(os.path.exists(os.path.join(fspath, "__init__" + info[0]))
                       for info in imp.get_suffixes())

    def split_path_module(fspath, modname=None):
        """Given a filesystem path and a relative module name, determine an
           appropriate sys.path entry and a fully qualified module name.

           Returns a 3-tuple of (package_depth, fspath, modname). A reported
           package depth of 0 indicates that this would be a top level import.

           If no relative module name is given, it is derived from the final
           component in the supplied path with the extension stripped.
        """
        if modname is None:
            fspath, modname = _splitmodname(fspath)
        package_depth = 0
        while _is_package_dir(fspath):
            fspath, pkg = _splitmodname(fspath)
            modname = pkg + '.' + modname
            package_depth += 1
        return package_depth, fspath, modname

This PEP also proposes that the ``split_path_module()`` functionality be
exposed directly to Python users via the ``runpy`` module.
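
For illustration only, assuming the ``project`` layout shown earlier (with
``__init__.py`` markers in ``package`` and ``package/tests``), the proposed
function would report something like::

    >>> split_path_module("project/package/tests/test_foo.py")
    (2, 'project', 'package.tests.test_foo')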


Compatibility with PEP 382
~~~~~~~~~~~~~~~~~~~~~~~~~~

Making this proposal compatible with the PEP 382 namespace packaging PEP is
trivial. The semantics of ``_is_package_dir()`` are merely changed to be::

    def _is_package_dir(fspath):
        return (fspath.endswith(".pyp") or
                any(os.path.exists(os.path.join(fspath, "__init__" + info[0]))
                        for info in imp.get_suffixes()))


Incompatibility with PEP 402
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

PEP 402 proposes the elimination of explicit markers in the file system for
Python packages. This fundamentally breaks the proposed concept of being able
to take a filesystem path and a Python module name and work out an unambiguous
mapping to the Python module namespace. Instead, the appropriate mapping
would depend on the current values in ``sys.path``, rendering it impossible
to ever fix the problems described above with the calculation of
``sys.path[0]`` when the interpreter is initialised.

While some aspects of this PEP could probably be salvaged if PEP 402 were
adopted, the core concept of making import semantics from main and other
modules more consistent would no longer be feasible.

This incompatibility is discussed in more detail in the relevant import-sig
thread [2]_.


Potential incompatibilities with scripts stored in packages
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The proposed change to ``sys.path[0]`` initialisation *may* break some
existing code. Specifically, it will break scripts stored in package
directories that rely on the implicit relative imports from ``__main__`` in
order to run correctly under Python 3.

While such scripts could be imported in Python 2 (due to implicit relative
imports) it is already the case that they cannot be imported in Python 3,
as implicit relative imports are no longer permitted when a module is
imported.

By disallowing implicit relative imports from the main module as well,
such modules won't even work as scripts with this PEP. Switching them
over to explicit relative imports will then get them working again as
both executable scripts *and* as importable modules.

To support earlier versions of Python, a script could be written to use
different forms of import based on the Python version::

    import sys

    if __name__ == "__main__" and sys.version_info < (3, 3):
        import peer # Implicit relative import
    else:
        from . import peer # explicit relative import


Fixing dual imports of the main module
--------------------------------------

Given the above proposal to get ``__qualname__`` consistently set correctly
in the main module, one simple change is proposed to eliminate the problem
of dual imports of the main module: the addition of a ``sys.meta_path`` hook
that detects attempts to import ``__main__`` under its real name and returns
the original main module instead::

    class AliasImporter:
        def __init__(self, module, alias):
            self.module = module
            self.alias = alias

        def __repr__(self):
            fmt = "{0.__class__.__name__}({0.module.__name__}, {0.alias})"
            return fmt.format(self)

        def find_module(self, fullname, path=None):
            if path is None and fullname == self.alias:
                return self
            return None

        def load_module(self, fullname):
            if fullname != self.alias:
                raise ImportError("{!r} cannot load {!r}".format(self, fullname))
            return self.module

This metapath hook would be added automatically during import system
initialisation based on the following logic::

    main = sys.modules["__main__"]
    if main.__name__ != main.__qualname__:
        sys.meta_path.append(AliasImporter(main, main.__qualname__))

This is probably the least important proposal in the PEP - it just
closes off the last mechanism that is likely to lead to module duplication
after the configuration of ``sys.path[0]`` at interpreter startup is
addressed.


Fixing pickling without breaking introspection
----------------------------------------------

To fix this problem, it is proposed to make use of the new module level
``__qualname__`` attributes to determine the real module location when
``__name__`` has been modified for any reason.

In the main module, ``__qualname__`` will automatically be set to the main
module's "real" name (as described above) by the interpreter.

Pseudo-modules that adjust ``__name__`` to point to the public namespace will
leave ``__qualname__`` untouched, so the implementation location remains readily
accessible for introspection.

If ``__name__`` is adjusted at the top of a module, then this will
automatically adjust the ``__module__`` attribute for all functions and
classes subsequently defined in that module.

Since multiple submodules may be set to use the same "public" namespace,
functions and classes will be given a new ``__qualmodule__`` attribute
that refers to the ``__qualname__`` of their module.

This isn't strictly necessary for functions (you could find out their
module's qualified name by looking in their globals dictionary), but it is
needed for classes, since they don't hold a reference to the globals of
their defining module. Once a new attribute is added to classes, it is
more convenient to keep the API consistent and add a new attribute to
functions as well.

These changes mean that adjusting ``__name__`` (and, either directly or
indirectly, the corresponding function and class ``__module__`` attributes)
becomes the officially sanctioned way to implement a namespace as a package,
while exposing the API as if it were still a single module.
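
A sketch of that sanctioned pattern (the module names are illustrative, and
``__qualname__``/``__qualmodule__`` are the attributes proposed by this PEP
rather than existing ones)::

    # unittest/case.py - a submodule of a pseudo-module (illustrative)
    __name__ = "unittest"   # public namespace; __qualname__ stays "unittest.case"

    class TestCase:
        ...
    # TestCase.__module__ is derived from __name__ and so reports "unittest",
    # while the proposed TestCase.__qualmodule__ would report "unittest.case".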

All serialisation code that currently uses ``__name__`` and ``__module__``
attributes will then avoid exposing implementation details by default.

To correctly handle serialisation of items from the main module, the class
and function definition logic will be updated to also use ``__qualname__``
for the ``__module__`` attribute in the case where ``__name__ == "__main__"``.

With ``__name__`` and ``__module__`` being officially blessed as being used
for the *public* names of things, the introspection tools in the standard
library will be updated to use ``__qualname__`` and ``__qualmodule__``
where appropriate. For example:

- ``pydoc`` will report both public and qualified names for modules
- ``inspect.getsource()`` (and similar tools) will use the qualified names
  that point to the implementation of the code
- additional ``pydoc`` and/or ``inspect`` APIs may be provided that report
  all modules with a given public ``__name__``.


Fixing multiprocessing on Windows
---------------------------------

With ``__qualname__`` now available to tell ``multiprocessing`` the real
name of the main module, it will be able to simply include it in the
serialised information passed to the child process, eliminating the
need for the current dubious introspection of the ``__file__`` attribute.

For older Python versions, ``multiprocessing`` could be improved by applying
the ``split_path_module()`` algorithm described above when attempting to
work out how to execute the main module based on its ``__file__`` attribute.


Explicit relative imports
=========================

This PEP proposes that ``__package__`` be unconditionally defined in the
main module as ``__qualname__.rpartition('.')[0]``. Aside from that, it
proposes that the behaviour of explicit relative imports be left alone.

In particular, if ``__package__`` is not set in a module when an explicit
relative import occurs, the automatically cached value will continue to be
derived from ``__name__`` rather than ``__qualname__``. This minimises any
backwards incompatibilities with existing code that deliberately manipulates
relative imports by adjusting ``__name__`` rather than setting ``__package__``
directly.


Reference Implementation
========================

None as yet.


References
==========

.. [1] Module aliases and/or "real names"
   (http://mail.python.org/pipermail/python-ideas/2011-January/008983.html)

.. [2] PEP 395 (Module aliasing) and the namespace PEPs
   (http://mail.python.org/pipermail/import-sig/2011-November/000382.html)



Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Thu Nov 24 00:05:53 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 24 Nov 2011 09:05:53 +1000
Subject: [Import-SIG] Updated PEP 395 ("Qualified Names for Modules" aka
 "Implicit Relative Imports Must Die!")
In-Reply-To: <CADiSq7f0j+Da+P-YdeG9qf5r2BPHKrBG8UFprTnQkp84VWc3_g@mail.gmail.com>
References: <CADiSq7f0j+Da+P-YdeG9qf5r2BPHKrBG8UFprTnQkp84VWc3_g@mail.gmail.com>
Message-ID: <CADiSq7dvPBmAMc8RTObebKPcXc7twyX-BcJx1DtkCNf1bw4Gnw@mail.gmail.com>

On Sat, Nov 19, 2011 at 10:59 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> The updated version is includes below and has also been updated on
> python.org if you prefer a nicely formatted version:
> http://www.python.org/dev/peps/pep-0395/
>
> The recent discussion regarding imports from main really crystallised
> for me what I think is currently wrong with imports from main modules
> - I was cheering when the Django folks updated their default site
> template to avoid putting a package directory on sys.path (due to all
> the problems it causes), but that thread made me realise how easy we
> make it for beginners to do that by accident, with no real payoff of
> any kind to justify it.
>
> So the PEP now spends a lot of time talking about the fact that our
> current system for initialising sys.path[0] is almost always just
> plain wrong as soon as packages are involved, but the explicit markers
> on package directories make it possible for us to do the right thing
> instead of being dumb about it.

*crickets*

No feedback at all on the prospect of changing the way we initialise
sys.path[0] to respect the package information available on the
filesystem?

Also, Éric Araujo raised an interesting point [1]: automatically
initialising sys.path[0] *at all* can be a problem in some
circumstances, especially when symlinks are involved. PEP 395 won't
really help with that (it may change some of the symptoms, but it
won't fix the general problem), but it does make me wonder if the
interpreter should have a flag to switch off sys.path[0]
initialisation (similar to the existing flags to disable site
processing, user site processing and processing of the PYTHONHOME and
PYTHONPATH environment variables).

[1] http://bugs.python.org/issue10318 (last couple of comments)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From pje at telecommunity.com  Thu Nov 24 02:18:44 2011
From: pje at telecommunity.com (PJ Eby)
Date: Wed, 23 Nov 2011 20:18:44 -0500
Subject: [Import-SIG] Updated PEP 395 ("Qualified Names for Modules" aka
 "Implicit Relative Imports Must Die!")
In-Reply-To: <CADiSq7dvPBmAMc8RTObebKPcXc7twyX-BcJx1DtkCNf1bw4Gnw@mail.gmail.com>
References: <CADiSq7f0j+Da+P-YdeG9qf5r2BPHKrBG8UFprTnQkp84VWc3_g@mail.gmail.com>
	<CADiSq7dvPBmAMc8RTObebKPcXc7twyX-BcJx1DtkCNf1bw4Gnw@mail.gmail.com>
Message-ID: <CALeMXf5uXcXYm92kdnrgphg8Z4u5ouEBZE-N2A+E=2EqaPF3CA@mail.gmail.com>

On Wed, Nov 23, 2011 at 6:05 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> No feedback at all on the prospect of changing the way we initialise
> sys.path[0] to respect the package information available on the
> filesystem?
>

I gave you feedback previously: I think guessing based on __init__ files
introduces new breakage potential at a distance for things that didn't
break before.  It'll also guess the wrong location when somebody bundles a
dependency inside their package, and you try to run a script from the
embedded package.

I've tried coming up with other ways to guess the right thing to do, but
fundamentally, they're all just guessing.

What if we instead had something like this:

    import sys
    sys.set_script_module('foo.bar', __name__)

And what it did was, if __name__ is '__main__', and sys.path[0] is pointing
to the parent directory of the script file, then it fixes sys.path[0] to
point to the right parent directory level.  (Sanity checking whether you
can then find the __main__ module using the given module name and the
resulting sys.path[0].)

Is it ugly?  Yes.  But it's *explicit*, and provides One Obvious Way to
make a script that's also a module and will work correctly even if it's
part of a package that's been embedded inside another package.

I think that this or some other form of explicit declaration is needed to
get around __init__ ambiguities that exist in the field today.

From ncoghlan at gmail.com  Thu Nov 24 05:24:52 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 24 Nov 2011 14:24:52 +1000
Subject: [Import-SIG] Updated PEP 395 ("Qualified Names for Modules" aka
 "Implicit Relative Imports Must Die!")
In-Reply-To: <CALeMXf5uXcXYm92kdnrgphg8Z4u5ouEBZE-N2A+E=2EqaPF3CA@mail.gmail.com>
References: <CADiSq7f0j+Da+P-YdeG9qf5r2BPHKrBG8UFprTnQkp84VWc3_g@mail.gmail.com>
	<CADiSq7dvPBmAMc8RTObebKPcXc7twyX-BcJx1DtkCNf1bw4Gnw@mail.gmail.com>
	<CALeMXf5uXcXYm92kdnrgphg8Z4u5ouEBZE-N2A+E=2EqaPF3CA@mail.gmail.com>
Message-ID: <CADiSq7dC-qMMQrVAQPA1WW5P-8JzckYUy5W+MOKqdHgD4QC3tg@mail.gmail.com>

On Thu, Nov 24, 2011 at 11:18 AM, PJ Eby <pje at telecommunity.com> wrote:
>
> On Wed, Nov 23, 2011 at 6:05 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> No feedback at all on the prospect of changing the way we initialise
>> sys.path[0] to respect the package information available on the
>> filesystem?
>
> I gave you feedback previously: I think guessing based on __init__ files
> introduces new breakage potential at a distance for things that didn't break
> before.  It'll also guess the wrong location when somebody bundles a
> dependency inside their package, and you try to run a script from the
> embedded package.

And you have yet to explain how that is in any way inferior to the
status quo where we are consistently doing something that we *know* is
wrong (i.e. putting a package directory directly on sys.path).

Assuming people have their package layouts correct is *not* guessing,
no matter how many times you try to claim it is. Calling that guessing
is like saying that module name shadowing on sys.path (or any form of
name shadowing) is guessing. It may not be what the user intended, but
that doesn't mean the interpreter is wrong to believe the information
the user is providing.

The status quo sucks - as soon as you put a Python file inside a
package, almost *every* method we offer to invoke it breaks. Direct
command line invocation breaks, double-clicking in a file browser
breaks, running from Idle breaks, even importing it or using the -m
switch only work if you're in the right working directory. All it
takes is one perfectly reasonable assumption (that the filesystem
package layout is correct), and we can *fix* all that just by being a
bit smarter about the way we figure out sys.path[0].

Hypothetical "oh, but this bizarre situation with a clearly broken
package layout that only worked by accident might now start breaking
when it worked before" scenarios are a lousy argument for not fixing
the behaviour of the interpreter for the vast majority of people that
are doing the right thing.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Thu Nov 24 05:37:30 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 24 Nov 2011 14:37:30 +1000
Subject: [Import-SIG] Updated PEP 395 ("Qualified Names for Modules" aka
 "Implicit Relative Imports Must Die!")
In-Reply-To: <CALeMXf5uXcXYm92kdnrgphg8Z4u5ouEBZE-N2A+E=2EqaPF3CA@mail.gmail.com>
References: <CADiSq7f0j+Da+P-YdeG9qf5r2BPHKrBG8UFprTnQkp84VWc3_g@mail.gmail.com>
	<CADiSq7dvPBmAMc8RTObebKPcXc7twyX-BcJx1DtkCNf1bw4Gnw@mail.gmail.com>
	<CALeMXf5uXcXYm92kdnrgphg8Z4u5ouEBZE-N2A+E=2EqaPF3CA@mail.gmail.com>
Message-ID: <CADiSq7f10tG_Rikh-ANi2QyBAavV1UFXvnfhVea04CCY+6tEmw@mail.gmail.com>

On Thu, Nov 24, 2011 at 11:18 AM, PJ Eby <pje at telecommunity.com> wrote:
> It'll also guess the wrong location when somebody bundles a
> dependency inside their package, and you try to run a script from the
> embedded package.

Oops, meant to reply to this part specifically.

There are two legitimate ways of bundling a dependency inside a
package: either by embedding it in your module namespace, or by shipping
a private directory (no __init__.py file) that you place on sys.path.

In both cases, PEP 395 will do the right thing.

In the first case, the parent directory of the embedding package ends
up in sys.path[0] and the embedded copy is accessible as
"package.embedded" (for example). The embedded copy *should* be using
explicit relative imports (if it isn't, it's not safe to embed a copy
as part of your module namespace in the first place). The explicit
relative imports will all refer to the embedded copy as they should,
and everything is fine.

In the second case, the private directory will get placed in
sys.path[0], the embedded copy is accessible at the top-level as
"embedded" and everything is, once again, fine.

You have yet to identify any case where a script will break for a
reason other than reliance on implicit relative imports inside a
package (which are *supposed* to be dead in 3.x, but linger in
__main__ solely due to the way we initialise sys.path[0]). If a script
is going to be legitimately shipped inside a package directory, it
*must* be importable as part of that package namespace, and any script
in Py3k that relies on implicit relative imports fails to qualify.
This is in contrast to 2.x, where the implicit relative import support
in all package submodules let you get away with that kind of approach.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From pje at telecommunity.com  Thu Nov 24 06:32:52 2011
From: pje at telecommunity.com (PJ Eby)
Date: Thu, 24 Nov 2011 00:32:52 -0500
Subject: [Import-SIG] Updated PEP 395 ("Qualified Names for Modules" aka
 "Implicit Relative Imports Must Die!")
In-Reply-To: <CADiSq7f10tG_Rikh-ANi2QyBAavV1UFXvnfhVea04CCY+6tEmw@mail.gmail.com>
References: <CADiSq7f0j+Da+P-YdeG9qf5r2BPHKrBG8UFprTnQkp84VWc3_g@mail.gmail.com>
	<CADiSq7dvPBmAMc8RTObebKPcXc7twyX-BcJx1DtkCNf1bw4Gnw@mail.gmail.com>
	<CALeMXf5uXcXYm92kdnrgphg8Z4u5ouEBZE-N2A+E=2EqaPF3CA@mail.gmail.com>
	<CADiSq7f10tG_Rikh-ANi2QyBAavV1UFXvnfhVea04CCY+6tEmw@mail.gmail.com>
Message-ID: <CALeMXf66HjD9FjG1qgfavxLLtgZSig3o4-aoArS_GYv3or3A1g@mail.gmail.com>

On Wed, Nov 23, 2011 at 11:37 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> You have yet to identify any case where a script will break for a
> reason other than reliance on implicit relative imports inside a
> package


You're right; I didn't think of this because I haven't moved past Python
2.5 for production coding as yet.  ;-)

I still think extraneous __init__.py files exist in the field, but
I'll admit that both of these things are infrequent cases.

However, if we're going on the basis of how many newbie errors can be
solved by Just Working, PEP 402 will help more newbies than PEP 395, since
you must first *have* a package in order for 395 to be meaningful.  ;-)

> (which are *supposed* to be dead in 3.x, but linger in
> __main__ solely due to the way we initialise sys.path[0]). If a script
> is going to be legitimately shipped inside a package directory, it
> *must* be importable as part of that package namespace, and any script
> in Py3k that relies on implicit relative imports fails to qualify.
>

Wait a minute...  What would happen if there were no implicit relative
imports allowed in __main__?

Or are you just saying that you get the *appearance* of implicit relative
importing, due to aliasing?

From ncoghlan at gmail.com  Thu Nov 24 09:12:43 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 24 Nov 2011 18:12:43 +1000
Subject: [Import-SIG] Updated PEP 395 ("Qualified Names for Modules" aka
 "Implicit Relative Imports Must Die!")
In-Reply-To: <CALeMXf66HjD9FjG1qgfavxLLtgZSig3o4-aoArS_GYv3or3A1g@mail.gmail.com>
References: <CADiSq7f0j+Da+P-YdeG9qf5r2BPHKrBG8UFprTnQkp84VWc3_g@mail.gmail.com>
	<CADiSq7dvPBmAMc8RTObebKPcXc7twyX-BcJx1DtkCNf1bw4Gnw@mail.gmail.com>
	<CALeMXf5uXcXYm92kdnrgphg8Z4u5ouEBZE-N2A+E=2EqaPF3CA@mail.gmail.com>
	<CADiSq7f10tG_Rikh-ANi2QyBAavV1UFXvnfhVea04CCY+6tEmw@mail.gmail.com>
	<CALeMXf66HjD9FjG1qgfavxLLtgZSig3o4-aoArS_GYv3or3A1g@mail.gmail.com>
Message-ID: <CADiSq7edyER9KWzoneusXiQNqcs_kw7uJ6E1K=EGqDSQOrAAHw@mail.gmail.com>

On Thu, Nov 24, 2011 at 3:32 PM, PJ Eby <pje at telecommunity.com> wrote:
> You're right; I didn't think of this because I haven't moved past Python 2.5 for production coding as yet.  ;-)

Yeah, there's absolutely no way we could have changed this in 2.x -
with implicit relative imports in packages still allowed, there's too
much code such a change in semantics could have broken.

In Py3k though, most of that code is already going to break one way or
another: if they don't change it, attempting to import it will fail
(since implicit relative imports are gone), while if they *do* switch
to explicit relative imports to make importing as a module work, then
they're probably going to break direct invocation (since __name__ and
__package__ will be wrong unless you use '-m' from the correct working
directory).

The idea behind PEP 395 is to make converting to explicit relative
imports the right thing to do, *without* breaking dual-role modules
for either use case.
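
To make the dual-role case concrete, here's a minimal sketch (package
and module names invented):

    # Layout: package/__init__.py, package/foo.py,
    #         package/tests/__init__.py, package/tests/test_foo.py

    # package/tests/test_foo.py
    from .. import foo            # explicit relative import of the code under test

    def run():
        assert foo.add(1, 1) == 2  # foo.add() is made up
        print("ok")

    if __name__ == "__main__":
        run()

Today that works via "python -m package.tests.test_foo" run from the
project directory, but breaks under direct execution because
sys.path[0] ends up pointing *inside* the package; the PEP is about
making both invocations behave the same way.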

> However, if we're going on the basis of how many newbie errors can be solved
> by Just Working, PEP 402 will help more newbies than PEP 395, since you must
> first *have* a package in order for 395 to be meaningful.  ;-)

Nope, PEP 402 makes it worse, because it permanently entrenches the
current broken sys.path[0] initialisation with no apparent way out.
That first list in the current PEP of "these invocations currently
break for modules inside packages"? They all *stay* broken forever
under PEP 402, because the filesystem no longer fully specifies the
package structure - you need an *already* initialised sys.path to
figure out how to translate a given filesystem layout into the Python
namespace. With the package structure underspecified, there's no way
to reverse engineer what sys.path[0] *should* be and it becomes
necessary to put the burden back on the developer.

Consider this PEP 382 layout (based on the example __init__.py based
layout I use in PEP 395):

project/
    setup.py
    example.pyp/
        foo.py
        tests.pyp/
            test_foo.py

There's no ambiguity there: we have a top-level project directory
containing an "example" package fragment and an "example.tests"
subpackage fragment. Given the full path to any of "setup.py",
"foo.py" and "test_foo.py", we can figure out that the correct thing
to place in sys.path[0] is the "project" directory.
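
Here's a rough sketch of that reverse engineering (not what the PEP
actually specifies, just the general idea):

    import os

    def guess_path_root(script_path):
        """Walk up from the script, skipping directories explicitly
        marked as package fragments (a ".pyp" suffix or an
        "__init__.py" file), and return the first unmarked ancestor."""
        directory = os.path.dirname(os.path.abspath(script_path))
        while (directory.endswith(".pyp")
               or os.path.exists(os.path.join(directory, "__init__.py"))):
            directory = os.path.dirname(directory)
        return directory

    # ".../project/example.pyp/tests.pyp/test_foo.py" -> ".../project",
    # so "import example.tests.test_foo" works as expected.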

Under PEP 402, it would look like this:

project/
    setup.py
    example/
        foo.py
        tests/
            test_foo.py

Depending on what you put on sys.path, that layout could be defining a
"project" package, an "example" package or a "tests" package. The
interpreter has no way of knowing, so it can't do anything sensible
with sys.path[0] when the only information it has is the filename for
"foo.py" or "test_foo.py". Your best bet would be the status quo: just
use the directory containing that file, breaking any explicit relative
imports in the process (since __name__ is correspondingly inaccurate).
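
To spell that out (paths purely illustrative), the same files support
at least three readings, depending entirely on what someone has
already put on sys.path:

    sys.path entry              resulting imports
    -------------------------   ----------------------------------------
    .../ (parent of project)    import project.example.foo
    .../project                 import example.foo, example.tests.test_foo
    .../project/example         import foo, tests.test_foo

Given only the filename of "foo.py" or "test_foo.py", the interpreter
has no way to choose between them.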

People already have to understand that Python modules have to be
explicitly marked - while "foo" can be executed as a Python script, it
cannot be imported as a Python module. Instead, the name needs to be
"foo.py" so that the import process will recognise it as a source
module. Explaining that importable package directories are similarly
marked with either an "__init__.py" file or a ".pyp" extension is a
fairly painless task - people can accept it and move on, even if they
don't necessarily understand *why* it's useful to be explicit about
package layouts. (Drawing that parallel is even more apt these days,
given the ability to explicitly execute any directory containing a
__main__.py file regardless of the directory name or other contents)

The mismatch between __main__ imports and imports from everywhere
else, though? That's hard to explain to *experienced* Python
programmers, let alone beginners.

My theory is that if we can get package layouts to stop breaking most
invocation methods for modules inside those packages, then beginners
should be significantly less confused about how imports work because
the question simply won't arise. Once the behaviour of imports from
__main__ is made consistent with imports from other modules, then the
time when people need to *care* about details like how sys.path[0]
gets initialised can be postponed until much later in their
development as a Python programmer.

>> (which are *supposed* to be dead in 3.x, but linger in
>> __main__ solely due to the way we initialise sys.path[0]). If a script
>> is going to be legitimately shipped inside a package directory, it
>> *must* be importable as part of that package namespace, and any script
>> in Py3k that relies on implicit relative imports fails to qualify.
>
> Wait a minute...  What would happen if there were no implicit relative
> imports allowed in __main__?
> Or are you just saying that you get the *appearance* of implicit relative
> importing, due to aliasing?

The latter - because the initialisation of sys.path[0] ignores package
structure information in the filesystem, it's easy to get the
interpreter to commit the cardinal aliasing sin of putting a package
directory on sys.path. In a lot of cases, that kinda sorta works in
2.x because of implicit relative imports, but it's always going to
cause problems in 3.x.
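
A minimal illustration of the aliasing (layout and names invented):

    project/
        mypkg/
            __init__.py
            helper.py
            script.py     # run directly as "python project/mypkg/script.py"

    # Inside script.py, sys.path[0] is project/mypkg, so:
    import helper         # loads mypkg/helper.py as the top-level "helper"

    # If project/ is *also* on sys.path (say the package is installed),
    # the same file can be imported a second time as "mypkg.helper",
    # giving two distinct module objects for one source file.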

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ericsnowcurrently at gmail.com  Thu Nov 24 11:38:45 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 24 Nov 2011 03:38:45 -0700
Subject: [Import-SIG] Updated PEP 395 ("Qualified Names for Modules" aka
 "Implicit Relative Imports Must Die!")
In-Reply-To: <CADiSq7dvPBmAMc8RTObebKPcXc7twyX-BcJx1DtkCNf1bw4Gnw@mail.gmail.com>
References: <CADiSq7f0j+Da+P-YdeG9qf5r2BPHKrBG8UFprTnQkp84VWc3_g@mail.gmail.com>
	<CADiSq7dvPBmAMc8RTObebKPcXc7twyX-BcJx1DtkCNf1bw4Gnw@mail.gmail.com>
Message-ID: <CALFfu7AAbLnheNeNQvZqOisBUsDd1uiS8GGDRouZter4F_87nw@mail.gmail.com>

On Wed, Nov 23, 2011 at 4:05 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Sat, Nov 19, 2011 at 10:59 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> The updated version is included below and has also been updated on
>> python.org if you prefer a nicely formatted version:
>> http://www.python.org/dev/peps/pep-0395/
>>
>> The recent discussion regarding imports from main really crystallised
>> for me what I think is currently wrong with imports from main modules
>> - I was cheering when the Django folks updated their default site
>> template to avoid putting a package directory on sys.path (due to all
>> the problems it causes), but that thread made me realise how easy we
>> make it for beginners to do that by accident, with no real payoff of
>> any kind to justify it.
>>
>> So the PEP now spends a lot of time talking about the fact that our
>> current system for initialising sys.path[0] is almost always just
>> plain wrong as soon as packages are involved, but the explicit markers
>> on package directories make it possible for us to do the right thing
>> instead of being dumb about it.
>
> *crickets*
>
> No feedback at all on the prospect of changing the way we initialise
> sys.path[0] to respect the package information available on the
> filesystem?
>
> Also, Éric Araujo raised an interesting point [1]: automatically
> initialising sys.path[0] *at all* can be a problem in some
> circumstances, especially when symlinks are involved. PEP 395 won't
> really help with that (it may change some of the symptoms, but it
> won't fix the general problem), but it does make me wonder if the
> interpreter should have a flag to switch off sys.path[0]
> initialisation (similar to the existing flags to disable site
> processing, user site processing and processing of the PYTHONHOME and
> PYTHONPATH environment variables).

That sounds good to me.

Relatedly, and this will reflect my relative inexperience here, I
still don't have a clear picture of why we do the sys.path[0]
initialization in the first place.  I'm sure there's a good reason,
but I've never (knowingly) met it.  :)

Is it so that modules on which the script relies don't have to be put
in a package under some sys.path directory (to be explicitly
imported), saving a little time during development or simplifying
packaging a little?  It seems to me that sort of thing is addressable
in better, more obvious ways, and, more to the point, that the
benefits of the implicit initialization don't outweigh the problems it
causes (hence PEP 395).

Unless you think to read the sys.path doc entry, you won't know about
the sys.path[0] initialization for scripts, and you'll wonder why you
keep getting implicit relative imports when you aren't expecting them.  I
would think that would be more impactful across the spectrum of Python
experience than (what seem like minor) benefits gained from the
implicit initialization.

So, at this point, is it just an artifact of early Python, when better
solutions weren't around yet?  What am I missing here?

-eric

>
> [1] http://bugs.python.org/issue10318 (last couple of comments)
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
>

From ncoghlan at gmail.com  Thu Nov 24 12:33:15 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 24 Nov 2011 21:33:15 +1000
Subject: [Import-SIG] Updated PEP 395 ("Qualified Names for Modules" aka
 "Implicit Relative Imports Must Die!")
In-Reply-To: <CALFfu7AAbLnheNeNQvZqOisBUsDd1uiS8GGDRouZter4F_87nw@mail.gmail.com>
References: <CADiSq7f0j+Da+P-YdeG9qf5r2BPHKrBG8UFprTnQkp84VWc3_g@mail.gmail.com>
	<CADiSq7dvPBmAMc8RTObebKPcXc7twyX-BcJx1DtkCNf1bw4Gnw@mail.gmail.com>
	<CALFfu7AAbLnheNeNQvZqOisBUsDd1uiS8GGDRouZter4F_87nw@mail.gmail.com>
Message-ID: <CADiSq7cin54WJhmwRz7PvpWDbAWEFOgFK7zx07+Xft=d+QbUeg@mail.gmail.com>

On Thu, Nov 24, 2011 at 8:38 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> So, at this point, is it just an artifact of early Python, when better
>> solutions weren't around yet?  What am I missing here?

sys.path[0] initialisation is essential for making the interactive
interpreter useful - when you're developing new Python code, you want
to be able to import whatever you're working on into an interactive
session without having to mess about with sys.path[0] manually.

The rest pretty much follows from a desire to maintain some level
of consistency with the interactive prompt behaviour.

(I don't know Guido's original reasoning though - the current
behaviour was well established long before I started using the
language. As far as I know, it's been this way since the earliest
public releases)
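
For instance (a made-up session, but this is today's behaviour):

    $ cd ~/devel/myproject        # some checkout you're hacking on
    $ python3
    >>> import sys
    >>> sys.path[0]
    ''
    >>> import mymodule           # picks up ./mymodule.py straight away

The empty string means "the current working directory", so whatever
you're editing is importable without touching sys.path by hand.
Running "python path/to/script.py" instead puts "path/to" in
sys.path[0], which is where the problems PEP 395 describes come from.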

Regardless, your question did just make me realise that my current
proposal for new -m semantics in PEP 395 is broken. It assumes that
the module is going to be found in a subdirectory of the current
directory and that's just plain wrong (e.g. for cases like "python -m
timeit"). However, fixing it is pretty easy, and addresses a slight
concern I had with what it allowed (i.e. the "python -m
tests.test_foo" and "python -m test_foo" cases).

The solution is simply to *not* adjust modname in the "-m" case (i.e.
keep the initialisation completely consistent with the interactive
prompt and -c, as it is now). Then the effect of PEP 395 will just be
to allow "python -m packages.tests.test_foo" to work from anywhere
within the package hierarchy, *without* allowing the abbreviated
forms. That's a much better, more consistent outcome.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ericsnowcurrently at gmail.com  Thu Nov 24 21:32:30 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 24 Nov 2011 13:32:30 -0700
Subject: [Import-SIG] Updated PEP 395 ("Qualified Names for Modules" aka
 "Implicit Relative Imports Must Die!")
In-Reply-To: <CADiSq7cin54WJhmwRz7PvpWDbAWEFOgFK7zx07+Xft=d+QbUeg@mail.gmail.com>
References: <CADiSq7f0j+Da+P-YdeG9qf5r2BPHKrBG8UFprTnQkp84VWc3_g@mail.gmail.com>
	<CADiSq7dvPBmAMc8RTObebKPcXc7twyX-BcJx1DtkCNf1bw4Gnw@mail.gmail.com>
	<CALFfu7AAbLnheNeNQvZqOisBUsDd1uiS8GGDRouZter4F_87nw@mail.gmail.com>
	<CADiSq7cin54WJhmwRz7PvpWDbAWEFOgFK7zx07+Xft=d+QbUeg@mail.gmail.com>
Message-ID: <CALFfu7CpkBD+pov3vpdG5y0BQtSF6v8wzuU93mAk3+fLDBJ2ww@mail.gmail.com>

On Thu, Nov 24, 2011 at 4:33 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Thu, Nov 24, 2011 at 8:38 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>> So, at this point, is it just an artifact of early Python, when better
>> solutions weren't around yet?  What am I missing here?
>
> sys.path[0] initialisation is essential for making the interactive
> interpreter useful - when you're developing new Python code, you want
> to be able to import whatever you're working on into an interactive
> session without having to mess about with sys.path[0] manually.

<facepalm />

> The rest pretty much follows from a desire to maintain some level
> of consistency with the interactive prompt behaviour.

 So the behavior of Python execution can be grouped thusly:
  - scripts (<script>, -m, -, -c, interactive interpreter) [1]
  - modules (import,)
  - execs (exec, eval, execfile, input)

and we want the sys.path initialization to be consistent across all
script-like execution?

Either that's not the case, or the "-m" flag is in its own group, or
PEP 395 is trying to do something it shouldn't.  I'd argue it's the
first, and that sys.path initialization should apply only to the
interactive interpreter (and -c).  Is the consistency worth the
trouble?  Maybe so.  Maybe not.

Doesn't restricting the application of sys.path initialization make
PEP 395 simpler?

[1]  as indicated by __name__ == "__main__", for better or worse

> (I don't know Guido's original reasoning though - the current
> behaviour was well established long before I started using the
> language. As far as I know, it's been this way since the earliest
> public releases)

Maybe a premature optimization of sorts?

> Regardless, your question did just make me realise that my current
> proposal for new -m semantics in PEP 395 is broken.

Glad I could be of _some_ kind of help. <wink>

-eric

> It assumes that
> the module is going to be found in a subdirectory of the current
> directory and that's just plain wrong (e.g. for cases like "python -m
> timeit"). However, fixing it is pretty easy, and addresses a slight
> concern I had with what it allowed (i.e. the "python -m
> tests.test_foo" and "python -m test_foo" cases).
>
> The solution is simply to *not* adjust modname in the "-m" case (i.e.
> keep the initialisation completely consistent with the interactive
> prompt and -c, as it is now). Then the effect of PEP 395 will just be
> to allow "python -m packages.tests.test_foo" to work from anywhere
> within the package hierarchy, *without* allowing the abbreviated
> forms. That's a much better, more consistent outcome.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
>

From ncoghlan at gmail.com  Fri Nov 25 03:45:20 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 25 Nov 2011 12:45:20 +1000
Subject: [Import-SIG] Updated PEP 395 ("Qualified Names for Modules" aka
 "Implicit Relative Imports Must Die!")
In-Reply-To: <CALFfu7CpkBD+pov3vpdG5y0BQtSF6v8wzuU93mAk3+fLDBJ2ww@mail.gmail.com>
References: <CADiSq7f0j+Da+P-YdeG9qf5r2BPHKrBG8UFprTnQkp84VWc3_g@mail.gmail.com>
	<CADiSq7dvPBmAMc8RTObebKPcXc7twyX-BcJx1DtkCNf1bw4Gnw@mail.gmail.com>
	<CALFfu7AAbLnheNeNQvZqOisBUsDd1uiS8GGDRouZter4F_87nw@mail.gmail.com>
	<CADiSq7cin54WJhmwRz7PvpWDbAWEFOgFK7zx07+Xft=d+QbUeg@mail.gmail.com>
	<CALFfu7CpkBD+pov3vpdG5y0BQtSF6v8wzuU93mAk3+fLDBJ2ww@mail.gmail.com>
Message-ID: <CADiSq7cO2XgKOYap8jkrjvdwcfgFpY4o0wtiMEojNSVKznKT_Q@mail.gmail.com>

On Fri, Nov 25, 2011 at 6:32 AM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>  So the behavior of Python execution can be grouped thusly:
>  - scripts (<script>, -m, -, -c, interactive interpreter) [1]
>  - modules (import,)
>  - execs (exec, eval, execfile, input)

Yeah, that's basically accurate, although I'd describe it as:

- the main module (includes sys.path[0] initialisation, sys.argv
initialisation, messes with "__name__", may not be importable)
- imported modules (must meet the import system's definition of a
"valid module", which can be altered by various import hooks)
- source execution (exec/eval/literal_eval)

> and we want the sys.path initialization to be consistent across all
> script-like execution?

Being useful is more important than being consistent per se. With
zipfiles and directory execution, for example, their __main__.py file
gets executed, but sys.path[0] is set to the zipfile or directory.
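
For instance (names invented):

    tool/
        __main__.py
        helpers.py

    # tool/__main__.py
    import helpers    # found because sys.path[0] is "tool" (or "tool.zip")

    print("helpers loaded from", helpers.__file__)

Both "python3 tool" and zipping the directory up and running the
zipfile execute __main__.py in exactly the same way.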

> Either that's not the case, or the "-m" flag is in its own group, or
> PEP 395 is trying to do something it shouldn't.  I'd argue it's the
> first, and that sys.path initialization should apply only to the
> interactive interpreter (and -c).  Is the consistency worth the
> trouble?  Maybe so.  Maybe not.

-m has to do *something* with sys.path or it doesn't work for files
you're working on in the current directory (unless that directory is
already on sys.path). For example, I mostly leave my working directory
sitting at the "src" directory in my checkout, then use '-m' to run
various things from within the Python package in that directory.

> Doesn't restricting the application of sys.path initialization make
> PEP 395 simpler?

Nope, it would just make it a backwards compatibility nightmare. It's
already skating on thin ice in a couple of places by having to make
the argument that it's taking things that are technically *already*
broken (often in somewhat obscure ways) and turning them into explicit
exceptions that should make the underlying problem clearer :)

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ericsnowcurrently at gmail.com  Fri Nov 25 08:19:02 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 25 Nov 2011 00:19:02 -0700
Subject: [Import-SIG] Updated PEP 395 ("Qualified Names for Modules" aka
 "Implicit Relative Imports Must Die!")
In-Reply-To: <CADiSq7cO2XgKOYap8jkrjvdwcfgFpY4o0wtiMEojNSVKznKT_Q@mail.gmail.com>
References: <CADiSq7f0j+Da+P-YdeG9qf5r2BPHKrBG8UFprTnQkp84VWc3_g@mail.gmail.com>
	<CADiSq7dvPBmAMc8RTObebKPcXc7twyX-BcJx1DtkCNf1bw4Gnw@mail.gmail.com>
	<CALFfu7AAbLnheNeNQvZqOisBUsDd1uiS8GGDRouZter4F_87nw@mail.gmail.com>
	<CADiSq7cin54WJhmwRz7PvpWDbAWEFOgFK7zx07+Xft=d+QbUeg@mail.gmail.com>
	<CALFfu7CpkBD+pov3vpdG5y0BQtSF6v8wzuU93mAk3+fLDBJ2ww@mail.gmail.com>
	<CADiSq7cO2XgKOYap8jkrjvdwcfgFpY4o0wtiMEojNSVKznKT_Q@mail.gmail.com>
Message-ID: <CALFfu7B8G8tA-1cFqt_cf6tSHej3j2qwgoA5NVqB73RUzMOJ8g@mail.gmail.com>

On Thu, Nov 24, 2011 at 7:45 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Fri, Nov 25, 2011 at 6:32 AM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>>  So the behavior of Python execution can be grouped thusly:
>>  - scripts (<script>, -m, -, -c, interactive interpreter) [1]
>>  - modules (import,)
>>  - execs (exec, eval, execfile, input)
>
> Yeah, that's basically accurate, although I'd describe it as:
>
> - the main module (includes sys.path[0] initialisation, sys.argv
> initialisation, messes with "__name__", may not be importable)
> - imported modules (must meet the import system's definition of a
> "valid module", which can be altered by various import hooks)
> - source execution (exec/eval/literal_eval)

That's good.  I'm putting together a comprehensive reference on
imports in Python and plan on having a section that emphasizes the
difference between scripts and modules.  I think a lot of people (not
just beginners) don't recognize it.  This discussion is helping with
that section.  :)

>
>> and we want the sys.path initialization to be consistent across all
>> script-like execution?
>
> Being useful is more important than being consistent per se. With
> zipfiles and directory execution, for example, their __main__.py file
> gets executed, but sys.path[0] is set to the zipfile or directory.
>
>> Either that's not the case, or the "-m" flag is in its own group, or
>> PEP 395 is trying to do something it shouldn't.  I'd argue it's the
>> first, and that sys.path initialization should apply only to the
>> interactive interpreter (and -c).  Is the consistency worth the
>> trouble?  Maybe so.  Maybe not.
>
> -m has to do *something* with sys.path or it doesn't work for files
> you're working on in the current directory (unless that directory is
> already on sys.path). For example, I mostly leave my working directory
> sitting at the "src" directory in my checkout, then use '-m' to run
> various things from within the Python package in that directory.

Yeah, and I like where you're going with Issue13475.  I still have a
concern with the implicit "-p .", but I'll take it up on the tracker.

>
>> Doesn't restricting the application of sys.path initialization make
>> PEP 395 simpler?
>
> Nope, it would just make it a backwards compatibility nightmare. It's
> already skating on thin ice in a couple of places by having to make
> the argument that it's taking things that are technically *already*
> broken (often in somewhat obscure ways) and turning them into explicit
> exceptions that should make the underlying problem clearer :)

Well, if you factor in a -p flag, is it as big a deal?

-eric

>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
>