[Python-ideas] My objections to implicit package directories

Nick Coghlan ncoghlan at gmail.com
Tue Mar 13 07:07:46 CET 2012


On Tue, Mar 13, 2012 at 10:03 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 2. Implicit package directories pose awkward backwards compatibility challenges
>
> It concerns me gravely that the consensus proposal MvL posted is
> *backwards incompatible with Python 3.2*, as it deliberately omits one
> of the PEP 402 features that provided that backwards compatibility.
> Specifically, under the consensus, a subdirectory "foo" of a directory
> on sys.path will shadow a "foo.py" or "foo/__init__.py" that appears
> later on sys.path. As Python 3.2 would have found that latter
> module/package correctly, this is an unacceptable breach of the
> backwards compatibility requirements. PEP 402 at least got this right
> by always executing the first "foo.py" or "foo/__init__.py" it found,
> even if another "foo" directory was found earlier in sys.path.
>
> We can't just wave that additional complexity away if an implicit
> package directory proposal is going to remain backwards compatible
> with current layouts (e.g. if an application's starting directory
> included a "json" subfolder containing json files rather than Python
> code, the consensus approach as posted by MvL would render the
> standard library's json module inaccessible)

It has been pointed out that the above is based on a misreading of
MvL's email. So, consider the following backwards compatibility
concern instead:

Many projects use the following snippet to find a json module:

    try:
        import json
    except ImportError:
        import simplejson as json

Now, this particular snippet should still work fine with implicit
package directories (even if a non-Python json directory exists on
sys.path), since there *will* be a real json module in the standard
library to find and the simplejson fallback won't be needed.

However, for the general case:

    try:
        import foo
    except ImportError:
        import foobar as foo

Then implicit package directories pose a backwards compatibility
problem (specifically, if "foo" does not exist as a module or explicit
package on sys.path, but there is a non-Python "foo/" directory, then
"foo" will silently be created as an empty package rather than
falling back to "foobar").
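
As a minimal sketch of that failure mode, the script below builds a
throwaway sys.path entry containing a non-Python directory and a real
fallback module, then runs the same try/except idiom. The names
"leftover_foo" and "leftover_foobar" and the helper
import_with_fallback() are hypothetical, chosen only for illustration.
It assumes an interpreter that implements implicit package directories
(Python 3.3 and later); on such an interpreter the primary import is
expected to succeed and the fallback is never tried.

```python
import importlib
import os
import sys
import tempfile


def import_with_fallback(name, fallback):
    """Same shape as the try/except idiom; also reports which branch ran."""
    try:
        return importlib.import_module(name), "primary"
    except ImportError:
        return importlib.import_module(fallback), "fallback"


with tempfile.TemporaryDirectory() as root:
    # A leftover directory holding no Python code (hypothetical name).
    os.mkdir(os.path.join(root, "leftover_foo"))
    with open(os.path.join(root, "leftover_foo", "data.txt"), "w") as fh:
        fh.write("not Python code\n")

    # The real implementation the fallback was meant to reach.
    with open(os.path.join(root, "leftover_foobar.py"), "w") as fh:
        fh.write("REAL = True\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        module, branch = import_with_fallback("leftover_foo", "leftover_foobar")
    finally:
        sys.path.remove(root)

# Which branch ran, and whether the result looks like a real module.
print("branch taken:", branch)
print("__file__:", getattr(module, "__file__", None))
print("has REAL attribute:", hasattr(module, "REAL"))
```

The imported object has no usable __file__ and none of the attributes
the caller expects, so the breakage surfaces later as an
AttributeError rather than at import time.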

Sure, the likelihood of that actually affecting anyone is fairly
remote (although all it really takes is one broken uninstaller leaving
a "foo" dir in site-packages), but we've rejected proposals in the
past over smaller concerns than this.

*Now*, my original comment about the consensus view rejecting
complexity from PEP 402 by disregarding backwards compatibility
concerns becomes accurate. PEP 402 addressed this issue specifically
by disallowing direct imports of implicit packages (only finding them
later when searching for submodules). This is in fact the motivating
case given for that behaviour in the PEP:
http://www.python.org/dev/peps/pep-0402/#backwards-compatibility-and-performance

So, *why* are we adopting implicit packages again, given all the
challenges they pose? What, exactly, is the problem with a ".pyp"
extension that makes all this additional complexity the preferred
choice?

So far, I've only heard two *positive* statements in favour of
implicit package directories:

1. Java/Perl/etc do it that way.

I've already made it clear that I don't care about that argument. If
it was all that compelling, we'd have implicit self by now. (However,
clearly Guido favours it in this case, given his message that arrived
while I was writing this one)

2. It (arguably) makes it easier to convert an existing package into a
namespace package

With implicit package directories, you just delete your empty
__init__.py file to turn an existing package into a namespace package.
With a PEP 382 style directory suffix, you have to change your
directory name to append the ".pyp" (and, optionally, delete your
__init__.py file, since it's now going to be ignored anyway).

Barry's also tried to convince me that ".pyp" directories are somehow
harder for distributions to deal with, but his only example looked
like trying to use "yield from" in Python 3.2 and then complaining
when it didn't work.

However, so long as the backwards compatibility from PEP 402 is
incorporated, and the new PEP proposes a specific addition to the
tutorial to document the "never CD into a package, never double-click
a file in a package to execute it, always use -m to execute modules
from inside packages" guideline (and makes it clear that you may get
strange and unpredictable behaviour if you ever break it), then I can
learn to live with it. IDLE should also be updated to allow correct
execution of submodules via F5 (I guess it will need some mechanism to
be told what working directories to add to sys.path).
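
To make the "always use -m" guideline concrete, here is a small sketch
that runs the same submodule both ways. The package name "demo_pkg"
and its files are hypothetical, and the script assumes Python 3.7 or
later (for subprocess.run's capture_output argument). Running the file
directly puts the package directory itself at the front of sys.path
and gives the module no parent package, so its relative import is
expected to fail; running it with -m from the parent directory keeps
the package context intact.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny explicit package: demo_pkg/{__init__,helper,mod}.py
    pkg = os.path.join(root, "demo_pkg")
    os.mkdir(pkg)
    sources = [
        ("__init__.py", ""),
        ("helper.py", "VALUE = 42\n"),
        ("mod.py", "from . import helper\nprint(helper.VALUE)\n"),
    ]
    for filename, source in sources:
        with open(os.path.join(pkg, filename), "w") as fh:
            fh.write(source)

    # Direct execution: the file runs as __main__ with no parent package.
    direct = subprocess.run(
        [sys.executable, os.path.join(pkg, "mod.py")],
        cwd=root, capture_output=True, text=True,
    )

    # Execution via -m from the directory that contains the package.
    via_m = subprocess.run(
        [sys.executable, "-m", "demo_pkg.mod"],
        cwd=root, capture_output=True, text=True,
    )

print("direct exit code:", direct.returncode)
print("-m exit code:", via_m.returncode)
print("-m output:", via_m.stdout.strip())
```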

It still seems to me that moving to a marker *suffix* (rather than a
marker file) as PEP 382 proposes brings all the real practical
benefits of implicit package directories (i.e. no empty __init__.py
files wasting space) and absolutely *none* of the pain (i.e. no
backwards compatibility concerns, no ambiguity in the filesystem to
module hierarchy mapping, still able to fix direct execution of
modules inside packages rather than having to explain forevermore why
it doesn't work), but Guido clearly feels otherwise.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


