[Python-Dev] PEP 402: Simplified Package Layout and Partitioning

Glyph glyph at twistedmatrix.com
Thu Dec 1 08:02:25 CET 2011


On Nov 30, 2011, at 6:39 PM, Nick Coghlan wrote:

> On Thu, Dec 1, 2011 at 1:28 AM, PJ Eby <pje at telecommunity.com> wrote:
>> It doesn't help at all that I'm not really in a position to provide an
>> implementation, and the persons most likely to implement have been leaning
>> somewhat towards 382, or wanting to modify 402 such that it uses .pyp
>> directory extensions so that PEP 395 can be supported...
> 
> While I was initially a fan of the possibilities of PEP 402, I
> eventually decided that we would be trading an easy problem ("you need
> an '__init__.py' marker file or a '.pyp' extension to get Python to
> recognise your package directory") for a hard one ("What's your
> sys.path look like? What did you mean for it to look like?"). Symlinks
> (and the fact we implicitly call realname() during system
> initialisation and import) just make things even messier.
> *Deliberately* allowing package structures on the filesystem to become
> ambiguous is a recipe for future pain (and could potentially undo a
> lot of the good work done by PEP 328's elimination of implicit
> relative imports).
> 
> I acknowledge there is a lot of confusion amongst novices as to how
> packages and imports actually work, but my diagnosis of the root cause
> of that problem is completely different from that supposed by PEP 402
> (as documented in the more recent versions of PEP 395, I've come to
> believe it is due to the way we stuff up the default sys.path[0]
> initialisation when packages are involved).
> 
> So, in the end, I've come to strongly prefer the PEP 382 approach. The
> principle of "Explicit is better than implicit" applies to package
> detection on the filesystem just as much as it does to any other kind
> of API design, and it really isn't that different from the way we
> treat actual Python files (i.e. you can *execute* arbitrary files, but
> they need to have an appropriate extension if you want to import
> them).

I've helped an almost distressing number of newbies overcome their confusion about sys.path and packages.  Systems using Twisted are, almost by definition, hairy integration problems, and are frequently being created or maintained by people with little to no previous Python experience.

Given that experience, I completely agree with everything you've written above (except for the part where you initially liked it).  I appreciate the insight that PEP 402 offers about Python's package mechanism (and the difficulties introduced by namespace packages).  Its statement of the problem is good, but in my opinion its solution points in exactly the wrong direction: packages need to be _more_ explicit about their package-ness, and tools need to be stricter about how they're laid out.  It would be great if sys.path[0] were actually correct when running a script inside a package, or if Python at least issued a warning explaining how to lay out said package correctly.  I would love to see a loud alarm every time a module accidentally got imported by the same name twice.  I wish I knew, once and for all, whether it was 'import Image' or 'from PIL import Image'.
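
As a rough illustration of the "loud alarm" idea, here is a minimal sketch that scans the loaded modules for a single source file that has been loaded as more than one distinct module object (for example, once as 'Image' and once as 'PIL.Image').  The function name find_duplicate_imports is hypothetical; nothing like it is proposed in this thread or built into Python, and the check assumes modules expose a usable __file__ attribute.

```python
import os
import sys
from collections import defaultdict


def find_duplicate_imports(modules=None):
    """Map each source file to the names of distinct module objects loaded from it.

    Only files backing more than one module object are returned.
    Hypothetical helper, for illustration only.
    """
    if modules is None:
        modules = sys.modules
    by_path = defaultdict(dict)
    # Copy the items so the scan is safe even if an import happens meanwhile.
    for name, module in list(modules.items()):
        path = getattr(module, "__file__", None)
        if not path:
            continue  # built-ins, namespace packages, placeholder entries
        # Resolve symlinks so two routes to the same file compare equal.
        real = os.path.realpath(path)
        # Key by object identity: plain aliases of one module are not duplicates.
        by_path[real].setdefault(id(module), name)
    return {
        path: sorted(objs.values())
        for path, objs in by_path.items()
        if len(objs) > 1
    }


if __name__ == "__main__":
    for path, names in find_duplicate_imports().items():
        print("WARNING: %s loaded as %s" % (path, ", ".join(names)))
```

Grouping by module identity rather than by name is a deliberate choice: a module registered under two names (as os.path is) is one object and is not reported, while the same file executed twice under different names is.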

My hope is that if Python starts to tighten these things up a bit, or at least communicate better about best practices, editors and IDEs will develop better automatic discovery features, and frameworks will start to normalize their sys.path setups and stop depending on accidents of current directory and script location.  This will in turn vastly decrease confusion among new Python developers taking on large projects with a bunch of libraries, who mostly don't care what the rules are for where files are supposed to go, and just want to put them somewhere that works.

-glyph

