[Import-SIG] What if namespace imports weren't special?

Tue Jul 12 09:53:13 CEST 2011

On Tue, Jul 12, 2011 at 1:25 AM, P.J. Eby <pje at telecommunity.com> wrote:
> At 05:32 PM 7/11/2011 +1000, Nick Coghlan wrote:
>>
>> On Mon, Jul 11, 2011 at 5:04 PM, Eric Snow <ericsnowcurrently at gmail.com>
>> wrote:
>> > FWIW, I think the solution in the PEP is the clearest approach, if
>> > "partitioned by default" is not an option.  And if that and the other
>> > alternate solutions are not feasible, it would be nice to have them
>> > added to the "rejected" section because they are still reasonable
>> > ideas.  Still, it would be nice if we didn't have to add a new
>> > packageness indicator.
>>
>> The runtime performance impact kills "partitioned by default" (i.e. no
>> marker files needed to indicate partitioned packages).
>
> Actually, partitioned by default is the *best* performance option we have
> for implementing this PEP, because it only uses a stat rather than a
> listdir.  Backward compatibility is the thing that kills it.

By "partitioned by default" I meant the prospect of continuing to
search sys.path after finding the email (etc.) directory in the stdlib
zipfile. Slowing down everything in order to speed up a new feature
isn't a good trade-off.
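
To make the cost argument concrete, here is a minimal sketch (not the
actual import machinery) contrasting the two lookup strategies. The
names find_first and find_all are hypothetical, and plain directories
stand in for real sys.path entries such as the stdlib zipfile.

```python
import os


def find_first(name, search_path):
    # Ordinary lookup: stop at the first entry that holds the package.
    checked = 0
    for entry in search_path:
        checked += 1
        candidate = os.path.join(entry, name)
        if os.path.isfile(os.path.join(candidate, "__init__.py")):
            return [candidate], checked
    return [], checked


def find_all(name, search_path):
    # "Partitioned by default": every entry has to be examined, because
    # a later entry might contribute another portion of the package.
    portions = []
    checked = 0
    for entry in search_path:
        checked += 1
        candidate = os.path.join(entry, name)
        if os.path.isdir(candidate):
            portions.append(candidate)
    return portions, checked
```

With find_first the number of entries examined depends on where the
package is found; with find_all it is always len(search_path), for
every import, whether or not the package is actually partitioned.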

>> As far as the specific suggestion of using a "marker directory"
>> instead of marker files goes, I don't really see the benefit (and
>> plenty of downsides). I put it in the same category as using a special
>> extension on the directory name (since that's what it is, only using
>> "/" as the separator instead of ".") and reject it for the same
>> reasons.
>
> What are the downsides, exactly?  Special extensions don't work with the
> distutils; a prefix does.  (I've tested it.)  Most tools that look for code
> can be given a prefix to look for the code, but not an extension.  It's
> *quite* a different proposition than specially-named directories --
> especially since only the package root is affected, not every subpackage
> directory.

From the revised PEP draft [1] re. a directory suffix:

"""   The downsides, however, are also plentiful.  If a package starts
   its life as a normal package, it must be renamed when it becomes
   a namespace, with the implied consequences for revision control
   tools.

   Further, there is an immense body of existing code (including the
   distutils and many other packaging tools) that expect a package
   directory's name to be the same as the package name.  And porting
   existing Python 2.x namespace packages to Python 3 would require
   widespread directory renaming as well.

   In short, this approach would require a vastly larger number of
   changes to both the standard library and third-party code, for
   a tiny potential performance improvement and a small increase in
   clarity.  It was therefore rejected on "practicality vs. purity"
   grounds."""

[1] http://mail.python.org/pipermail/import-sig/2011-July/000213.html

There are plenty of practical objections to having to move files
around and rename directories in order to turn an ordinary package
into a partitioned package. Those objections are just as valid for the
subdirectory approach as they are for a directory suffix. Dropping a
marker file into the directory is simple by contrast.

As someone who uses a dir tree+file list view to manage my file
system, I also think the subdirectory approach would be absolutely
hideous to navigate and manage. It works for __pycache__ because I
don't care what's in those (most of the time) and they don't have any
subdirectories. But for the actual package source code? And
potentially nested for subpackages? Yuck. Awful UI design.

*ding* <--- lightbulb

However, the __pycache__ example did just trigger an idea that may
give us the best of both worlds.

1. We use a shared marker *directory* called __package__ to indicate
partitioned packages. The import system just does a stat check for
__init__.py and a __package__ subdir to see if a directory is a Python
package directory.

2. All the .pyp files go inside the __package__ subdir rather than
being placed directly in the same directory as the package source
code.

No os.listdir() calls, no need to move files around to create a
partitioned package, no cluttering of the main package directories
with *.pyp files, and distro packaging utilities are quite happy with
the idea of multiple packages writing to the same directory.
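
A rough sketch of the check described in point 1, assuming only what
is stated there: a directory counts as a package if it has either
__init__.py or a __package__ subdir. The helper names is_package_dir
and is_partitioned are hypothetical, and nothing here says how the
*.pyp files inside __package__ would be read.

```python
import os

MARKER_DIR = "__package__"  # marker directory name from the proposal


def is_package_dir(path):
    # Two stat-style checks, no os.listdir() needed.
    has_init = os.path.isfile(os.path.join(path, "__init__.py"))
    has_marker = os.path.isdir(os.path.join(path, MARKER_DIR))
    return has_init or has_marker


def is_partitioned(path):
    # Only the marker directory signals a partitioned package.
    return os.path.isdir(os.path.join(path, MARKER_DIR))
```

Turning an ordinary package into a partitioned one would then amount
to creating the __package__ subdir in place, without renaming or
moving the existing source files.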

Thoughts?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia