[Python-Dev] Draft PEP: "Simplified Package Layout and Partitioning"
P.J. Eby
pje at telecommunity.com
Thu Jul 21 01:03:58 CEST 2011
At 03:09 PM 7/20/2011 -0700, Glenn Linderman wrote:
>On 7/20/2011 6:05 AM, P.J. Eby wrote:
>>At 02:24 AM 7/20/2011 -0700, Glenn Linderman wrote:
>>>When I read about creating __path__ from sys.path, I immediately
>>>thought of the issue of programs that extend sys.path, and the
>>>above is the "workaround" for such programs, but it requires
>>>such programs to do work, and there are a lot of such programs (I,
>>>a relative newbie, have had to write some). As it turns out, I
>>>can't think of a situation where I have extended sys.path that
>>>would result in a problem for fancy namespace packages, because so
>>>far I've only written modules, not packages, and only modules are
>>>on the paths that I add to sys.path. But that does not make
>>>for a general solution.
>>
>>Most programs extend sys.path in order to import things. If those
>>things aren't yet imported, they don't have a __path__ yet, and so
>>don't need to be fixed. Only programs that modify sys.path
>>*after* importing something that has a dynamic __path__ would need
>>to do anything about that.
>
>Sure. But there are a lot of things already imported by Python
>itself, and if this mechanism gets used in the stdlib, a program
>wouldn't know whether it is safe or not, to not bother with the
>pkgutil.extend_virtual_paths() call or not.
I'm not sure I see how the mechanism could meaningfully be used in
the stdlib, since IIUC we're not going for Perl-style package
naming. So, all stdlib packages would be self-contained.
>Plus, that requires importing pkgutil, which isn't necessarily done
>by every program that extends the sys.path ("import sys" is
>sufficient at present).
>
>Plus, if some 3rd party packages are imported before sys.path is
>extended, the knowledge of how they are implemented is required to
>make a choice about whether it is needed to import pkgutil and call
>extend_virtual_paths or not.
I'd recommend *always* using it, outside of simple startup code.
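
For illustration, here is a rough sketch of what "always using it" might look like in a program that extends sys.path after startup. The helper name add_to_sys_path is hypothetical, and the sketch assumes that extend_virtual_paths() takes the newly added path entry as its only argument. That function is only proposed in the draft, so the call is guarded and the code still runs where pkgutil does not provide it.

```python
import pkgutil
import sys


def add_to_sys_path(entry):
    """Append entry to sys.path, then refresh virtual package paths.

    Hypothetical helper. pkgutil.extend_virtual_paths() is only proposed
    in the draft PEP, and its one-argument signature is an assumption here.
    """
    sys.path.append(entry)
    # Look the function up defensively so this runs with or without it.
    extend = getattr(pkgutil, "extend_virtual_paths", None)
    if extend is not None:
        extend(entry)


add_to_sys_path("/hypothetical/plugins")
```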
>So I am still left with my original question:
>
>>>Is there some way to create a new __path__ that would reflect the
>>>fact that it has been dynamically created, rather than set from
>>>__init__.py, and then when it is referenced, calculate (and
>>>cache?) a new value of __path__ to actually search?
Hm. Yes, there is a way to do something like that, but it would
complicate things a bit. We'd need to:
1. Leave __path__ off of the modules, and always pull the paths from
sys.virtual_package_paths, and
2. Before using a value in sys.virtual_package_paths, we'd need to
check whether sys.path had changed since we last cached anything, and
if so, clear sys.virtual_package_paths first, to force a refresh.
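
The two steps above could be sketched roughly as follows. This is only an illustration under some assumptions: a module-level dict stands in for the proposed sys.virtual_package_paths, the helper names (compute_virtual_path, get_virtual_path) are made up for the example, and only top-level package names are handled.

```python
import os
import sys

# Stand-in for the proposed sys.virtual_package_paths (hypothetical here).
virtual_package_paths = {}
# Snapshot of the search path taken when the cache was last known good.
_last_snapshot = None


def compute_virtual_path(name, search_path):
    """Hypothetical helper: collect every <entry>/<name> directory."""
    candidates = (os.path.join(entry, name) for entry in search_path)
    return [c for c in candidates if os.path.isdir(c)]


def get_virtual_path(name, search_path=None):
    """Return the cached path for name, refreshing if the path changed."""
    global _last_snapshot
    if search_path is None:
        search_path = sys.path
    snapshot = tuple(search_path)
    if snapshot != _last_snapshot:
        # Step 2: the search path changed, so drop everything cached.
        virtual_package_paths.clear()
        _last_snapshot = snapshot
    if name not in virtual_package_paths:
        virtual_package_paths[name] = compute_virtual_path(name, search_path)
    # Step 1: callers always read from the cache, never from __path__.
    return virtual_package_paths[name]
```

Comparing a tuple snapshot catches in-place mutation as well as rebinding of sys.path, at the cost of an O(n) comparison on every lookup.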
This doesn't sound particularly forbidding, but there are various
unpleasant consequences, like being unable to tell whether a module
is a package or not, and whether it's a virtual package or not. We'd
have to invent new ways to denote these things.
On the bright side, though, it *would* allow transparent live updates
to virtual package paths, so it might be worth considering.
By the way, the reason we have to get rid of __path__ is that if we
kept it, then code could change it, and then we wouldn't know if it
was actually safe to change it automatically... even if no code had
actually changed it.
In principle, we could keep __path__ attributes around, and
automatically update them in the case where sys.path has changed, so
long as user code hasn't directly altered or replaced the
__path__. But it seems to me to be a dangerous corner case; I'd
rather that code which touches __path__ be taking responsibility for
that path's correctness from then on, rather than having it get
updated (possibly incorrectly) behind its back.
So, I'd say that for this approach, we'd have to actually leave
__path__ off of virtual packages' parent modules.
Anyway, it seems worth considering. We just need to sort out what
the downsides are for any current tools thinking that such modules
aren't packages. (But hey, at least it'll be consistent with what
such tools would think of the on-disk representation! That is, a
tool that thinks foo.py alongside a foo/ subdirectory is just a
module with no package, will also think that 'foo', once imported, is
a module with no package.)
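
To make the tooling concern concrete, here is the kind of check many tools rely on today. The function name is hypothetical; the point is only that a virtual package's parent module, having no __path__, would be classified as a plain module.

```python
import types


def looks_like_package(module):
    """Common heuristic: a module is a package if it has a __path__."""
    return hasattr(module, "__path__")


# A module object as it would look with __path__ left off.
virtual_parent = types.ModuleType("foo")

# A conventional package, simulated by setting __path__ by hand.
regular_pkg = types.ModuleType("bar")
regular_pkg.__path__ = ["/hypothetical/site-packages/bar"]

print(looks_like_package(virtual_parent))
print(looks_like_package(regular_pkg))
```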
>And, in the absence of knowing (because I didn't write them) whether
>any of the packages I imported before extending sys.path are virtual
>packages or not, I would have to do this every time I extend
>sys.path. And so it becomes a burden on writing programs.
>
>If the code is so boilerplate as you describe, should sys.path
>become an object that acts like a list, instead of a list, and have
>its append method automatically do the pkgutil.extend_virtual_paths
>for me? Then I wouldn't have to worry about whether any of the
>packages I imported were virtual packages or not.
Well, then we'd have to worry about other mutation methods, and
things like 'sys.path = [blah, blah]', as well. So if we're going to
ditch the need for extend_virtual_paths(), we should probably do it
via the absence of __path__ attributes.
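
A small sketch of why a list-like sys.path with a hooked append() is not enough. The HookedPath class and the fake_sys namespace are hypothetical and exist only for this example; the callback stands in for whatever refresh the hook would trigger.

```python
import types


class HookedPath(list):
    """List subclass that runs a callback when append() is called."""

    def __init__(self, entries=(), on_change=None):
        super().__init__(entries)
        self._on_change = on_change or (lambda: None)

    def append(self, entry):
        super().append(entry)
        self._on_change()


calls = []
fake_sys = types.SimpleNamespace(
    path=HookedPath(["/a"], on_change=lambda: calls.append("refresh")))

fake_sys.path.append("/b")      # hook fires
fake_sys.path.extend(["/c"])    # bypasses the hook
fake_sys.path.insert(0, "/d")   # bypasses the hook
fake_sys.path[0:0] = ["/e"]     # slice assignment bypasses it too
hook_count = len(calls)

# Rebinding replaces the object entirely, so the hook is gone for good.
fake_sys.path = ["/x", "/y"]
rebound_type = type(fake_sys.path)
```

Every mutating method would need the same override, and nothing on the list itself can intercept the rebinding in the last step.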