PyWart: "Python's import statement and the history of external dependencies"
rosuav at gmail.com
Sat Nov 22 14:00:34 CET 2014
On Sat, Nov 22, 2014 at 11:25 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> Ian Kelly wrote:
> - It's hard to keep track of what modules are in the standard library. Which
> of the following is *not* in Python 3.3's std lib? (No cheating by looking
> them up.)
> os2emxpath, wave, sndheader, statslib, poplist, plist,
> pickletools, picklelib, path, cgi, cgitb, copylib, xpath
Okay, here's my guesses.
os2emxpath: In the stdlib, but more often accessed as "os.path" while
running under OS/2
wave: Not in the stdlib, though I'd avoid the name anyway.
sndheader: Not in the stdlib - probably on PyPI though
poplist, plist, pickletools, picklelib: I suspect PyPI, not stdlib,
but could be wrong
path: Not in the stdlib (there's os.path and I doubt there'd be both)
cgi, cgitb: In the stdlib
copylib: No idea, could be either way.
xpath: I'll guess this as not being present.
I'm probably pretty wrong, though.
>>> # Contrary to popular belief, sys.path is *NOT* a module, #
>>> # no, it's a global! #
>> I really doubt that this is a popular belief.
> I'm not aware of anyone who believes that sys.path is a module.
> But yes, sys.path is not just global, but process-wide global. *All* modules
> share the same sys.path.
Even leaving aside Rick's sloppy language, I still doubt that it's a
popular belief that sys.path is module-specific. You're modifying
something in a different module, and Python's always maintained that
two instances of "import sys" will give two references to the exact
same module object.
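A quick way to check that claim, as a minimal sketch (the path string is made up and never has to exist on disk):

```python
import sys
import importlib

# A second import of "sys" (done via importlib here, standing in for an
# "import sys" in some other module) returns the very same module object.
sys_again = importlib.import_module("sys")
same_module = sys_again is sys

# So a change made through one reference is visible through the other.
marker = "/hypothetical/extra/path"  # made-up directory name
sys.path.append(marker)
visible_elsewhere = marker in sys_again.path
sys.path.remove(marker)  # leave sys.path as we found it

print(same_module, visible_elsewhere)
```

Both values should come out true, which is all "process-wide" means here: there is one sys module, hence one sys.path list.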
> That would be horrible. But here's an alternative which is less horrible and
> maybe even useful.
> There's still a single process-wide search path, but there's a second
> per-module search path which is searched first. By default it's empty.
> So a module can define its own extra search path:
> __path__ = ['look/here', 'and/here']
> import something
> without affecting any other modules.
That's what Rick said first, and then said that if you're going to be
explicit, you should do the job properly and not have _any_ implicit
paths.
Thing is, though, it still breaks the sys.modules concept. Either
__path__ is ignored if the module was found in sys.modules, or it's
possible to have multiple entries with the same name (which would make
it hard to have a module replace itself in sys.modules, currently a
supported thing). Although I suppose all it'd require is that
sys.modules be keyed by __file__ rather than __name__, so they're
identified by fully qualified path and file name. (What does that do
in the face of .pyc files?)
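To illustrate the point that sys.modules is keyed by name alone, here is a small sketch. The module name "hypothetical_calendar" and its "origin" attribute are invented for the demo; no such file needs to exist anywhere.

```python
import sys
import types

# Plant a hand-made module under a hypothetical name. No file with this
# name needs to exist anywhere on sys.path.
fake = types.ModuleType("hypothetical_calendar")
fake.origin = "planted by hand"
sys.modules["hypothetical_calendar"] = fake

# The import statement consults sys.modules first, keyed by name only,
# so it hands back the planted object without searching any path.
import hypothetical_calendar

found_in_cache = hypothetical_calendar is fake

# Clean up so the demo leaves sys.modules as it found it.
del sys.modules["hypothetical_calendar"]
print(found_in_cache)
```

Whichever object gets registered under a name first wins, no matter what search path a later importer would have preferred. The same mechanism is what lets a module replace itself in sys.modules.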
>> And after all that, it would still fail if you happened to want to
>> import both "calendar" modules into the same module.
> __path__ = []
> import calendar
> __path__ = ['my/python/modules']
> import calendar as mycalendar
Frankly, if you actually want this, I think it's time to turn to an
uglier-but-more-flexible method, like poking around in importlib. (I'm
not sure off-hand how you'd go about it, it's not instantly obvious
from help(importlib).) I'm more concerned about the possibility of
your import succeeding or failing depending on the order of other imports:
__path__ = ['my/python/modules']
How's that one to be resolved? That's what I don't like.
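For the "both calendars in one module" case, one possible importlib route is sketched below. It assumes Python 3.5 or later (for importlib.util.module_from_spec); the temporary directory stands in for 'my/python/modules', and the file contents and the name "mycalendar" are made up for the demo.

```python
import calendar  # the standard library module, found via sys.path
import importlib.util
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for 'my/python/modules/calendar.py'; contents are made up.
    path = os.path.join(tmp, "calendar.py")
    with open(path, "w") as f:
        f.write("WHO = 'my own calendar'\n")

    # Load the file under a distinct name so it does not collide with
    # the stdlib entry that sys.modules holds under "calendar".
    spec = importlib.util.spec_from_file_location("mycalendar", path)
    mycalendar = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mycalendar)

print(mycalendar.WHO, hasattr(calendar, "monthrange"))
```

Loading by explicit file location sidesteps the search path entirely, so the result does not depend on who imported what first. The module is not registered in sys.modules here, so a later plain "import mycalendar" would not find it.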
So long as sys.modules is (a) process-wide and (b) keyed by module
name rather than file name, sys.path MUST be process-wide too, and
MUST be set on startup, or as soon as possible afterwards. Any module
imported prior to altering sys.path will be fetched based on the
previous search path - and you have to import sys to change sys.path,
which means the minimum set of unalterable modules is, on Python 3.5:
rosuav@sikorsky:~$ cat showmods.py
rosuav@sikorsky:~$ python3 showmods.py
__main__, _codecs, _collections_abc, _frozen_importlib, _imp, _io,
_signal, _sitebuiltins, _stat, _sysconfigdata, _thread, _warnings,
_weakref, _weakrefset, abc, builtins, codecs, encodings,
encodings.aliases, encodings.latin_1, encodings.utf_8, errno,
genericpath, io, marshal, os, os.path, posix, posixpath, site, stat,
sys, sysconfig, zipimport
... that's a decent lot of modules you can't fiddle with. Hence
PYTHONPATH, which presumably is processed by the interpreter prior to
loading any modules.
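The body of showmods.py is not shown in the transcript above. As a guess at its shape, and assuming it did nothing more than dump sys.modules, a script along these lines would produce a listing like that one:

```python
import sys

# Hypothetical reconstruction: list every module already loaded by the
# time this script starts running, sorted by name.
loaded = sorted(sys.modules)
print(", ".join(loaded))
```

The exact list varies with Python version, platform and startup options, so it need not match the 3.5 output quoted above.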