[Distutils] nspkg.pth files break $PYTHONPATH overrides
Donald Stufft
donald at stufft.io
Mon Mar 24 22:53:52 CET 2014
See also https://github.com/pypa/pip/issues/3
Basically, prior to PEP 420, namespace packages were bad, and using them
results in pain sooner or later :( I’m not sure of a good solution yet; perhaps
we can backport PEP 420 to PyPI and have namespace packages depend
on that?
On Mar 24, 2014, at 5:48 PM, Barry Warsaw <barry at python.org> wrote:
> Apologies for cross-posting, but this intersects setuptools and the import
> system, and I wanted to be sure it reached the right audience.
>
> A colleague asked me why a seemingly innocent and common use case for
> developing local versions of system installed packages wasn't working, and I
> was quite perplexed. As I dug into the problem, more questions than answers
> came up. I finally (I think!) figured out what is happening, but not so much
> as to why, or what can/should be done about it.
>
> This person had a local checkout of a package's source, where the package was
> also installed into the system Python. He wanted to be able to set
> $PYTHONPATH so that the local package wins when he tries to import it. E.g.:
>
> % PYTHONPATH=`pwd`/src python3
>
> but this didn't work because despite the setting of PYTHONPATH, the system
> version of the package was always found first. The package in question is
> lazr.uri, although other packages with similar layouts will also suffer the
> same problem, which prevents an easy local development of a newer version of
> the package, aside from being a complete head-scratcher.
>
> The lazr.uri package is intended to be a submodule of the lazr namespace
> package. As such, the lazr/__init__.py has the old style way of declaring a
> namespace package:
>
>    try:
>        import pkg_resources
>        pkg_resources.declare_namespace(__name__)
>    except ImportError:
>        import pkgutil
>        __path__ = pkgutil.extend_path(__path__, __name__)
>
> and its setup.py declares a namespace package:
>
>    setup(
>        name='lazr.uri',
>        version=__version__,
>        namespace_packages=['lazr'],
>        ...
>
> One of the things that the Debian "helper" program does when it builds a
> package for the archive is call `$python setup.py install_egg_info`. It's
> this command that breaks $PYTHONPATH overriding.
>
> install_egg_info looks at the lazr.uri.egg-info/namespace_packages.txt file,
> in which it finds the string 'lazr', and it proceeds to write a
> lazr-uri-1.0.3-py3.4-nspkg.pth file. This causes other strange and unexpected
> things to happen:
>
> % python3
> Python 3.4.0 (default, Mar 22 2014, 22:51:25)
> [GCC 4.8.2] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys
> >>> sys.modules['lazr']
> <module 'lazr'>
> >>> sys.modules['lazr'].__path__
> ['/usr/lib/python3/dist-packages/lazr']
>
> It's completely weird that sys.modules would contain a key for 'lazr' when
> that package was never explicitly imported. Even stranger, because a fake
> module object is stuffed into sys.modules via the .pth file, tracing imports
> with -v gives you no clue as to what's happening. And while
> sys.modules['lazr'] has an __path__, it has no other attributes.
>
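One way to spot such hand-made entries is sketched below. The helper name `find_stub_packages` is made up for illustration, and the heuristic is an assumption rather than any official API: a module that has a `__path__` but whose `__spec__` is None probably did not come through the normal import machinery.

```python
import sys


def find_stub_packages(modules=None):
    """Return sorted names of package-like modules that have no import spec.

    Heuristic only (an assumption): a __path__ without a __spec__ suggests
    the module object was created by hand, e.g. by a line in a .pth file.
    """
    modules = sys.modules if modules is None else modules
    stubs = []
    # Copy the items so that changes to sys.modules cannot disturb the loop
    for name, module in list(modules.items()):
        if module is None:
            continue
        has_path = hasattr(module, "__path__")
        has_spec = getattr(module, "__spec__", None) is not None
        if has_path and not has_spec:
            stubs.append(name)
    return sorted(stubs)
```

Calling `find_stub_packages()` right after interpreter startup, before importing anything yourself, should list 'lazr' on a system set up as described above.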
> I really don't understand what the purpose of the nspkg.pth file is,
> especially for Python 3 namespace packages.
>
> Here's what the nspkg.pth file contains:
>
> import sys,types,os; p = os.path.join(sys._getframe(1).f_locals['sitedir'], *('lazr',)); ie = os.path.exists(os.path.join(p,'__init__.py')); m = not ie and sys.modules.setdefault('lazr',types.ModuleType('lazr')); mp = (m or []) and m.__dict__.setdefault('__path__',[]); (p not in mp) and mp.append(p)
>
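For anyone squinting at that one-liner, here is a rough readable equivalent, as a sketch only. The function name `register_namespace_stub` and its parameters are invented for illustration. In the real .pth file, as far as I can tell, `sitedir` is not passed in but fished out of the calling frame's locals (the frame of site.py's addpackage) via sys._getframe(1).

```python
import os
import sys
import types


def register_namespace_stub(sitedir, name="lazr"):
    """Readable sketch of what the one-line nspkg.pth payload does."""
    pkg_dir = os.path.join(sitedir, name)

    # If a real __init__.py exists, the one-liner has no lasting effect
    if os.path.exists(os.path.join(pkg_dir, "__init__.py")):
        return None

    # Otherwise reuse or create a bare module object in sys.modules ...
    module = sys.modules.setdefault(name, types.ModuleType(name))

    # ... and make sure its __path__ contains this site directory's copy
    search_path = module.__dict__.setdefault("__path__", [])
    if pkg_dir not in search_path:
        search_path.append(pkg_dir)
    return module
```

Note that nothing here goes through the import system, which would explain why -v shows nothing and why the module has a __path__ but no __file__.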
> The __path__ value is important here because even though you've never
> explicitly imported 'lazr', when you *do* explicitly import 'lazr.uri', the
> existing lazr module object's __path__ takes over, and thus the system
> lazr.uri package is found even though both lazr/ and lazr/uri/ should have
> been found earlier on sys.path (yes, sys.path looks exactly as expected).
>
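The effect described here can be reproduced without Debian or setuptools. The sketch below builds two throwaway trees, puts the "dev" one first on sys.path, then plants a bare parent module whose __path__ points at the "system" one. All names (`make_portion`, `import_with_seeded_parent`, the `nsdemo_seeded` package) are hypothetical stand-ins for lazr and lazr.uri.

```python
import importlib
import os
import sys
import tempfile
import types


def make_portion(root, package, origin):
    """Create root/<package>/uri/__init__.py recording which tree it is in."""
    sub_dir = os.path.join(root, package, "uri")
    os.makedirs(sub_dir)
    with open(os.path.join(sub_dir, "__init__.py"), "w") as fh:
        fh.write("ORIGIN = %r\n" % origin)
    return os.path.join(root, package)


def import_with_seeded_parent(dev_root, system_root, package="nsdemo_seeded"):
    """Import <package>.uri with a hand-made parent already in sys.modules."""
    system_pkg = make_portion(system_root, package, "system")
    make_portion(dev_root, package, "dev")

    # Dev tree first on sys.path, as with PYTHONPATH=`pwd`/src
    sys.path[:0] = [dev_root, system_root]

    # Mimic the .pth file: bare parent module whose __path__ is the system copy
    stub = types.ModuleType(package)
    stub.__path__ = [system_pkg]
    sys.modules[package] = stub
    try:
        importlib.invalidate_caches()
        return importlib.import_module(package + ".uri").ORIGIN
    finally:
        # Undo the global changes so the demo can be rerun
        sys.path.remove(dev_root)
        sys.path.remove(system_root)
        sys.modules.pop(package, None)
        sys.modules.pop(package + ".uri", None)


with tempfile.TemporaryDirectory() as dev, tempfile.TemporaryDirectory() as system:
    print(import_with_seeded_parent(dev, system))  # system
```

The submodule lookup consults the parent's __path__ and never looks at sys.path, so the order of sys.path is irrelevant once the parent is in sys.modules.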
> So the presence of the nspkg.pth file breaks $PYTHONPATH overriding. That
> seems bad. ;)
>
> If you delete the nspkg.pth file, then things work as expected, but even this
> is a little misleading!
>
> I think the Debian helper is running install_egg_info as a way to determine
> what namespace packages are defined, so that it can actually *remove* the
> parent's __init__.py file and use PEP 420 style namespace packages. In fact,
> in the Debian python3-lazr.uri binary package, you find no system
> lazr/__init__.py file. This is why removing the nspkg.pth file works.
>
> So I thought, why not conditionally define setup(..., namespace_packages) only
> for Python 2? This doesn't work because the Debian helper will see that no
> namespace packages are defined, and thus it will leave the original
> lazr/__init__.py file in place. This then breaks $PYTHONPATH overriding too
> because the __path__ extension in the pre-PEP 420 code only *appends* the local
> development path. IOW, the system import path is the first element of a
> 2-element list on lazr.__path__. While the local import path is the second
> element, in this case too the local import fails.
>
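The append-only behaviour can be seen with pkgutil.extend_path on its own. This is a sketch under a stated assumption: that the system copy's __init__.py is the one being executed, so __path__ starts out as just the system directory. The helper names and the `nsdemo_extend` package are invented, and pkg_resources.declare_namespace is not exercised here.

```python
import importlib
import os
import pkgutil
import sys
import tempfile


def make_old_style_package(root, package):
    """Create root/<package>/__init__.py (an empty pre-PEP 420 package)."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    return pkg_dir


def extended_path(dev_root, system_root, package="nsdemo_extend"):
    """Return what extend_path yields when starting from the system copy."""
    system_pkg = make_old_style_package(system_root, package)
    make_old_style_package(dev_root, package)

    # Dev tree first on sys.path, as with PYTHONPATH=`pwd`/src
    sys.path[:0] = [dev_root, system_root]
    try:
        importlib.invalidate_caches()
        # Assumption: __path__ initially holds only the system directory
        return pkgutil.extend_path([system_pkg], package)
    finally:
        sys.path.remove(dev_root)
        sys.path.remove(system_root)


with tempfile.TemporaryDirectory() as dev, tempfile.TemporaryDirectory() as system:
    for entry in extended_path(dev, system):
        print(entry)
```

The system directory stays at index 0 and the dev directory is tacked on after it, even though the dev tree is earlier on sys.path.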
> It seems like what you want for Python 3 (and we're talking >= 3.2 here) is
> for there to be neither a nspkg.pth file, nor the lazr/__init__.py file, and
> let PEP 420 do its thing. In fact if you set things up this way, $PYTHONPATH
> overriding works exactly as expected.
>
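A small sketch of that clean setup, assuming an interpreter with PEP 420 support: two trees, neither with a parent __init__.py and no stub in sys.modules. The names (`make_portion`, `import_as_pep420`, `nsdemo_pep420`) are made up for the demo.

```python
import importlib
import os
import sys
import tempfile


def make_portion(root, package, origin):
    """Create root/<package>/uri/__init__.py with no parent __init__.py."""
    sub_dir = os.path.join(root, package, "uri")
    os.makedirs(sub_dir)
    with open(os.path.join(sub_dir, "__init__.py"), "w") as fh:
        fh.write("ORIGIN = %r\n" % origin)
    return os.path.join(root, package)


def import_as_pep420(dev_root, system_root, package="nsdemo_pep420"):
    """Import <package>.uri and return (ORIGIN, parent __path__ as a list)."""
    make_portion(system_root, package, "system")
    make_portion(dev_root, package, "dev")

    # Dev tree first on sys.path, as with PYTHONPATH=`pwd`/src
    sys.path[:0] = [dev_root, system_root]
    try:
        importlib.invalidate_caches()
        module = importlib.import_module(package + ".uri")
        return module.ORIGIN, list(sys.modules[package].__path__)
    finally:
        # Undo the global changes so the demo can be rerun
        sys.path.remove(dev_root)
        sys.path.remove(system_root)
        sys.modules.pop(package, None)
        sys.modules.pop(package + ".uri", None)


with tempfile.TemporaryDirectory() as dev, tempfile.TemporaryDirectory() as system:
    origin, parent_path = import_as_pep420(dev, system)
    print(origin)  # dev
```

Here the parent's __path__ is computed from sys.path in order, so the dev portion comes first and wins.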
> Because I don't know why install_egg_info is installing the nspkg.pth file, I
> don't know which component needs to be changed:
>
> * Change setuptools install_egg_info command to not install an nspkg.pth file
>   even for packages that declare namespace_packages, at least under Python 3.
>   This behavior seems pretty nasty all by itself because it magically and
>   untraceably installs stripped down module objects in sys.modules when
>   Python first scans the import path.
>
> * Change the Debian helper to remove the nspkg.pth file, or not call
>   install_egg_info *and* continue to remove <nspkg>/__init__.py in Python 3
>   so as to take advantage of PEP 420. It's nice to know that PEP 420
>   actually represents something sane. :)
>
> For added bonus, we have this additional oddity:
>
> % PYTHONPATH=`pwd`/src python3
> Python 3.4.0 (default, Mar 22 2014, 22:51:25)
> [GCC 4.8.2] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys
> >>> sys.modules['lazr']
> <module 'lazr'>
> >>> sys.modules['lazr'].__path__
> ['/usr/lib/python3/dist-packages/lazr']
> >>> import lazr.uri
> >>> lazr.uri.__file__
> '/usr/lib/python3/dist-packages/lazr/uri/__init__.py'
> >>> sys.modules['lazr']
> <module 'lazr' from '/home/barry/projects/ubuntu/lazruri/trusty/src/lazr/__init__.py'>
> >>> sys.modules['lazr'].__path__
> ['/home/barry/projects/ubuntu/lazruri/trusty/src/lazr', '/usr/lib/python3/dist-packages/lazr']
>
>
> Notice how importing lazr.uri *replaces* sys.modules['lazr'] with the local
> development one, even though it still imports lazr.uri from the system path.
> I'm not exactly sure how this happens, but I've traced that to
> _LoaderBasics.exec_module()'s call of _call_with_frames_removed(), which
> exec's lazr.uri's code object into that module's __dict__. Nothing in
> lazr/uri/__init__.py should be doing that, afaict from both visual inspection
> of the code and disassembling the compiled code object.
>
> Hopefully I've explained the situation correctly and lucidly. Below I'll
> describe how to set up a reproducible environment on a Debian machine.
> Thoughts and comments are welcome!
>
> Cheers,
> -Barry
>
> % sudo apt-get install python3-lazr.uri
> % cd tmp
> % bzr branch lp:lazr.uri trunk
> % cd trunk
> % PYTHONPATH=`pwd`/src python3
> (Then try things at the Python prompt from above.)
-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA