Does anyone understand what's going on with libpython on Linux?
So we found another variation between how different distros build
CPython [1], and I'm very confused.
Fedora (for example) turns out to work the way I naively expected:
taking py27 as our example, they have:
- libpython2.7.so.1.0 contains the actual python runtime
- /usr/bin/python2.7 is a tiny (~7 KiB) executable that links to
libpython2.7.so.1 to do the actual work; the main python package
depends on the libpython package
- python extension module packages depend on the libpython package,
and contain extension modules linked against libpython2.7.so.1
- python extension modules compiled locally get linked against
libpython2.7.so.1 by default
Debian/Ubuntu do things differently:
- libpython2.7.so.1.0 exists and contains the full python runtime, but
is not installed by default
- /usr/bin/python2.7 *also* contains a *second* copy of the full
python runtime; there is no dependency relationship between these, and
you don't even get libpython2.7.so.1.0 installed unless you explicitly
request it or it gets pulled in through some other dependency
- most python extension module packages do *not* depend on the
libpython2.7 package, and contain extension modules that are *not*
linked against libpython2.7.so.1.0 (but there are exceptions!)
- python extension modules compiled locally do *not* get linked
against libpython2.7.so.1 by default.
The only things that seem to link against libpython2.7.so.1.0 in debian are:
a) other packages that embed python (e.g. gnucash, paraview, perf, ...)
b) some minority of python packages (e.g. the PySide/QtOpenGL.so
module is one that I found that directly links to libpython2.7.so.1.0)
I guess that the reason this works is that according to ELF linking
rules, the symbols defined in the main executable, or in the
transitive closure of the libraries that the main executable is linked
to via DT_NEEDED entries, are all injected into the global scope of
any dlopen'ed libraries.
Uh, let me try saying that again.
When you dlopen() a library -- like, for example, a python extension
module -- then the extension automatically gets access to any symbols
that are exported from either (a) the main executable itself, or (b)
any of the libraries that are listed if you run 'ldd <the main
executable>'. It also gets access to any symbols that are exported by
itself, or any of the libraries listed if you run 'ldd <the dlopen'ed
library>'. OTOH it does *not* get access to any symbols exported by other
libraries that get dlopen'ed -- each dlopen specifically creates its own
"scope". So the reason this works is that Debian's /usr/bin/python2.7
itself exports all the standard Python C ABI symbols, so any extension
module that it loads automatically gets access to the CPython ABI, even if
it doesn't explicitly link to libpython. And programs like gnucash are
linked directly to libpython2.7.so.1, so they also end up exporting the
CPython ABI to any libraries that they dlopen.
But, it seems to me that there are two problems with the Debian/Ubuntu way
of doing things:
1) it's rather wasteful of space, since there are two complete independent
copies of the whole CPython runtime (one inside /usr/bin/python2.7, the
other inside libpython2.7.so.1).
2) if I ever embed cpython by doing dlopen("libpython2.7.so.1"), or
dlopen("some_plugin_library_linked_to_libpython.so"), then the embedded
cpython will not be able to load python extensions that are compiled in
the Debian-style (but will be able to load python extensions compiled in
the Fedora-style), because the dlopen() that loaded the python runtime and
the dlopen() that loads the extension module create two different scopes
that can't see each other's symbols. [I'm pretty sure this is right, but
linking is arcane and probably I should write some tests to double check.]
I guess (2) might be why some of Debian's extension modules do link to
libpython2.7.so.1 directly? Or maybe that's just a bug?
Is there any positive reason in favor of the Debian style approach?
Clearly someone put some work into setting things up this way, so there
must be some motivation, but I'm not sure what it is?
The immediate problem for us is that if a manylinux1 wheel links to
libpythonX.Y.so (Fedora-style), and then it gets run on a Debian system
that doesn't have libpythonX.Y.so installed, it will crash with:
ImportError: libpython2.7.so.1.0: cannot open shared object file: No such
file or directory
Maybe this is okay and the solution is to tell people that they need to
'apt install libpython2.7'. In a sense this isn't even a regression,
because every system that is capable of installing a binary extension from
an sdist has python2.7-dev installed, which depends on libpython2.7 -->
therefore every system that used to be able to do 'pip install somebinary'
with sdists will still be able to do it with manylinux1 builds.
The alternative is to declare that manylinux1 extensions should not link
to libpython. This should I believe work fine on both Debian-style and
Fedora-style installations -- though the PySide example, and the
theoretical issue with embedding python through dlopen, both give me some
pause.
Two more questions:
- What are Debian/Ubuntu doing in distutils so that extensions don't link
to libpython by default? If we do go with the option of saying that
manylinux extensions shouldn't link to libpython, then that's something
auditwheel *can* fix up, but it'd be even nicer if we could set up the
docker image to get it right in the first place.
- Can/should Debian/Ubuntu switch to the Fedora model? Obviously it would
take quite some time before a generic platform like manylinux could assume
that this had happened, but it does seem better to me...? And if it's
going to happen at all it might be nice to get the switch into 16.04 LTS?
Of course that's probably ambitious, even if I'm not missing some reason
why the Debian/Ubuntu model is actually advantageous.
-n
[1] https://github.com/pypa/manylinux/issues/30
What are Debian/Ubuntu doing in distutils so that extensions don't link to libpython by default?
I don't know exactly, but one way to reproduce this is simply to build the
interpreter without `--enable-shared`.
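For illustration, a quick way to check which kind of build the running interpreter is from Python itself -- a minimal sketch using standard sysconfig variables, which should work on both styles of build:

```python
import sysconfig

# 1 on --enable-shared (Fedora-style) builds, 0 on static (Debian-style) ones
print(sysconfig.get_config_var("Py_ENABLE_SHARED"))

# The library the build links against, e.g. 'libpython2.7.so' vs 'libpython2.7.a'
print(sysconfig.get_config_var("LDLIBRARY"))
```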
I don't know what their reasons are, but I presume that the Debian
maintainers have a well-considered reason for this design.
The PEP 513 text currently says that it's permissible for manylinux1 wheels
to link against libpythonX.Y.so. So presumably for a platform to be
manylinux1-compatible, libpythonX.Y.so should be available. I guess my
preference would be for pip to simply check whether
libpythonX.Y.so is available in its platform detection code
(pypa/pip/pull/3446).
Because Debian/Ubuntu is such a big target, instead of just bailing out and
forcing the user to install the sdist from PyPI (which is going to fail,
because Debian installations that lack libpythonX.Y.so also lack Python.h),
I would be +1 for adding some kind of message for this case that says,
"maybe you should `sudo apt-get install python-dev` to get these fancy new
wheels rolling."
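The availability check could look something like this sketch (the names here are illustrative, not pip's actual platform-detection code; note that `ctypes.util.find_library` relies on ldconfig and may return None even when the library exists):

```python
import ctypes.util
import sysconfig

# Which libpythonX.Y would Fedora-style wheels need? LDVERSION includes the
# ABI flags (e.g. '2.7', '3.5m') where they apply; fall back to VERSION.
ldversion = (sysconfig.get_config_var("LDVERSION")
             or sysconfig.get_config_var("VERSION"))
found = ctypes.util.find_library("python" + str(ldversion))
print(found)  # e.g. 'libpython2.7.so.1.0' where installed, None on bare Debian
```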
-Robert
On Sun, Feb 7, 2016 at 12:01 AM, Nathaniel Smith
So we found another variation between how different distros build CPython [1], and I'm very confused.
Fedora (for example) turns out to work the way I naively expected: taking py27 as our example, they have:
- libpython2.7.so.1.0 contains the actual python runtime
- /usr/bin/python2.7 is a tiny (~7 KiB) executable that links to libpython2.7.so.1 to do the actual work; the main python package depends on the libpython package
- python extension module packages depend on the libpython package, and contain extension modules linked against libpython2.7.so.1
- python extension modules compiled locally get linked against libpython2.7.so.1 by default
Debian/Ubuntu do things differently:
- libpython2.7.so.1.0 exists and contains the full python runtime, but is not installed by default
- /usr/bin/python2.7 *also* contains a *second* copy of the full python runtime; there is no dependency relationship between these, and you don't even get libpython2.7.so.1.0 installed unless you explicitly request it or it gets pulled in through some other dependency
- most python extension module packages do *not* depend on the libpython2.7 package, and contain extension modules that are *not* linked against libpython2.7.so.1.0 (but there are exceptions!)
- python extension modules compiled locally do *not* get linked against libpython2.7.so.1 by default.
The only things that seem to link against libpython2.7.so.1.0 in debian are:
a) other packages that embed python (e.g. gnucash, paraview, perf, ...)
b) some minority of python packages (e.g. the PySide/QtOpenGL.so module is one that I found that directly links to libpython2.7.so.1.0)
I guess that the reason this works is that according to ELF linking rules, the symbols defined in the main executable, or in the transitive closure of the libraries that the main executable is linked to via DT_NEEDED entries, are all injected into the global scope of any dlopen'ed libraries.
Uh, let me try saying that again.
When you dlopen() a library -- like, for example, a python extension module -- then the extension automatically gets access to any symbols that are exported from either (a) the main executable itself, or (b) any of the libraries that are listed if you run 'ldd <the main executable>'. It also gets access to any symbols that are exported by itself, or any of the libraries listed if you run 'ldd <the dlopen'ed library>'. OTOH it does *not* get access to any symbols exported by other libraries that get dlopen'ed -- each dlopen specifically creates its own "scope". So the reason this works is that Debian's /usr/bin/python2.7 itself exports all the standard Python C ABI symbols, so any extension module that it loads automatically gets access to the CPython ABI, even if it doesn't explicitly link to libpython. And programs like gnucash are linked directly to libpython2.7.so.1, so they also end up exporting the CPython ABI to any libraries that they dlopen.
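That lookup is easy to poke at from Python itself. In this sketch (Linux/ELF assumed), dlopen(NULL) hands back the process's global scope, and resolving a CPython ABI symbol through it is essentially what an extension module with an unresolved PyObject_* reference does at load time:

```python
import ctypes

# dlopen(NULL): the main executable plus its DT_NEEDED closure. On both
# Debian-style builds (symbols baked into /usr/bin/pythonX.Y) and
# Fedora-style builds (symbols via libpythonX.Y.so.1), the CPython ABI
# is visible here.
global_scope = ctypes.PyDLL(None)
global_scope.Py_GetVersion.restype = ctypes.c_char_p
version = global_scope.Py_GetVersion().decode()
print(version)  # the string sys.version is built from
```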
But, it seems to me that there are two problems with the Debian/Ubuntu way of doing things:
1) it's rather wasteful of space, since there are two complete independent copies of the whole CPython runtime (one inside /usr/bin/python2.7, the other inside libpython2.7.so.1).
2) if I ever embed cpython by doing dlopen("libpython2.7.so.1"), or dlopen("some_plugin_library_linked_to_libpython.so"), then the embedded cpython will not be able to load python extensions that are compiled in the Debian-style (but will be able to load python extensions compiled in the Fedora-style), because the dlopen() that loaded the python runtime and the dlopen() that loads the extension module create two different scopes that can't see each other's symbols. [I'm pretty sure this is right, but linking is arcane and probably I should write some tests to double check.]
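Point (2) -- each dlopen() getting its own scope -- can be sketched with ctypes, using libz as a stand-in for libpython (assumes a Linux system with libz.so.1; whether the second lookup succeeds depends on whether something already put libz into the global scope, e.g. the executable's own DT_NEEDED entries):

```python
import ctypes

# Default dlopen mode is RTLD_LOCAL: libz's symbols live in their own scope.
z = ctypes.CDLL("libz.so.1")
z.zlibVersion.restype = ctypes.c_char_p
print(z.zlibVersion())  # resolves fine through the explicit handle

# dlopen(NULL) searches only the global scope, so an RTLD_LOCAL load does
# not make zlibVersion visible there:
main = ctypes.CDLL(None)
if hasattr(main, "zlibVersion"):
    print("zlibVersion is in the global scope (something else put it there)")
else:
    print("zlibVersion stayed confined to its own dlopen scope")
```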
I guess (2) might be why some of Debian's extension modules do link to libpython2.7.so.1 directly? Or maybe that's just a bug?
Is there any positive reason in favor of the Debian style approach? Clearly someone put some work into setting things up this way, so there must be some motivation, but I'm not sure what it is?
The immediate problem for us is that if a manylinux1 wheel links to libpythonX.Y.so (Fedora-style), and then it gets run on a Debian system that doesn't have libpythonX.Y.so installed, it will crash with:
ImportError: libpython2.7.so.1.0: cannot open shared object file: No such file or directory
Maybe this is okay and the solution is to tell people that they need to 'apt install libpython2.7'. In a sense this isn't even a regression, because every system that is capable of installing a binary extension from an sdist has python2.7-dev installed, which depends on libpython2.7 --> therefore every system that used to be able to do 'pip install somebinary' with sdists will still be able to do it with manylinux1 builds.
The alternative is to declare that manylinux1 extensions should not link to libpython. This should I believe work fine on both Debian-style and Fedora-style installations -- though the PySide example, and the theoretical issue with embedding python through dlopen, both give me some pause.
Two more questions:
- What are Debian/Ubuntu doing in distutils so that extensions don't link to libpython by default? If we do go with the option of saying that manylinux extensions shouldn't link to libpython, then that's something auditwheel *can* fix up, but it'd be even nicer if we could set up the docker image to get it right in the first place.
- Can/should Debian/Ubuntu switch to the Fedora model? Obviously it would take quite some time before a generic platform like manylinux could assume that this had happened, but it does seem better to me...? And if it's going to happen at all it might be nice to get the switch into 16.04 LTS? Of course that's probably ambitious, even if I'm not missing some reason why the Debian/Ubuntu model is actually advantageous.
-n
[1] https://github.com/pypa/manylinux/issues/30
--
Nathaniel J. Smith -- https://vorpus.org
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig
-- -Robert
On Sun, 7 Feb 2016 00:25:57 -0800 "Robert T. McGibbon" wrote:
What are Debian/Ubuntu doing in distutils so that extensions don't link to libpython by default?
I don't know exactly, but one way to reproduce this is simply to build the interpreter without `--enable-shared`.
See https://bugs.python.org/issue21536. It would be nice if you could lobby for this issue to be resolved... (though that would only be for 3.6, presumably)
I don't know what their reasons are, but I presume that the Debian maintainers have a well-considered reason for this design.
Actually, shared library builds can be noticeably slower. I did measurements some time ago, and the results are:
- shared builds are 5-10% slower on x86
- they can be up to 30% slower on some ARM CPUs!
(this is on pystone which is a very crude benchmark, but in this case I think the pattern is more general, since any function call internal to Python is affected by the difference in code generation: shared library builds add an indirection overhead when resolving non-static symbols)
Note btw. that Anaconda builds are also shared library builds.
Regards
Antoine.
On 07.02.2016 13:38, Antoine Pitrou wrote:
On Sun, 7 Feb 2016 00:25:57 -0800 "Robert T. McGibbon" wrote:
What are Debian/Ubuntu doing in distutils so that extensions don't link to libpython by default?
I don't know exactly, but one way to reproduce this is simply to build the interpreter without `--enable-shared`.
See https://bugs.python.org/issue21536. It would be nice if you could lobby for this issue to be resolved... (though that would only be for 3.6, presumably)
I don't know what their reasons are, but I presume that the Debian maintainers have a well-considered reason for this design.
Actually, shared library builds can be noticeably slower. I did measurements some time ago, and the results are:
- shared builds are 5-10% slower on x86
- they can be up to 30% slower on some ARM CPUs!
yes, that's the reason why the python executable is built statically against the libpython library.
Matthias
On Sun, Feb 7, 2016 at 4:38 AM, Antoine Pitrou
On Sun, 7 Feb 2016 00:25:57 -0800 "Robert T. McGibbon" wrote:
What are Debian/Ubuntu doing in distutils so that extensions don't link to libpython by default?
I don't know exactly, but one way to reproduce this is simply to build the interpreter without `--enable-shared`.
See https://bugs.python.org/issue21536. It would be nice if you could lobby for this issue to be resolved... (though that would only be for 3.6, presumably)
Just to unpack from that issue - and quoting a nice summary by you (Antoine):
"... the -l flag was added in #832799, for a rather complicated case where the interpreter is linked with a library dlopened by an embedding application (I suppose for some kind of plugin system)."
Following the link to https://bugs.python.org/issue832799 - the `-l` flag (and therefore the dependency on libpython) was added at Python 2.3 for the case where an executable A dlopens a library B.so. B.so has an embedded Python interpreter and is linked to libpython. However, when the embedded Python interpreter in B.so loads an extension module mymodule.so, mymodule.so does not inherit a namespace with the libpython symbols already loaded. See https://bugs.python.org/msg18810 .
One option we have then is to remove all DT_NEEDED references to libpython in manylinux wheels. We get instant compatibility for bare Debian / Ubuntu Python installs, at the cost of causing some puzzling crash for the case of: dlopened library with embedded Python interpreter where the embedded Python interpreter imports a manylinux wheel.
On the other hand, presumably this same crash will occur for nearly all Debian-packaged Python extension modules (if it is true that they do not specify a libpython dependency) - so it seems unlikely that this is a common problem.
Cheers,
Matthew
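The DT_NEEDED entries under discussion can be listed with `readelf -d`; for illustration, here is a minimal pure-Python reader along the same lines (a sketch for 64-bit little-endian ELF only; `dt_needed` is a hypothetical helper, not part of auditwheel or any tool mentioned in this thread):

```python
import struct
import sys

PT_LOAD, PT_DYNAMIC = 1, 2
DT_NULL, DT_NEEDED, DT_STRTAB = 0, 1, 5

def dt_needed(path):
    """DT_NEEDED (direct shared-library dependency) names of an ELF64
    little-endian binary -- roughly `readelf -d path | grep NEEDED`."""
    with open(path, "rb") as f:
        data = f.read()
    assert data[:4] == b"\x7fELF", "not an ELF file"
    assert data[4] == 2 and data[5] == 1, "sketch handles 64-bit LE only"
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    loads, dyn = [], None
    for i in range(e_phnum):
        p_type, _flags, p_offset, p_vaddr, _paddr, p_filesz = \
            struct.unpack_from("<IIQQQQ", data, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD:
            loads.append((p_vaddr, p_offset, p_filesz))
        elif p_type == PT_DYNAMIC:
            dyn = (p_offset, p_filesz)
    if dyn is None:  # fully static binary: no dynamic section at all
        return []
    needed, strtab_addr = [], None
    off, end = dyn[0], dyn[0] + dyn[1]
    while off < end:
        tag, val = struct.unpack_from("<QQ", data, off)
        off += 16
        if tag == DT_NULL:
            break
        elif tag == DT_NEEDED:
            needed.append(val)
        elif tag == DT_STRTAB:
            strtab_addr = val
    if strtab_addr is None or not needed:
        return []
    # DT_STRTAB holds a virtual address; map it to a file offset via PT_LOAD
    strtab_off = next(o + (strtab_addr - v) for v, o, sz in loads
                      if v <= strtab_addr < v + sz)
    return [data[strtab_off + n:data.index(b"\x00", strtab_off + n)].decode()
            for n in needed]

print(dt_needed(sys.executable))
```

Running this over a Debian-style extension module would show no libpython entry; a Fedora-style one would list libpython2.7.so.1.0.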
One option we have then is to remove all DT_NEEDED references to libpython in manylinux wheels. We get instant compatibility for bare Debian / Ubuntu Python installs, at the cost of causing some puzzling crash for the case of: dlopened library with embedded Python interpreter where the embedded Python interpreter imports a manylinux wheel.
I don't think this is acceptable, since it's going to break some packages that depend on dlopen.
On the other hand, presumably this same crash will occur for nearly all Debian-packaged Python extension modules (if it is true that they do not specify a libpython dependency) - so it seems unlikely that this is a common problem.
I don't think so. Debian-packaged extensions that require libpython to exist (a minority of them to be sure, but ones that use complex shared library layouts) just declare a dependency on libpython. For example, python-pyside has a Depends on libpython2.7:
```
$ apt-cache depends python-pyside.qtcore
python-pyside.qtcore
Depends: libc6
Depends: libgcc1
Depends: libpyside1.2
Depends: libpython2.7
Depends: libqtcore4
Depends: libshiboken1.2v5
Depends: libstdc++6
Depends: python
Depends: python
Conflicts: python-pyside.qtcore:i386
```
-Robert
On Sun, Feb 7, 2016 at 2:06 PM, Matthew Brett
On Sun, Feb 7, 2016 at 4:38 AM, Antoine Pitrou wrote:
On Sun, 7 Feb 2016 00:25:57 -0800 "Robert T. McGibbon" wrote:
What are Debian/Ubuntu doing in distutils so that extensions don't link to libpython by default?
I don't know exactly, but one way to reproduce this is simply to build the interpreter without `--enable-shared`.
See https://bugs.python.org/issue21536. It would be nice if you could lobby for this issue to be resolved... (though that would only be for 3.6, presumably)
Just to unpack from that issue - and quoting a nice summary by you (Antoine):
"... the -l flag was added in #832799, for a rather complicated case where the interpreter is linked with a library dlopened by an embedding application (I suppose for some kind of plugin system)."
Following the link to https://bugs.python.org/issue832799 - the `-l` flag (and therefore the dependency on libpython was added at Python 2.3 for the case where an executable A dlopens a library B.so . B.so has an embedded Python interpreter and is linked to libpython. However, when the embedded Python interpreter in B.so loads an extension module mymodule.so , mymodule.so does not inherit a namespace with the libpython symbols already loaded. See https://bugs.python.org/msg18810 .
One option we have then is to remove all DT_NEEDED references to libpython in manylinux wheels. We get instant compatibility for bare Debian / Ubuntu Python installs, at the cost of causing some puzzling crash for the case of: dlopened library with embedded Python interpreter where the embedded Python interpreter imports a manylinux wheel.
On the other hand, presumably this same crash will occur for nearly all Debian-packaged Python extension modules (if it is true that they do not specify a libpython dependency) - so it seems unlikely that this is a common problem.
Cheers,
Matthew
-- -Robert
On Sun, Feb 7, 2016 at 2:19 PM, Robert T. McGibbon
One option we have then is to remove all DT_NEEDED references to libpython in manylinux wheels. We get instant compatibility for bare Debian / Ubuntu Python installs, at the cost of causing some puzzling crash for the case of: dlopened library with embedded Python interpreter where the embedded Python interpreter imports a manylinux wheel.
I don't think this is acceptable, since it's going to break some packages that depend on dlopen.
On the other hand, presumably this same crash will occur for nearly all Debian-packaged Python extension modules (if it is true that they do not specify a libpython dependency) - so it seems unlikely that this is a common problem.
I don't think so. Debian-packaged extensions that require libpython to exist (a minority of them to be sure, but ones that use complex shared library layouts) just declare a dependency on libpython. For example, python-pyside has a Depends on libpython2.7:
```
$ apt-cache depends python-pyside.qtcore
python-pyside.qtcore
Depends: libc6
Depends: libgcc1
Depends: libpyside1.2
Depends: libpython2.7
Depends: libqtcore4
Depends: libshiboken1.2v5
Depends: libstdc++6
Depends: python
Depends: python
Conflicts: python-pyside.qtcore:i386
```
Sure - and this might be because the pyside packager was being especially careful about libpython, or it might be an accident - pyside is hard to build.
On the other hand, it looks like almost all the common Debian packages don't declare this dependency - so almost all of the standard scientific Python stack and more would crash in this corner case:
apt-cache depends python-numpy | grep libpython
apt-cache depends python-scipy | grep libpython
apt-cache depends python-yaml | grep libpython
apt-cache depends python-regex | grep libpython
apt-cache depends python-matplotlib | grep libpython
It seems reasonable to build to the same compatibility level as most Debian packaged modules.
Matthew
On 8 February 2016 at 08:33, Matthew Brett
It seems reasonable to build to the same compatibility level as most Debian packaged modules.
Right, one of the key things to remember with manylinux1 is that it is, *quite deliberately*, only an 80% solution to the cross-distro lack-of-ABI-compatibility problem: we want to solve the simple cases now, and then move on to figuring out how to solve the more complex cases later (and applications that embed their own Python runtimes are a whole world of pain, in more ways than one).
Since we know that extensions built against a statically linked CPython will run correctly against a dynamically linked one, then it probably makes sense to go down that path for the manylinux1 reference build environment.
However, there's one particular test case we should investigate before committing to that path: loading manylinux1 wheels built against a statically linked CPython into a system httpd environment running the system mod_wsgi. If I've understood the problem description correctly, that *should* work, but if it doesn't, then it would represent a significant compatibility concern.
Cheers,
Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sun, Feb 7, 2016 at 7:48 PM, Nick Coghlan
On 8 February 2016 at 08:33, Matthew Brett wrote:
It seems reasonable to build to the same compatibility level as most Debian packaged modules.
Right, one of the key things to remember with manylinux1 is that it is, *quite deliberately*, only an 80% solution to the cross-distro lack-of-ABI-compatibility problem: we want to solve the simple cases now, and then move on to figuring out how to solve the more complex cases later (and applications that embed their own Python runtimes are a whole world of pain, in more ways than one).
Since we know that extensions built against a statically linked CPython will run correctly against a dynamically linked one, then it probably makes sense to go down that path for the manylinux1 reference build environment.
However, there's one particular test case we should investigate before committing to that path: loading manylinux1 wheels built against a statically linked CPython into a system httpd environment running the system mod_wsgi. If I've understood the problem description correctly, that *should* work, but if it doesn't, then it would represent a significant compatibility concern.
That's actually a great example of the case that "ought" to fail, because first you have a host program (apache2) that uses dlopen() to load in the CPython interpreter (implicitly, by dlopen'ing mod_wsgi.so, which is linked to libpython), and then the CPython interpreter turns around and tries to use dlopen() to load extension modules. Normally, this should work if and only if the extension modules are themselves linked against libpythonX.Y.so, --enable-shared / Fedora style. However, this is not Apache's first rodeo:
revision 1.12
date: 1998/07/10 18:29:50; author: rasmus; state: Exp; lines: +2 -2
Set the RTLD_GLOBAL dlopen mode parameter to allow dynamically loaded modules to load their own modules dynamically. This improves mod_perl and mod_php3 when these modules are loaded dynamically into Apache.
(Confirmation that this is still true: https://apr.apache.org/docs/apr/2.0/group__apr__dso.html#gaedc8609c2bb76e5c4... -- also I ran mod_wsgi-express under LD_DEBUG=scopes and that also showed libpython2.7.so.1 getting added to the global scope.)
Using RTLD_GLOBAL like this is the "wrong" thing -- it means that different Apache mods all get loaded into the same global namespace, and that means that they can have colliding symbols and step on each other's feet. E.g., this is the sole and entire reason why you can't load a python2 version of mod_wsgi and a python3 version of mod_wsgi into the same apache. But OTOH it means that Python extension modules will work even if they don't explicitly link against libpython.
I also managed to track down two other programs that also follow this load-a-plugin-that-embeds-python pattern -- LibreOffice and xchat -- and they also both seem to use RTLD_GLOBAL. So even if it's the "wrong" thing, extension modules that don't explicitly link to libpython do seem to work reliably here in the world we have, and they're more compatible with Debian/Ubuntu and their massive market share, so...
the whole manylinux1 strategy is nothing if not relentlessly pragmatic. I guess we should forbid linking to libpython in the PEP.
[Note: I did not actually try loading any such modules into mod_wsgi or these other programs, because I have no idea how to use mod_wsgi or these other programs :-). The LD_DEBUG output is fairly definitive, but it wouldn't hurt for someone to double-check if they feel inspired...]
-n
--
Nathaniel J. Smith -- https://vorpus.org
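Apache's RTLD_GLOBAL trick is easy to reproduce with ctypes, again using libz as a stand-in for libpython (Linux/glibc assumed):

```python
import ctypes

# Load libz the way Apache loads mod_wsgi: with RTLD_GLOBAL, injecting its
# exports into the process's global symbol scope...
ctypes.CDLL("libz.so.1", mode=ctypes.RTLD_GLOBAL)

# ...so a later dlopen(NULL) lookup -- like the one performed on behalf of
# an extension module that doesn't link libpython itself -- now succeeds:
main = ctypes.CDLL(None)
main.zlibVersion.restype = ctypes.c_char_p
print(main.zlibVersion())
```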
On Sun, Feb 7, 2016 at 12:01 AM, Nathaniel Smith
2) if I ever embed cpython by doing dlopen("libpython2.7.so.1"), or dlopen("some_plugin_library_linked_to_libpython.so"), then the embedded cpython will not be able to load python extensions that are compiled in the Debian-style (but will be able to load python extensions compiled in the Fedora-style), because the dlopen() the loaded the python runtime and the dlopen() that loads the extension module create two different scopes that can't see each other's symbols. [I'm pretty sure this is right, but linking is arcane and probably I should write some tests to double check.]
Just to confirm, I did test this, and it is correct. Code at https://github.com/njsmith/test-link-namespaces if anyone is curious.
--
Nathaniel J. Smith -- https://vorpus.org
participants (6)
- Antoine Pitrou
- Matthew Brett
- Matthias Klose
- Nathaniel Smith
- Nick Coghlan
- Robert T. McGibbon