[Distutils] Does anyone understand what's going on with libpython on Linux?

Robert T. McGibbon rmcgibbo at gmail.com
Sun Feb 7 03:25:57 EST 2016


> What are Debian/Ubuntu doing in distutils so that extensions don't link
> to libpython by default?

I don't know exactly, but one way to reproduce this is simply to build the
interpreter without `--enable-shared`.
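
To see which kind of interpreter you are running, something like the
following should work. It is only a sketch: the function names are mine, and
I am assuming that `Py_ENABLE_SHARED` in the build configuration is what
distutils consults when deciding whether to add libpython to an extension's
link line, which I have not verified against Debian's patches.

```python
import sysconfig


def built_with_shared_libpython():
    """Return True if this interpreter was configured with --enable-shared."""
    # Py_ENABLE_SHARED is 1 for --enable-shared builds and 0 otherwise.
    # It can be None on platforms that do not record it, hence bool().
    return bool(sysconfig.get_config_var("Py_ENABLE_SHARED"))


def libpython_build_info():
    """Collect the build-time variables that describe libpython."""
    names = ("Py_ENABLE_SHARED", "LDLIBRARY", "INSTSONAME")
    return {name: sysconfig.get_config_var(name) for name in names}


if __name__ == "__main__":
    print(built_with_shared_libpython())
    print(libpython_build_info())
```

On a static build I would expect `LDLIBRARY` to name a `.a` archive rather
than a `.so`, but check the output on your own system.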

I don't know what their reasons are, but I presume that the Debian
maintainers have a well-considered reason for this design.

The PEP 513 text currently says that it's permissible for manylinux1 wheels
to link against libpythonX.Y.so. So presumably for a platform to be
manylinux1-compatible, libpythonX.Y.so should be available. I guess my
preference would be for pip to simply check whether libpythonX.Y.so is
available in its platform detection code (pypa/pip/pull/3446).
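
As a rough illustration of the kind of check I mean (this is not the code in
that pull request, and `libpython_name` and `have_libpython` are hypothetical
helpers), one could ask `ctypes.util.find_library` whether the dynamic linker
knows about libpython. Note that `find_library` relies on tools like
`ldconfig`, so it may miss libraries in non-standard locations.

```python
import sys
from ctypes.util import find_library


def libpython_name(version_info=sys.version_info, abiflags=None):
    """Build the linker-style name, e.g. 'python2.7' or 'python3.5m'."""
    if abiflags is None:
        # sys.abiflags does not exist on Python 2, so fall back to "".
        abiflags = getattr(sys, "abiflags", "")
    return "python{}.{}{}".format(version_info[0], version_info[1], abiflags)


def have_libpython(name=None):
    """Return True if a shared library matching the given name is found."""
    if name is None:
        name = libpython_name()
    # find_library returns the library's file name, or None if not found.
    return find_library(name) is not None


if __name__ == "__main__":
    print(libpython_name(), have_libpython())
```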

Because Debian/Ubuntu is such a big target, instead of just bailing out and
forcing the user to install the sdist from PyPI (which is going to fail,
because Debian installations that lack libpythonX.Y.so also lack Python.h),
I would be +1 for adding some kind of message for this case that says,
"maybe you should `sudo apt-get install python-dev` to get these fancy new
wheels rolling."
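
A sketch of how such a hint could be produced, assuming the caller already
knows whether libpython was found. The function name is hypothetical, and
using `/etc/debian_version` to spot Debian-like systems is my own shortcut,
not anything pip does today as far as I know.

```python
import os


def libpython_hint(have_libpython, debian_like=None):
    """Return a hint string when libpython is missing on Debian/Ubuntu."""
    if debian_like is None:
        # Debian and its derivatives ship this file.
        debian_like = os.path.exists("/etc/debian_version")
    if have_libpython or not debian_like:
        return None  # nothing useful to suggest
    return ("libpython was not found, so manylinux1 wheels are being "
            "skipped; maybe you should `sudo apt-get install python-dev` "
            "to get these fancy new wheels rolling.")


if __name__ == "__main__":
    print(libpython_hint(have_libpython=False, debian_like=True))
```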

-Robert

On Sun, Feb 7, 2016 at 12:01 AM, Nathaniel Smith <njs at pobox.com> wrote:

> So we found another variation between how different distros build
> CPython [1], and I'm very confused.
>
> Fedora (for example) turns out to work the way I naively expected:
> taking py27 as our example, they have:
> - libpython2.7.so.1.0 contains the actual python runtime
> - /usr/bin/python2.7 is a tiny (~7 KiB) executable that links to
> libpython2.7.so.1 to do the actual work; the main python package
> depends on the libpython package
> - python extension module packages depend on the libpython package,
> and contain extension modules linked against libpython2.7.so.1
> - python extension modules compiled locally get linked against
> libpython2.7.so.1 by default
>
> Debian/Ubuntu do things differently:
> - libpython2.7.so.1.0 exists and contains the full python runtime, but
> is not installed by default
> - /usr/bin/python2.7 *also* contains a *second* copy of the full
> python runtime; there is no dependency relationship between these, and
> you don't even get libpython2.7.so.1.0 installed unless you explicitly
> request it or it gets pulled in through some other dependency
> - most python extension module packages do *not* depend on the
> libpython2.7 package, and contain extension modules that are *not*
> linked against libpython2.7.so.1.0 (but there are exceptions!)
> - python extension modules compiled locally do *not* get linked
> against libpython2.7.so.1 by default.
>
> The only things that seem to link against libpython2.7.so.1.0 in debian
> are:
> a) other packages that embed python (e.g. gnucash, paraview, perf, ...)
> b) some minority of python packages (e.g. the PySide/QtOpenGL.so
> module is one that I found that directly links to libpython2.7.so.1.0)
>
> I guess that the reason this works is that according to ELF linking
> rules, the symbols defined in the main executable, or in the
> transitive closure of the libraries that the main executable is linked
> to via DT_NEEDED entries, are all injected into the global scope of
> any dlopen'ed libraries.
>
> Uh, let me try saying that again.
>
> When you dlopen() a library -- like, for example, a python extension
> module -- then the extension automatically gets access to any symbols
> that are exported from either (a) the main executable itself, or (b)
> any of the libraries that are listed if you run 'ldd <the main
> executable>'. It also gets access to any symbols that are exported by
> itself, or any of the libraries listed if you run 'ldd <the dlopen'ed
> library>'. OTOH it does *not* get access to any symbols exported by
> other libraries that get dlopen'ed -- each dlopen specifically creates
> its own "scope".
>
> So the reason this works is that Debian's /usr/bin/python2.7 itself
> exports all the standard Python C ABI symbols, so any extension module
> that it loads automatically gets access to the CPython ABI, even if
> the module doesn't explicitly link to libpython. And programs like
> gnucash are linked directly to libpython2.7.so.1, so they also end up
> exporting the CPython ABI to any libraries that they dlopen.
>
> But, it seems to me that there are two problems with the Debian/Ubuntu
> way of doing things:
> 1) it's rather wasteful of space, since there are two complete
> independent copies of the whole CPython runtime (one inside
> /usr/bin/python2.7, the other inside libpython2.7.so.1).
> 2) if I ever embed cpython by doing dlopen("libpython2.7.so.1"), or
> dlopen("some_plugin_library_linked_to_libpython.so"), then the
> embedded cpython will not be able to load python extensions that are
> compiled in the Debian-style (but will be able to load python
> extensions compiled in the Fedora-style), because the dlopen() that
> loaded the python runtime and the dlopen() that loads the extension
> module create two different scopes that can't see each other's
> symbols. [I'm pretty sure this is right, but linking is arcane and
> probably I should write some tests to double check.]
>
> I guess (2) might be why some of Debian's extension modules do link to
> libpython2.7.so.1 directly? Or maybe that's just a bug?
>
> Is there any positive reason in favor of the Debian style approach?
> Clearly someone put some work into setting things up this way, so
> there must be some motivation, but I'm not sure what it is?
>
> The immediate problem for us is that if a manylinux1 wheel links to
> libpythonX.Y.so (Fedora-style), and then it gets run on a Debian
> system that doesn't have libpythonX.Y.so installed, it will crash
> with:
>
> ImportError: libpython2.7.so.1.0: cannot open shared object file: No
> such file or directory
>
> Maybe this is okay and the solution is to tell people that they need
> to 'apt install libpython2.7'. In a sense this isn't even a
> regression, because every system that is capable of installing a
> binary extension from an sdist has python2.7-dev installed, which
> depends on libpython2.7 --> therefore every system that used to be
> able to do 'pip install somebinary' with sdists will still be able to
> do it with manylinux1 builds.
>
> The alternative is to declare that manylinux1 extensions should not
> link to libpython. This should I believe work fine on both
> Debian-style and Fedora-style installations -- though the PySide
> example, and the theoretical issue with embedding python through
> dlopen, both give me some pause.
>
> Two more questions:
> - What are Debian/Ubuntu doing in distutils so that extensions don't
> link to libpython by default? If we do go with the option of saying
> that manylinux extensions shouldn't link to libpython, then that's
> something auditwheel *can* fix up, but it'd be even nicer if we could
> set up the docker image to get it right in the first place.
>
> - Can/should Debian/Ubuntu switch to the Fedora model? Obviously it
> would take quite some time before a generic platform like manylinux
> could assume that this had happened, but it does seem better to me...?
> And if it's going to happen at all it might be nice to get the switch
> into 16.04 LTS? Of course that's probably ambitious, even if I'm not
> missing some reason why the Debian/Ubuntu model is actually
> advantageous.
>
> -n
>
> [1] https://github.com/pypa/manylinux/issues/30
>
> --
> Nathaniel J. Smith -- https://vorpus.org
