[Python-Dev] Problems with Python's default dlopen flags

David Abrahams <david.abrahams@rcn.com>
Sat, 4 May 2002 08:14:06 -0500


----- Original Message -----
From: "Martin v. Loewis" <martin@v.loewis.de>


> "David Abrahams" <david.abrahams@rcn.com> writes:
>
> > It seems to me that extension modules themselves should have a way
> > to report to Python that they need to be loaded with RTLD_GLOBAL.
>
> No way; to change Python in this way would be extremely
> foolish. Python was using RTLD_GLOBAL until 1.5.1, then this was
> changed in 1.5.2 due to bug reports by users. Redhat decided to revert
> this change, and consequently people run into problems with the Redhat
> Python 1.5.2 installation.

Did you misread my suggestion? I didn't say that RTLD_GLOBAL should be
the default way to load an extension module, only that there should be a
way for the module itself to determine how it's loaded.

> Here is the original problem: A Python application was using both
> Oracle and sockets, so it had the Oracle and socket modules
> loaded. Unfortunately, both provided an initsocket function (at that
> time; today the socket module provides init_socket). It so happened
> that the dynamic linker chose the initsocket definition from the
> socket module. When Oracle called its own initsocket function, the
> call ended up in the Python module, and the application crashed; this
> is painful to analyse.

Yes, that must have been painful. I can also imagine that it causes problems for
identically-named (sub)modules in packages (though my lack of expertise
should be apparent here again: maybe dlsym() will always grab the symbol
from the newly opened library).

> Now, people apparently want to share symbols across modules. Let me
> say that I find this desire misguided: Python extension modules are
> *not* shared libraries, they *only* interface with the Python
> interpreter.

It surprised me as well when I started developing Boost.Python, but it
turns out that people really think it's important to be able to do
component-based development on their Python extensions and occasionally
they need to be able to register things like exception translators from
one module which will be used by another module. However, as you can see
below, nothing that fancy is in play in this case...

> If you want to share symbols, use shared libraries: If
> modules A.so and B.so have symbols in common, create a shared library
> C.so that provides those symbols, and link both A.so and B.so with
> this shared library.

Guess what? That's what Boost.Python does! In fact, in the cases we're
seeing that are fixed by using RTLD_GLOBAL **there's no need for sharing
of symbols across A.so and B.so**! The arrangement looks like
this:

        python
       /      \
   (dlopen) (dlopen)
     /          \
    |           |
    V           V
  ext1.so   ext2.so
     \         /
     (ld)    (ld)
       \     /
        \   /
        |  |
        V  V
   libboost_python.so

And in fact, I expect to ask users to do something special, like
explicitly linking between extension modules, if they want to share
exception/RTTI information between ext1 and ext2 directly. However, this
is what I didn't expect: the lack of RTLD_GLOBAL flags interferes with
the ability of ext1.so to catch C++ exceptions thrown by
libboost_python.so!

Are you suggesting that in order to do this, my users need to add yet
another .so, a thin layer between Python and the guts of their extension
modules?

> Now, people still want to share symbols across modules. For that, you
> can use CObjects: Export a CObject with an array of function pointers
> in module A (e.g. as A.API), and import that C object in module B's
> initialization code. See cStringIO and Numeric for examples.

Of course you realize that won't help with C++ exception tables...

> Now, people still want to share symbols across modules. For that, they
> can use sys.setdlopenflags.

...which leads us back to the fact that the smarts are in the wrong
place. The extension module writer knows that this particular extension
needs to share symbols, but sys.setdlopenflags has to be called by
whoever imports the module, before the import; once the module is
loaded it's too late.
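
To make the ordering problem concrete, here is a minimal sketch of what
the importing side has to do with sys.setdlopenflags. It assumes a
modern Python on a Unix-like platform where the RTLD_* constants are
available from the os module; the helper names dlopen_flags and
import_global are made up for illustration and are not part of Python.

```python
import contextlib
import importlib
import os
import sys


@contextlib.contextmanager
def dlopen_flags(flags):
    """Temporarily change the flags Python passes to dlopen()."""
    old = sys.getdlopenflags()
    sys.setdlopenflags(flags)
    try:
        yield
    finally:
        # Restore the previous flags so unrelated extensions loaded
        # later are not affected.
        sys.setdlopenflags(old)


def import_global(name):
    # Hypothetical helper: the *importer*, not the extension module,
    # has to know that this module wants RTLD_GLOBAL, and has to say
    # so before the import happens.
    with dlopen_flags(os.RTLD_NOW | os.RTLD_GLOBAL):
        return importlib.import_module(name)
```

Every user of the extension would need to route the first import
through something like import_global("ext1"); a plain "import ext1"
executed earlier anywhere in the program defeats it, since the module
has already been loaded with the default flags.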

> It seems that this is a lose-lose situation: you can't please
> everybody. In the current state, people that want to share symbols
> can, if they really want to. With your proposed change, symbols that
> accidentally clash between unrelated extensions cause problems, and
> users of those modules can do nothing about it. Hence, the current
> state is preferable.

So give setdlopenflags a "force" option which overrides the setting
designated by the extension module. I realize it's messy (probably too
messy). If I could think of some non-messy advice for my users that
avoids a language change, I'd like that just as well.
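
Roughly, the precedence I have in mind could be sketched as below. This
is purely illustrative: neither a module-designated flag setting nor a
"force" option exists in Python, and the function name and parameters
are invented for the sketch. The os.RTLD_* constants assume a modern
Python on a Unix-like platform.

```python
import os

LOCAL = os.RTLD_NOW | os.RTLD_LOCAL
GLOBAL = os.RTLD_NOW | os.RTLD_GLOBAL


def effective_dlopen_flags(user_flags, module_flags=None, force=False):
    """Pick the flags to load one extension module with (hypothetical).

    user_flags:   what the application set via sys.setdlopenflags
    module_flags: what the extension itself asks for, or None
    force:        the proposed override; the user's setting always wins
    """
    if force or module_flags is None:
        return user_flags
    return module_flags


# A module that asks for RTLD_GLOBAL gets it by default...
assert effective_dlopen_flags(LOCAL, module_flags=GLOBAL) == GLOBAL
# ...unless the user forces their own setting, e.g. to dodge a clash.
assert effective_dlopen_flags(LOCAL, module_flags=GLOBAL, force=True) == LOCAL
```

That keeps the escape hatch for users hit by an accidental symbol clash,
while letting a module that knows it needs shared symbols say so.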

-Dave