[C++-sig] Crash at shutdown
talljimbo at gmail.com
Sat Feb 11 03:23:43 CET 2012
On 02/10/2012 04:57 PM, Gennadiy Rozental wrote:
> I need help with a crash I observe on both Windows and Linux. I admit my use
> case is somewhat complicated, but that's reality. So, my problem consists of 3
> components. Let's call them A, B, and C:
> A is a Python *extension* library (dll/so). This is really just an infrastructure
> layer which helps with developing Python extensions like B+C. I am using Python
> 2.6 BTW.
> B is the entry point into a huge extension library which consists of multiple
> dlls/sos (one of them is C). This is also a dll/so.
> C is a library within my project, which is responsible for communication with
> Python *embedded* scripts. C is using Boost.Python to export some number of
> symbols into Python, but the issue occurs with just one as well. C is linked
> with the Boost.Python library and the Python interpreter.
> Now imagine that I have a Python script which does essentially this:
> import A
> I run this script and observe a crash at Python shutdown. Here are more details
> about what is going on in this script.
> Line 1 loads Python extension
> Line 2 loads library B using dlopen
> Line 3 loads library C using dlopen
> Object B is being released. This is done in two steps. First we invoke
> B.shutdown() which unloads C from memory using dlclose and next B is unloaded
> from memory using dlclose
> A is unloaded
> Python interpreter is shut down
> The crash occurs at the very last step, in a rather obscure place within the
> Python interpreter shutdown routines.
> I found there are several "changes", which prevent the crash from happening:
> 1. I can stop exporting Boost.Python symbols in C
> 2. I can skip unloading C from B.shutdown()
> 3. I can link in C with B. This results in line 2 loading both B and C together.
> Line 3 does nothing. B.shutdown() does nothing and C is unloaded along with B
> when we call dlclose on B.
> At this point I am inclined to believe that something like this is happening:
> when I execute a Boost.Python export statement in C, it adds some records in
> Boost.Python and the Python interpreter. When C is unloaded from memory, somehow
> these records are not being cleaned up. By the time we get to clean up these
> records, C is already unloaded from memory and either Boost.Python or the Python
> interpreter corrupts the memory.
This scenario sounds quite plausible, and a good candidate for those
"records" is the Boost.Python converter registry. That's supposed to be
a global registry (in the boost_python shared library) that's shared by
all modules, and it mostly works quite well when the only dlopens
involved are the ones the Python interpreter uses when importing
modules. I could definitely imagine it being corrupted by doing a
dlclose on the shared library that exported some classes using Boost.Python.
I'm pretty sure there's no programmatic way to remove something from the
registry, and to add such a feature you'd have to modify the
Boost.Python sources and recompile the shared library. If you're
willing to do that, that might be a way out.
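To make the failure mode and the possible fix concrete, here is a minimal sketch of a process-wide registry whose entries are removed when the library that created them goes away. This is only a model: `Registry`, `ScopedRegistration`, and `convert_widget` are hypothetical names invented for illustration and are not part of Boost.Python, whose real registry offers no such removal hook as far as I know.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a converter: a function pointer whose code
// lives in whichever shared library registered it.
using Converter = int (*)(int);

// Hypothetical process-wide registry (a model, not Boost.Python's API).
class Registry {
public:
    static Registry& instance() {
        static Registry r;  // lives in the library that is never unloaded
        return r;
    }
    void insert(const std::string& type_name, Converter fn) {
        entries_[type_name] = fn;
    }
    bool erase(const std::string& type_name) {
        return entries_.erase(type_name) > 0;
    }
    Converter lookup(const std::string& type_name) const {
        auto it = entries_.find(type_name);
        return it == entries_.end() ? nullptr : it->second;
    }
    std::size_t size() const { return entries_.size(); }

private:
    std::map<std::string, Converter> entries_;
};

// RAII guard: registers on construction and unregisters on destruction,
// so no pointer into the registering library outlives the guard.
class ScopedRegistration {
public:
    ScopedRegistration(std::string type_name, Converter fn)
        : type_name_(std::move(type_name)) {
        Registry::instance().insert(type_name_, fn);
    }
    ~ScopedRegistration() { Registry::instance().erase(type_name_); }
    ScopedRegistration(const ScopedRegistration&) = delete;
    ScopedRegistration& operator=(const ScopedRegistration&) = delete;

private:
    std::string type_name_;
};

// Hypothetical converter that would live in library C.
inline int convert_widget(int x) { return x + 1; }
```

If library C held a `ScopedRegistration` as a static object, its destructor would run during `dlclose`, before C's code is unmapped, and the registry would hold no dangling pointer at interpreter shutdown. Without such a guard, the entry stays behind and any later lookup or cleanup calls into memory that no longer exists.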
If it's at all possible, I think the safest bet would be to refactor
things so that everything that gets exported to Python happens within a
separate module that would be imported by the Python scripts, so you
only rely on Python's own dlopen calls when it involves Boost.Python
wrappers. If that's not feasible, you might try putting the wrapper
code in a function in a library that never gets unloaded, even if that
function is called by some library that may be unloaded.
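One way to approximate the "library that never gets unloaded" idea on Linux is to load the wrapper library with the `RTLD_NODELETE` flag, which asks the dynamic loader to keep it mapped even after `dlclose`. This is a different technique from restructuring the libraries, and I have not tried it against this setup, so treat it as a sketch. The helper name `load_pinned` and the library path in the usage note are hypothetical.

```cpp
#include <dlfcn.h>  // dlopen, dlerror; RTLD_NODELETE is an extension (glibc, macOS)
#include <cstdio>

// Hypothetical helper: load the shared library at `path` so that later
// dlclose() calls do not unmap it from the process.
// Returns true on success, false if the library could not be loaded.
bool load_pinned(const char* path) {
    void* handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE);
    if (handle == nullptr) {
        const char* err = dlerror();
        std::fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return false;
    }
    // The handle is intentionally never closed: the library holding the
    // Boost.Python wrapper code should stay resident until process exit.
    return true;
}
```

B could call something like `load_pinned("libC_wrappers.so")` once before C exports anything, so that the code the registry points at stays valid even if C itself is later closed. On older glibc versions this needs linking with `-ldl`. On Windows, `GetModuleHandleEx` with `GET_MODULE_HANDLE_EX_FLAG_PIN` plays a similar role.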
Good luck - sounds like a hard problem to solve!