[Numpy-discussion] bus error in embedded numpy
Sturla Molden
sturla at molden.no
Fri Nov 13 16:36:00 EST 2009
Robin wrote:
> Ah, I hadn't realised it was an OS constraint - I thought it was
> possible to unload dlls - and that was why matlab provides the clear
> function. mex automatically clears a function when you rebuild it - I
> thought that was how you can rebuild and reload mex functions without
> restarting matlab.
Whether DLLs can be unloaded depends on the OS. In Windows you can
call FreeLibrary on the DLL handle to unload it. If you can unload a
MEX function you can also unload Python or NumPy.
But when you unload the MEX function, the DLLs loaded by the MEX are not
automatically cleared. There is no garbage collector.
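In Windows, a load and an unload will result in calls to a function
called "DllMain" in the DLL. If the DLL has other DLLs to load or
clear, that is the hook where you can arrange it (Microsoft advises
against calling LoadLibrary or FreeLibrary directly inside DllMain, so
keep the work there minimal). You can put a custom DllMain function in
a MEX file.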
But be careful: a DLL is only loaded once, i.e. DLLs are imported as
singletons in the process. If two MEX functions load Python26.dll, they
get handles to the same instance. If one MEX tears that instance down
(finalizes the interpreter, or calls FreeLibrary more often than it
called LoadLibrary), it is gone for the whole process, so the other MEX
will crash. Windows only keeps a per-process load count for the DLL; it
does not track which MEX still depends on the state inside it.
The solution to this type of DLL Hell in Windows is COM. A COM object is
a DLL but not a singleton. You can have multiple instances of the same
COM object in one process.
On Mac I guess your options are to either statically link everything
into the MEX file, or find a way for multiple MEX files to share a
Python interpreter, e.g. implement some sort of reference counting
scheme, which by the way is what COM does on Windows.
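
A minimal sketch of such a reference counting scheme, written in Python
only to keep it short. The class name SharedInterpreter and the
start/stop callbacks are hypothetical; in a real MEX file this logic
would live in C, with the callbacks standing in for Py_Initialize and
Py_Finalize.

```python
import threading


class SharedInterpreter:
    """Hypothetical reference-counted guard around a process-wide resource."""

    def __init__(self, start, stop):
        self._start = start   # run when the first user arrives
        self._stop = stop     # run when the last user leaves
        self._count = 0
        self._lock = threading.Lock()

    def acquire(self):
        """Register one more user; initialize on the first one."""
        with self._lock:
            if self._count == 0:
                self._start()
            self._count += 1
            return self._count

    def release(self):
        """Drop one user; tear down only when nobody is left."""
        with self._lock:
            if self._count == 0:
                raise RuntimeError("release() without matching acquire()")
            self._count -= 1
            if self._count == 0:
                self._stop()
            return self._count


def demo():
    """Two hypothetical MEX files sharing one interpreter."""
    events = []
    shared = SharedInterpreter(lambda: events.append("init"),
                               lambda: events.append("finalize"))
    shared.acquire()   # MEX A loads: interpreter starts
    shared.acquire()   # MEX B loads: reuses it
    shared.release()   # MEX A unloads: interpreter stays up
    shared.release()   # MEX B unloads: interpreter is finalized
    return events
```

The point of the design is that no single MEX file decides on its own
when the interpreter goes away; only the last release() does.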
> I really want a cross platform solution so that rules out COM.
CORBA or XMLRPC seem to be the standards. I'm not sure I would use either.
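
For what the XML-RPC route could look like, here is a rough sketch using
Python's standard xmlrpc modules. The function vector_norm and the
helper names are made up for illustration. The server runs in a
background thread only so the example is self-contained; in practice it
would be a separate Python process and the client would be whatever
XML-RPC library the host application can call.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def vector_norm(values):
    """Hypothetical numerical routine exposed to the host application."""
    return sum(v * v for v in values) ** 0.5


def start_server(host="127.0.0.1", port=0):
    """Serve vector_norm in a background thread; port 0 lets the OS pick."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(vector_norm, "vector_norm")
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server, thread


def call_remote_norm(values):
    """Start the server, make one remote call, then shut everything down."""
    server, thread = start_server()
    try:
        host, port = server.server_address
        with ServerProxy(f"http://{host}:{port}/") as proxy:
            return proxy.vector_norm(values)
    finally:
        server.shutdown()
        server.server_close()
        thread.join()
```

Since the interpreter lives in its own process, the DLL and CRT issues
do not arise, at the price of serializing every argument.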
> Do you think I'm likely to run into more problems? I got the feeling
> from asking questions on IRC that embedding Python is kind of
> discouraged but I thought in this case it seemed the neatest way.
>
>
It is discouraged because you get a singleton. You can create
subinterpreters (cf. Apache's mod_python), but that will bring problems
of its own. For example you cannot use extensions that require the
simplified GIL API (e.g. ctypes). I think this is a major design flaw of
CPython. For example with Java's JNI, you get a context pointer to the
VM, so you don't have any of these problems. But with CPython, both the
interpreter and extensions are basically implemented to be loaded as
singletons.
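
One visible consequence, sketched from the Python side: within one
interpreter, modules are cached in sys.modules, so every piece of code
that imports a module gets the same object and shares its state. The
module name shared_demo_state and both helpers are hypothetical.

```python
import importlib
import sys
import types


def install_demo_module(name="shared_demo_state"):
    """Register a throwaway module with some module-level state."""
    module = types.ModuleType(name)
    module.counter = 0
    sys.modules[name] = module
    return module


def bump_counter(name="shared_demo_state"):
    """Act like an independent client: import the module, then mutate it."""
    module = importlib.import_module(name)  # returns the cached object
    module.counter += 1
    return module.counter
```

Two callers of bump_counter() see each other's changes, which is the
same situation two MEX files sharing one embedded interpreter are in.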
> I was aware of this - I thought I would be OK on the mac - at least
> Python and Numpy and my mex function are built with apple gcc although
> I'm not sure about Matlab. I guess Windows will be more difficult...
> But in any case I don't plan to pass any file handles around.
>
>
It applies to any CRT resource, not just files. Which compiler you use
is not important; what matters is which CRT is loaded. And if you
statically link the same CRT twice, that becomes two CRTs that cannot
share resources. In Windows, Microsoft has made sure there are multiple
versions of their CRT (msvcrt.dll, msvcr71.dll, msvcr80.dll,
msvcr90.dll, ...) that cannot share resources. And anyone not careful
with this can experience crashes at random locations.