[Numpy-discussion] bus error in embedded numpy

Sturla Molden sturla at molden.no
Fri Nov 13 16:36:00 EST 2009


Robin wrote:
> Ah, I hadn't realised it was an OS constraint - I thought it was
> possible to unload dlls - and that was why matlab provides the clear
> function. mex automatically clears a function when you rebuild it - I
> thought that was how you can rebuild and reload mex functions without
> restarting matlab.
Whether DLLs can be unloaded is OS dependent. In Windows you can 
call FreeLibrary on the DLL handle to unload it. If you can unload a 
MEX function you can also unload Python or NumPy.

But when you unload the MEX function, the DLLs loaded by the MEX are not 
automatically cleared. There is no garbage collector.

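In Windows, a load and an unload each result in a call to a function 
called "DllMain" in the DLL. So if the DLL has other DLLs to load or 
clear, that is the hook to hang it on. You can put a custom DllMain 
function in a MEX file. Note that Microsoft advises against calling 
LoadLibrary or FreeLibrary from inside DllMain itself, so it is safer 
to do the actual work in an explicit init/cleanup function.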

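But be careful: a DLL is only loaded once, i.e. DLLs are imported as 
singletons in the process. If two MEX functions load Python26.dll, they 
get handles to the same instance. Windows does keep a per-process load 
count for each DLL, so the library stays mapped until every LoadLibrary 
has been matched by a FreeLibrary. But the state inside the DLL is 
shared: if one MEX finalizes the interpreter, or frees the DLL more 
often than it loaded it, the other MEX will crash. Nothing counts the 
users of the interpreter for you.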
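
The singleton behaviour is easy to observe from Python with ctypes. This 
is only a small sketch: the helper name load_twice is made up for the 
illustration, and it uses the C library (or kernel32.dll on Windows) as 
a stand-in for Python26.dll. It loads the same library twice and 
compares the raw OS handles, which I would expect to be identical.

```python
import ctypes
import ctypes.util
import sys


def load_twice():
    """Load the same shared library twice and return both raw OS handles."""
    if sys.platform == "win32":
        name = "kernel32.dll"  # always present in a Windows process
    else:
        # May be None on some systems; CDLL(None) then opens the main program.
        name = ctypes.util.find_library("c")
    first = ctypes.CDLL(name)
    second = ctypes.CDLL(name)
    # _handle holds the value returned by LoadLibrary / dlopen.
    return first._handle, second._handle


h1, h2 = load_twice()
same_instance = (h1 == h2)
print("same handle:", same_instance)
```

Both loads refer to one instance in the process, so any state kept 
inside the library is shared by everyone who loaded it.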

The solution to this type of DLL Hell in Windows is COM. A COM object is 
a DLL but not a singleton. You can have multiple instances of the same 
COM object in one process.

On Mac I guess your options are to either statically link everything 
into the MEX file, or find a way for multiple MEX files to share one 
Python interpreter, e.g. implement some sort of reference counting 
scheme, which by the way is what COM does on Windows.
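
A reference counting scheme of that kind could look roughly like the 
sketch below. It is not taken from any MEX or NumPy code: the class 
SharedInterpreter and its methods are hypothetical, and the places where 
a real implementation would call Py_Initialize() and Py_Finalize() are 
only marked with comments.

```python
import threading


class SharedInterpreter:
    """Hypothetical stand-in for an interpreter shared by several MEX files."""

    def __init__(self):
        self._lock = threading.Lock()
        self._users = 0
        self.events = []  # records initialize/finalize for inspection

    def acquire(self):
        """Register one more user; start the interpreter for the first one."""
        with self._lock:
            if self._users == 0:
                self.events.append("initialize")  # Py_Initialize() would go here
            self._users += 1
            return self._users

    def release(self):
        """Drop one user; shut the interpreter down after the last one."""
        with self._lock:
            if self._users == 0:
                raise RuntimeError("release() without matching acquire()")
            self._users -= 1
            if self._users == 0:
                self.events.append("finalize")  # Py_Finalize() would go here
            return self._users


shared = SharedInterpreter()
shared.acquire()   # first MEX file is loaded
shared.acquire()   # second MEX file is loaded
shared.release()   # first MEX file is cleared, interpreter stays up
still_running = shared.events == ["initialize"]
shared.release()   # last user is gone, now it is safe to finalize
```

The point is only that the shutdown is tied to the last release, not to 
whichever MEX file happens to be cleared first.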

> I really want a cross platform solution so that rules out COM. 
CORBA or XMLRPC seem to be the standards. I'm not sure I would use either.


> Do you think I'm likely to run into more problems? I got the feeling
> from asking questions on IRC that embedding Python is kind of
> discouraged but I thought in this case it seemed the neatest way.
>
>   
It is discouraged because you get a singleton. You can create 
subinterpreters (cf. Apache's mod_python), but that will bring problems 
of its own. For example you cannot use extensions that require the 
simplified GIL API, i.e. the PyGILState_* functions (e.g. ctypes). I 
think this is a major design flaw of CPython. For example with Java's 
JNI, you get a context pointer to the VM, so you don't have any of 
these problems. But with CPython, both the interpreter and extensions 
are basically implemented to be loaded as singletons.
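
To make the difference concrete, here is a toy comparison of the two 
designs. It says nothing about how CPython or the JVM are actually 
implemented; the names Context, singleton_set, ctx_set and so on are 
invented for the illustration. The first half keeps state in one 
module-level object, the second passes an explicit context to every call.

```python
# Singleton style: one module-level state shared by every caller.
_global_state = {}


def singleton_set(key, value):
    _global_state[key] = value


def singleton_get(key):
    return _global_state.get(key)


class Context:
    """Hypothetical per-caller context, analogous to a VM pointer."""

    def __init__(self):
        self.state = {}


def ctx_set(ctx, key, value):
    ctx.state[key] = value


def ctx_get(ctx, key):
    return ctx.state.get(key)


# Two embedders sharing the singleton overwrite each other.
singleton_set("cwd", "/plugin_a")
singleton_set("cwd", "/plugin_b")
seen_by_a = singleton_get("cwd")  # plugin A now sees plugin B's value

# With explicit contexts, each embedder keeps its own state.
a, b = Context(), Context()
ctx_set(a, "cwd", "/plugin_a")
ctx_set(b, "cwd", "/plugin_b")
```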



> I was aware of this - I thought I would be OK on the mac - at least
> Python and Numpy and my mex function are built with apple gcc although
> I'm not sure about Matlab. I guess Windows will be more difficult...
> But in any case I don't plan to pass any file handles around.
>
>   
It applies to any CRT resource, not just files. The compiler is not 
important; what matters is which CRT is loaded. And if you statically 
link the same CRT twice, that becomes two CRTs that cannot share 
resources. In Windows, Microsoft has made sure there are multiple 
versions of their CRT (msvcrt.dll, msvcr71.dll, msvcr80.dll, 
msvcr90.dll, ...) that cannot share resources. And anyone not careful 
with this can experience crashes at random locations.
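
A toy model may help to see why. The class ToyCRT below is purely 
hypothetical and is not a real CRT interface; it only mimics the fact 
that each loaded runtime keeps its own private table of handles, so a 
handle created by one runtime means nothing to another.

```python
class ToyCRT:
    """Toy stand-in for one loaded C runtime; not a real CRT interface."""

    def __init__(self, name):
        self.name = name
        self._table = {}  # private handle table, like a CRT's file table
        self._next = 3    # 0-2 are conventionally stdin/stdout/stderr

    def open(self, path):
        handle = self._next
        self._next += 1
        self._table[handle] = path
        return handle

    def close(self, handle):
        if handle not in self._table:
            raise ValueError(f"{self.name}: unknown handle {handle}")
        del self._table[handle]


crt_a = ToyCRT("msvcr71")
crt_b = ToyCRT("msvcr90")

h = crt_a.open("data.bin")  # resource created by one runtime
try:
    crt_b.close(h)          # released through the other runtime
    outcome = "closed"
except ValueError as exc:
    outcome = str(exc)
crt_a.close(h)              # only the owning runtime can release it
```

The toy raises a clean error. A real CRT has no such check, which is 
where the random crashes come from.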










