[Numpy-discussion] Possible inconsistency in enumerated type mapping

Francesc Altet faltet at carabos.com
Wed Sep 20 03:52:26 EDT 2006


Hi,

I'm sending a message here because discussing this in the bug tracker is 
not very comfortable. This is my last try before giving up, so don't be 
afraid ;-)

In bug #283 (http://projects.scipy.org/scipy/numpy/ticket/283) I complained 
about the fact that numpy.int32 is being mapped in NumPy to the NPY_LONG 
enumerated type, and I think I failed to explain well why I think this is a 
bad thing. Now I'll try to present a (real-life) example, in the hope that 
it will make things clearer.

Imagine that you are coding a C extension that receives NumPy arrays and 
saves them on disk for later retrieval. Imagine also that a user is using 
your extension on a 32-bit platform. If she passes to this extension an array 
of type 'int32', and the extension tries to read the enumerated type (using 
array.dtype.num), it will get NPY_LONG. So the extension uses this code 
(NPY_LONG) to save the type (together with the data) on disk. Now she sends 
this data file to a teammate who works on a 64-bit machine and tries to read 
the data using the same extension. The extension would see that the data is 
of NPY_LONG type and would try to deserialize it, interpreting the data 
elements as 64-bit integers (this is the size of an NPY_LONG on 64-bit 
platforms), and this is clearly wrong.
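
The scenario above can be imitated without NumPy at all. The following is 
only a sketch: the LONG_SIZE table, the "long" tag and the save()/load() 
helpers are hypothetical stand-ins for a real extension, and the two 
platforms are simulated rather than detected.

```python
import struct

# Assumed sizes of a C long on the two simulated platforms (illustration only).
LONG_SIZE = {"32-bit": 4, "64-bit": 8}
# struct codes for 4- and 8-byte signed integers in standard ("<") mode.
FMT = {4: "i", 8: "q"}

def save(values, platform):
    """Write the data with a size-less "long" tag, as NPY_LONG would be."""
    size = LONG_SIZE[platform]
    payload = struct.pack("<%d%s" % (len(values), FMT[size]), *values)
    return ("long", payload)

def load(record, platform):
    """Read the data back, trusting the reader's own idea of a long."""
    tag, payload = record
    size = LONG_SIZE[platform]
    count = len(payload) // size
    return list(struct.unpack("<%d%s" % (count, FMT[size]), payload))

record = save([1, 2, 3, 4], "32-bit")
print(load(record, "32-bit"))  # [1, 2, 3, 4]
print(load(record, "64-bit"))  # [8589934593, 17179869187]
```

The 64-bit reader silently merges pairs of 32-bit elements into single 
64-bit values, because the tag alone does not carry the element size.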

Besides this, if your C extension uses a C library that is meant to save 
data in a platform-independent way (say, HDF5), then having an NPY_LONG will 
not automatically tell you which of the library's datatypes it maps to, 
because the library only has datatypes that are of a definite size on all 
platforms. So this is a second problem.
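
One of the workarounds is to ignore the enumerated code and build the 
library type from the kind and the item size instead (in NumPy these would 
come from array.dtype.kind and array.dtype.itemsize). A minimal sketch, 
where hdf5_type_name() is a hypothetical helper that only produces the name 
of the HDF5 predefined type as a string:

```python
def hdf5_type_name(kind, itemsize, little_endian=True):
    """Return the name of the fixed-size HDF5 integer type for a given
    kind ('i' signed, 'u' unsigned) and item size in bytes."""
    if kind not in ("i", "u"):
        raise ValueError("only integer kinds are handled in this sketch")
    if itemsize not in (1, 2, 4, 8):
        raise ValueError("unsupported item size: %r" % (itemsize,))
    sign = "I" if kind == "i" else "U"
    order = "LE" if little_endian else "BE"
    return "H5T_STD_%s%d%s" % (sign, itemsize * 8, order)

# An 'int32' array has kind 'i' and itemsize 4 on every platform.
print(hdf5_type_name("i", 4))         # H5T_STD_I32LE
print(hdf5_type_name("u", 8, False))  # H5T_STD_U64BE
```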

Of course there are workarounds for this, but my impression is that they can 
be avoided with a more sensible mapping between NumPy Python types and NumPy 
enumerated types, like:

numpy.int32 --> NPY_INT
numpy.int64 --> NPY_LONGLONG
numpy.int_  --> NPY_LONG

on all platforms, avoiding the current situation where the mapping differs 
between platforms.
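
To see why this mapping would be stable, one can check the sizes of the C 
types behind the three codes (int, long long and long) on the machine at 
hand. This is just a sketch using ctypes; the PROPOSED table and the 
sizes() helper are names made up for the illustration, and the 4-byte int 
is an assumption that holds on the usual 32-bit and 64-bit platforms, not a 
guarantee of the C standard.

```python
import ctypes

# (NumPy Python type, proposed enumerated code, underlying C type)
PROPOSED = [
    ("numpy.int32", "NPY_INT", ctypes.c_int),
    ("numpy.int64", "NPY_LONGLONG", ctypes.c_longlong),
    ("numpy.int_", "NPY_LONG", ctypes.c_long),
]

def sizes():
    """Size in bytes of the C type behind each enumerated code, here."""
    return {code: ctypes.sizeof(ctype) for _, code, ctype in PROPOSED}

for name, code, ctype in PROPOSED:
    print("%-12s --> %-13s (%d bytes on this platform)"
          % (name, code, ctypes.sizeof(ctype)))
```

Only the last line of the output is expected to change from one platform 
to another, which is the point of keeping NPY_LONG for numpy.int_ alone.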

Sorry for being so persistent, but I think the issue is worth it.

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"



