NumPy frombuffer giving nonsense values when reading C float array on Windows

Tue Jul 26 14:09:47 EDT 2016

 On Tue, Jul 26, 2016 at 12:06 PM,  <urschrei at gmail.com> wrote:
> I'm using ctypes to interface with a binary which returns a void pointer (ctypes c_void_p) to a nested 64-bit float array:

If this comes from a function result, are you certain that its restype
is ctypes.c_void_p? I commonly see typos here such as setting
"restypes" instead of "restype".

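To illustrate why that typo goes unnoticed, here is a minimal sketch. It assumes Py_GetVersion from ctypes.pythonapi as a stand-in for a function from your own library. ctypes function pointers accept arbitrary attributes, so the misspelled assignment is silently stored and the result type stays at the default c_int, which cannot hold a 64-bit pointer.

```python
import ctypes

# Stand-in for a function loaded from your own library (assumption:
# any ctypes function pointer behaves the same way).
get_version = ctypes.pythonapi["Py_GetVersion"]

# Typo: this only creates an unrelated attribute named "restypes".
get_version.restypes = ctypes.c_char_p
typo_restype = get_version.restype    # still the default, c_int

# Correct spelling: the result is now converted as a char pointer.
get_version.restype = ctypes.c_char_p
fixed_restype = get_version.restype
version = get_version()               # the interpreter version as bytes
```

For a function that returns a raw address, the same pattern applies with ctypes.c_void_p as the restype.
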
> [[1.0, 2.0], [3.0, 4.0], … ]
> then return the pointer so it can be freed
>
> I'm using the following code to de-reference it:
>
> # a 10 x 2 array (20 elements)
> shape = (10, 2)
> array_size = np.prod(shape)
> mem_size = 8 * array_size
> array_str = ctypes.string_at(ptr, mem_size)
> # convert to NumPy array, and copy to a list
> ls = np.frombuffer(array_str, dtype="float64", count=array_size).reshape(shape).tolist()
> # return pointer so it can be freed
> drop_array(ptr)
> return ls
>
> This works correctly and consistently on Linux and OSX using NumPy 1.11.0, but fails on
> Windows 32 bit and 64-bit about 50% of the time, returning nonsense values. Am I doing
> something wrong? Is there a better way to do this?

numpy.ctypeslib facilitates working with ctypes functions, pointers
and arrays via the factory functions as_array, as_ctypes, and
ndpointer.
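
As a rough sketch of how as_array could replace the string_at / frombuffer step, the following uses a ctypes array as a stand-in for the buffer your library returns (the names source, typed, and view are illustrative only). The c_void_p is cast to a double pointer and wrapped without copying; tolist() then copies the values out, after which the pointer could be handed back to be freed.

```python
import ctypes
import numpy as np

shape = (10, 2)
n = shape[0] * shape[1]

# Stand-in for the library's buffer: 20 doubles owned by ctypes.
source = (ctypes.c_double * n)(*range(n))
ptr = ctypes.cast(source, ctypes.c_void_p)  # what the C function would return

# Cast the untyped pointer to double* and wrap it without copying.
typed = ctypes.cast(ptr, ctypes.POINTER(ctypes.c_double))
view = np.ctypeslib.as_array(typed, shape=shape)

# Copy the values out before the library frees the memory,
# e.g. before calling drop_array(ptr).
ls = view.tolist()
```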

ndpointer creates a c_void_p subclass that overrides the default
from_param method to allow passing arrays as arguments to ctypes
functions and also implements the _check_retval_ hook to automatically
convert a pointer result to a numpy array.

The from_param method validates an array argument to ensure it has the
proper data type, shape, and memory layout. For example:

    g = ctypes.CDLL(None) # Unix only
    Base = np.ctypeslib.ndpointer(dtype='B', shape=(4,))

    # strchr example
    g.strchr.argtypes = (Base, ctypes.c_char)
    g.strchr.restype = ctypes.c_char_p

    d = np.array(list(b'012\0'), dtype='B')
    e = np.array(list(b'0123\0'), dtype='B') # wrong shape

    >>> g.strchr(d, b'0'[0])
    b'012'
    >>> g.strchr(e, b'0'[0])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ctypes.ArgumentError: argument 1: <class 'TypeError'>:
    array must have shape (4,)
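
Since CDLL(None) is Unix only, the same validation can be seen on any platform by calling from_param directly. This is a minimal sketch, not tied to any particular library; the names accepted and error are illustrative only.

```python
import numpy as np

Base = np.ctypeslib.ndpointer(dtype='B', shape=(4,))

d = np.array(list(b'012\0'), dtype='B')   # shape (4,), accepted
e = np.array(list(b'0123\0'), dtype='B')  # shape (5,), rejected

# On success from_param returns the array's ctypes interface object.
accepted = Base.from_param(d)

# On failure it raises TypeError, which ctypes reports as ArgumentError
# when the check happens during a function call.
try:
    Base.from_param(e)
except TypeError as exc:
    error = str(exc)
```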

The _check_retval_ hook of an ndpointer calls numpy.array on the
result of a function. The result's __array_interface__ property is
used to create a copy with the defined data type and shape. For example:

    g.strchr.restype = Base

    >>> d.ctypes._as_parameter_ # source address
    c_void_p(24657952)
    >>> a = g.strchr(d, b'0'[0])
    >>> a
    array([48, 49, 50,  0], dtype=uint8)
    >>> a.ctypes._as_parameter_ # it's a copy
    c_void_p(19303504)

As a copy, the array owns its data:

    >>> a.flags
      C_CONTIGUOUS : True
      F_CONTIGUOUS : True
      OWNDATA : True
      WRITEABLE : True
      ALIGNED : True
      UPDATEIFCOPY : False

You can subclass the ndpointer type to have _check_retval_ instead
return a view of the result (i.e. copy=False), which may be desirable
for a large result array but probably isn't worth it for small arrays.
For example:

    class Result(Base):
        @classmethod
        def _check_retval_(cls, result):
            return np.array(result, copy=False)

    g.strchr.restype = Result

    >>> a = g.strchr(d, b'0'[0])
    >>> a.ctypes._as_parameter_ # it's NOT a copy
    c_void_p(24657952)

Because it's not a copy, the array view doesn't own the data, but note
that it's not a read-only view:

    >>> a.flags
      C_CONTIGUOUS : True
      F_CONTIGUOUS : True
      OWNDATA : False
      WRITEABLE : True
      ALIGNED : True
      UPDATEIFCOPY : False
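
The practical consequence is that a view stays valid only while the underlying memory does, so copy the data before the library frees it. A small sketch of the difference, using a ctypes array as a stand-in for library-owned memory (buf, view, and owned are illustrative names):

```python
import ctypes
import numpy as np

# Stand-in for memory owned by a C library.
buf = (ctypes.c_double * 4)(1.0, 2.0, 3.0, 4.0)

view = np.ctypeslib.as_array(buf)  # shares buf's memory, no copy
view[0] = 99.0                     # writes through to the ctypes array
shared_value = buf[0]

owned = view.copy()                # independent array that owns its data
owned[1] = -1.0                    # does not touch buf
```

Here view.flags.owndata is False and owned.flags.owndata is True, matching the two flag listings above.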