How can one make an ndarray assume ownership of its data? More explicitly, I have a temporary home-made C structure that holds a pointer to an array. I prepare (using Cython) a numpy.ndarray using the PyArray_NewFromDescr function. I can delete my temporary C structure without freeing the memory holding the array, but I would like the numpy.ndarray to become the owner of the data. How do I do such a thing? -- Fabrice Silva
On Thu, Dec 15, 2011 at 16:17, Fabrice Silva <silva@lma.cnrs-mrs.fr> wrote:
How can one make an ndarray assume ownership of its data?
More explicitly, I have a temporary home-made C structure that holds a pointer to an array. I prepare (using Cython) a numpy.ndarray using the PyArray_NewFromDescr function. I can delete my temporary C structure without freeing the memory holding the array, but I would like the numpy.ndarray to become the owner of the data.
How do I do such a thing?
You can't, really. numpy-owned arrays will be deallocated with numpy's deallocator. This may not be the appropriate deallocator for memory that your library allocated. If at all possible, I recommend using numpy to create the ndarray and passing that pointer to your library. Sometimes the library's API gets in the way of this. Otherwise, copy the data.

Devs, looking into this, I noticed that we use PyDataMem_NEW() and PyDataMem_FREE() (which are #defined to malloc() and free()) for handling the data pointer. Why aren't we using the appropriate PyMem_*() functions (or the PyArray_*() memory functions, which default to the PyMem_*() implementations)? Using the PyMem_*() functions lets the Python memory manager keep an accurate idea of how much memory is being used, which can be important for the large amounts of memory that numpy arrays can consume.

I assume this is intentional design. I just want to know the rationale for it and would like it documented. I can certainly understand if it causes bad interactions with the garbage collector, say (though hiding information from the GC seems like a suboptimal approach).

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
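[Editor's note: as an illustration of the approach Robert recommends, here is a minimal Cython sketch, not taken from the thread. The header mylib.h and the function my_lib_fill() are hypothetical stand-ins for an external library's API; NumPy allocates and owns the buffer, and the library only fills it.]

import numpy as np
cimport numpy as cnp

cnp.import_array()

cdef extern from "mylib.h":
    # hypothetical C routine that fills a caller-provided buffer
    void my_lib_fill(double *buf, int n)

def make_filled_array(int n):
    # NumPy allocates the memory, so NumPy also owns it and frees it later
    cdef cnp.ndarray[cnp.double_t, ndim=1] arr = np.empty(n, dtype=np.double)
    my_lib_fill(<double *> arr.data, n)
    return arr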
[snip]
Devs, looking into this, I noticed that we use PyDataMem_NEW() and PyDataMem_FREE() (which are #defined to malloc() and free()) for handling the data pointer. Why aren't we using the appropriate PyMem_*() functions (or the PyArray_*() memory functions, which default to the PyMem_*() implementations)? Using the PyMem_*() functions lets the Python memory manager keep an accurate idea of how much memory is being used, which can be important for the large amounts of memory that numpy arrays can consume.
I assume this is intentional design. I just want to know the rationale for it and would like it documented. I can certainly understand if it causes bad interactions with the garbage collector, say (though hiding information from the GC seems like a suboptimal approach).
The macros were created so that the allocator could be switched once we understood better the benefits and trade-offs of using the Python memory manager versus the system memory manager (or one specialized for NumPy). So the only intentional design was to use the macros (the decision to make them point to malloc and free was more because that's what was being done before than an explicit decision). -Travis
-- Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
---
Travis Oliphant
Enthought, Inc.
oliphant@enthought.com
1-512-536-1057
http://www.enthought.com
On Thu, Dec 15, 2011 at 04:36:24PM +0000, Robert Kern wrote:
More explicitly, I have a temporary home-made C structure that holds a pointer to an array. I prepare (using Cython) a numpy.ndarray using the PyArray_NewFromDescr function. I can delete my temporary C structure without freeing the memory holding the array, but I would like the numpy.ndarray to become the owner of the data.
How do I do such a thing?
You can't, really. numpy-owned arrays will be deallocated with numpy's deallocator. This may not be the appropriate deallocator for memory that your library allocated.
Coming late to the battle, but I recently followed the same route and came to similar conclusions: relying on the OWNDATA flag is not suitable, and you will need your own deallocator. I implemented a demo showing all the steps needed to bind an existing C library with Cython using this strategy: https://gist.github.com/1249305
In particular, the deallocator is in https://gist.github.com/1249305#file_cython_wrapper.pyx
I hope this code sample is useful.
Gael
On 15 Dec 2011, at 17:17, Fabrice Silva wrote:
How can one make an ndarray assume ownership of its data?
More explicitly, I have a temporary home-made C structure that holds a pointer to an array. I prepare (using Cython) a numpy.ndarray using the PyArray_NewFromDescr function. I can delete my temporary C structure without freeing the memory holding the array, but I would like the numpy.ndarray to become the owner of the data.
How do I do such a thing?
There is an excellent blog entry from Travis Oliphant that describes how to create an ndarray from existing data without a copy: http://blog.enthought.com/?p=62
The created array does not actually own the data, but its base attribute points to an object which frees the memory when the numpy array gets deallocated. I guess this is the behavior you want to achieve. Here is a Cython implementation (for a uint8 array).

Gregor

""" see 'NumPy arrays with pre-allocated memory', http://blog.enthought.com/?p=62 """
import numpy as np
from numpy cimport import_array, ndarray, npy_intp, set_array_base, PyArray_SimpleNewFromData, NPY_DOUBLE, NPY_INT, NPY_UINT8

cdef extern from "stdlib.h":
    void* malloc(int size)
    void free(void *ptr)

cdef class MemoryReleaser:
    cdef void* memory

    def __cinit__(self):
        self.memory = NULL

    def __dealloc__(self):
        if self.memory:
            # release memory
            free(self.memory)
            print "memory released", hex(<long>self.memory)

cdef MemoryReleaser MemoryReleaserFactory(void* ptr):
    cdef MemoryReleaser mr = MemoryReleaser.__new__(MemoryReleaser)
    mr.memory = ptr
    return mr

cdef ndarray frompointer(void* ptr, int nbytes):
    import_array()
    #cdef int dims[1]
    #dims[0] = nbytes
    cdef npy_intp dims = <npy_intp>nbytes
    cdef ndarray arr = PyArray_SimpleNewFromData(1, &dims, NPY_UINT8, ptr)
    # TODO: check for error
    set_array_base(arr, MemoryReleaserFactory(ptr))
    return arr

def test_new_array_from_pointer():
    nbytes = 16
    cdef void* mem = malloc(nbytes)
    print "memory allocated", hex(<long>mem)
    return frompointer(mem, nbytes)
On Thursday, 15 December 2011 at 18:09 +0100, Gregor Thalhammer wrote:
There is an excellent blog entry from Travis Oliphant that describes how to create an ndarray from existing data without a copy: http://blog.enthought.com/?p=62 The created array does not actually own the data, but its base attribute points to an object which frees the memory when the numpy array gets deallocated. I guess this is the behavior you want to achieve. Here is a Cython implementation (for a uint8 array).
Even better: the addendum! http://blog.enthought.com/python/numpy/simplified-creation-of-numpy-arrays-f...

Within Cython:

cimport numpy
numpy.set_array_base(my_ndarray, PyCObject_FromVoidPtr(pointer_to_Cobj, some_destructor))

Seems OK. Any objections to that?
-- Fabrice Silva
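[Editor's note: for completeness, a minimal sketch of what such a destructor could look like. The names free_destructor and attach_base are made up here, and it assumes the buffer was obtained from malloc(); this works on Python 2 only, given the deprecation caveat raised in the next message.]

from cpython.cobject cimport PyCObject_FromVoidPtr
from libc.stdlib cimport free
cimport numpy as cnp

cnp.import_array()

cdef void free_destructor(void *ptr):
    # called when the CObject (the array's base) is garbage collected
    free(ptr)

cdef void attach_base(cnp.ndarray arr, void *ptr):
    # make the array's base own the memory, so it is freed with the array
    cnp.set_array_base(arr, PyCObject_FromVoidPtr(ptr, free_destructor))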
On 16 Dec 2011, at 11:53, Fabrice Silva wrote:
On Thursday, 15 December 2011 at 18:09 +0100, Gregor Thalhammer wrote:
There is an excellent blog entry from Travis Oliphant that describes how to create an ndarray from existing data without a copy: http://blog.enthought.com/?p=62 The created array does not actually own the data, but its base attribute points to an object which frees the memory when the numpy array gets deallocated. I guess this is the behavior you want to achieve. Here is a Cython implementation (for a uint8 array).
Even better: the addendum! http://blog.enthought.com/python/numpy/simplified-creation-of-numpy-arrays-f...
Within Cython:

cimport numpy
numpy.set_array_base(my_ndarray, PyCObject_FromVoidPtr(pointer_to_Cobj, some_destructor))
Seems OK. Any objections to that?
This is ok, but CObject is deprecated as of Python 3.1, so it's not portable to Python 3.2. Gregor
On Friday, 16 December 2011 at 15:33 +0100, Gregor Thalhammer wrote:
Even better: the addendum! http://blog.enthought.com/python/numpy/simplified-creation-of-numpy-arrays-f...
Within Cython:

cimport numpy
numpy.set_array_base(my_ndarray, PyCObject_FromVoidPtr(pointer_to_Cobj, some_destructor))
Seems OK. Any objections to that?
This is ok, but CObject is deprecated as of Python 3.1, so it's not portable to Python 3.2.
My guess is then that the PyCapsule object is the way to go... -- Fabrice Silva
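[Editor's note: for reference, a rough, untested Cython sketch of that PyCapsule variant; the capsule name "array memory" and the helper wrap_pointer are arbitrary choices. The capsule becomes the array's base object and frees the buffer when the array is collected.]

import numpy as np
cimport numpy as cnp
from cpython.pycapsule cimport PyCapsule_New, PyCapsule_GetPointer
from libc.stdlib cimport free

cnp.import_array()

cdef void capsule_destructor(object capsule):
    # invoked when the capsule (and thus the wrapping ndarray) is collected
    free(PyCapsule_GetPointer(capsule, "array memory"))

cdef cnp.ndarray wrap_pointer(void *ptr, cnp.npy_intp n):
    cdef cnp.ndarray arr = cnp.PyArray_SimpleNewFromData(1, &n, cnp.NPY_UINT8, ptr)
    cnp.set_array_base(arr, PyCapsule_New(ptr, "array memory", capsule_destructor))
    return arr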
On 12/16/2011 04:16 PM, Fabrice Silva wrote:
On Friday, 16 December 2011 at 15:33 +0100, Gregor Thalhammer wrote:
Even better: the addendum! http://blog.enthought.com/python/numpy/simplified-creation-of-numpy-arrays-f...
Within Cython:

cimport numpy
numpy.set_array_base(my_ndarray, PyCObject_FromVoidPtr(pointer_to_Cobj, some_destructor))
Seems OK. Any objections to that?
This is ok, but CObject is deprecated as of Python 3.1, so it's not portable to Python 3.2.
My guess is then that the PyCapsule object is the way to go...
Another way: with recent NumPy you should be able to do something like this in Cython:

cdef class SomeBufferWrapper:
    ...
    def __getbuffer__(self, ...):
        ...
    def __releasebuffer__(self, ...):
        ...

arr = np.asarray(SomeBufferWrapper(buf))

and then __releasebuffer__ will be called when `arr` goes out of use. See the Cython docs.

Dag
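[Editor's note: a rough sketch of how such a wrapper could be fleshed out, assuming a 1-D buffer of doubles and hypothetical library functions my_lib_alloc()/my_lib_free(); it is not from the thread.]

import numpy as np

cdef extern from "mylib.h":
    # hypothetical allocator and matching free function from the C library
    double *my_lib_alloc(int n)
    void my_lib_free(double *buf)

cdef class DoubleBufferWrapper:
    cdef double *buf
    cdef Py_ssize_t n
    cdef Py_ssize_t shape[1]
    cdef Py_ssize_t strides[1]

    def __cinit__(self, int n):
        self.buf = my_lib_alloc(n)
        self.n = n

    def __dealloc__(self):
        # the C memory lives exactly as long as the wrapper object
        if self.buf != NULL:
            my_lib_free(self.buf)

    def __getbuffer__(self, Py_buffer *view, int flags):
        self.shape[0] = self.n
        self.strides[0] = sizeof(double)
        view.obj = self
        view.buf = <void *> self.buf
        view.len = self.n * sizeof(double)
        view.itemsize = sizeof(double)
        view.readonly = 0
        view.format = b"d"
        view.ndim = 1
        view.shape = self.shape
        view.strides = self.strides
        view.suboffsets = NULL
        view.internal = NULL

    def __releasebuffer__(self, Py_buffer *view):
        pass

# usage: arr = np.asarray(DoubleBufferWrapper(100)) gives a zero-copy view;
# the wrapper (and hence the C buffer) stays alive as long as arr does.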
participants (6)
- Dag Sverre Seljebotn
- Fabrice Silva
- Gael Varoquaux
- Gregor Thalhammer
- Robert Kern
- Travis Oliphant