[Numpy-discussion] Freeing memory allocated in C
Travis Oliphant
oliphant.travis at ieee.org
Thu Apr 27 21:40:04 EDT 2006
Nick Fotopoulos wrote:
> Dear numpy-discussion,
>
> I have written a python module in C which wraps a C library (FrameL)
> in order to read data from specially formatted files into Python
> arrays. It works, but I think I have a memory leak, and I can't see
> what I might be doing wrong. This Python wrapper is almost identical
> to a Matlab wrapper, but the Matlab version doesn't leak. Perhaps
> someone here can help me out?
>
> I have read in many places that to return an array, one should wrap
> with PyArray_FromDimsAndData (or more modern versions) and then return
> it without freeing the memory. Does the same principle hold for
> strings? Are the following example snippets correct?
Why don't you just use PyArray_FromDims and let NumPy manage the
memory? PyArray_FromDimsAndData is only for situations where the memory
cannot be managed by Python. The resulting array does not own its data,
so NumPy never frees that memory.
If you do want to have NumPy deallocate the memory when you are done,
then you have to
1) Make sure you are using the same allocator as NumPy. _pya_malloc
is defined in arrayobject.h (in NumPy but not in Numeric).
2) Set the OWN_DATA flag on the array:
out2->flags |= OWN_DATA;
As long as you are using the same memory allocator, this should work.
The OWN_DATA flag instructs the deallocator to free the data.
But I would strongly suggest just using PyArray_FromDims and letting
NumPy allocate the new array for you.
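To make the ownership rule concrete, here is a toy model in plain C. It is not NumPy's deallocator: ToyArray, TOY_OWN_DATA, toy_array_wrap, toy_array_dealloc and the counter toy_buffers_freed are all hypothetical names made up for this sketch. It only illustrates the idea that a wrapper releases the data buffer when, and only when, its ownership flag is set.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag standing in for NumPy's OWN_DATA. */
#define TOY_OWN_DATA 0x1

/* Hypothetical, stripped-down stand-in for an array object. */
typedef struct {
    double *data;
    int flags;
} ToyArray;

/* Counts how many data buffers the toy deallocator has released. */
int toy_buffers_freed = 0;

/* Wrap an existing buffer without taking ownership of it,
   in the spirit of PyArray_FromDimsAndData. */
ToyArray *toy_array_wrap(double *data)
{
    ToyArray *arr = malloc(sizeof *arr);
    if (arr == NULL)
        return NULL;
    arr->data = data;
    arr->flags = 0;
    return arr;
}

/* Destroy the wrapper; free the buffer only if the wrapper owns it. */
void toy_array_dealloc(ToyArray *arr)
{
    if (arr->flags & TOY_OWN_DATA) {
        free(arr->data);
        toy_buffers_freed++;
    }
    free(arr);
}
```

In this model a buffer wrapped with toy_array_wrap and then dropped is never released unless the caller frees it separately, which is the leak described above. Setting the flag before the wrapper is destroyed hands the buffer to the deallocator, and that is only safe when the buffer came from the allocator the deallocator expects.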
>
> // output2 = x-axis values relative to first data point.
> data = malloc(nData*sizeof(double));
> for(i=0; i<nData; i++) {
> data[i] = vect->startX[0]+(double)i*dt;
> }
> shape[0] = nData;
> out2 = (PyArrayObject *)
> PyArray_FromDimsAndData(1,shape,PyArray_DOUBLE,(char *)data);
>
> //snip
>
> // output5 = gps start time as a string
> utc = vect->GTime - vect->ULeapS + FRGPSTAI;
> out5 = malloc(200*sizeof(char));
> sprintf(out5,"Starting GPS time:%.1f UTC=%s",
> vect->GTime,FrStrGTime(utc));
>
> //snip -- Free all memory not assigned to a return object
>
> return Py_BuildValue("(OOOdsss)",out1,out2,out3,out4,out5,out6,out7);
>
>
> I see in the Numpy book that I should modernize
> PyArray_FromDimsAndData, but will it be incompatible with users who
> have only Numeric?
Yes. The only issue with staying on the old API, however, is that
PyArray_FromDims and friends accept only int-length sizes, which on
64-bit computers are smaller than intp-length sizes. So, if you don't
care about allowing large sizes, then you can use the old Numeric C-API.
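As a rough illustration of the size limit, the sketch below checks whether an element count fits in an int. It uses the standard intptr_t as a stand-in for NumPy's intp type, and length_fits_in_int is a hypothetical helper, not part of either C-API.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Return 1 if an element count can be passed through an int-based
   dimension argument without truncation, 0 otherwise. */
int length_fits_in_int(intptr_t n)
{
    return n >= 0 && n <= (intptr_t)INT_MAX;
}
```

On a typical 64-bit platform, where int is 32 bits and pointers are 64 bits, any count above INT_MAX fails this check even though the machine could address that many elements.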
>
> If the code above should not leak under your inspection, are there any
> other common places that python C modules often leak that I should check?
All of the malloc calls in your code leak. In general you should not
assume that Python will deallocate memory you have allocated. Python
uses its own memory manager, so even if you manage to arrange things so
that Python will free your memory (and you really have to hack things to
do that), you can run into trouble if you try mixing system malloc
calls with Python's deallocation.
The proper strategy for your arrays is to use PyArray_SimpleNew and then
get the data pointer to fill using PyArray_DATA(...). The proper way to
handle strings is to create a new Python string object and then return
everything as objects. Note that PyString_FromFormat does not support
floating-point conversions such as %.1f, so format into a local buffer
first and pass it to PyString_FromString.
/* make sure shape is defined as intp unless you don't care about 64-bit */
out2 = (PyArrayObject *)PyArray_SimpleNew(1, shape, PyArray_DOUBLE);
data = (double *)PyArray_DATA(out2);
[snip...]
/* buf is a local buffer, e.g. char buf[200]; */
snprintf(buf, sizeof(buf), "Starting GPS time:%.1f UTC=%s",
vect->GTime, FrStrGTime(utc));
out5 = PyString_FromString(buf);
return Py_BuildValue("(NNNdNNN)",out1,out2,out3,out4,out5,out6,out7);
Make sure you use the 'N' tag so that another reference count isn't
generated. The 'O' tag will increase the reference count of your
objects by one, which is not necessarily what you want (but sometimes
you do).
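The difference between the two tags can be sketched with a toy reference counter in plain C. This is not the Python C-API: ToyObject, ToyTuple, toy_store_O, toy_store_N and toy_tuple_release are hypothetical names that only mimic the counting behaviour described above.

```c
#include <assert.h>

/* Hypothetical stand-in for a reference-counted object. */
typedef struct {
    int refcount;
} ToyObject;

/* Hypothetical stand-in for a one-slot container such as a tuple. */
typedef struct {
    ToyObject *item;
} ToyTuple;

/* Like the 'O' tag: the container takes its own, additional reference. */
void toy_store_O(ToyTuple *t, ToyObject *obj)
{
    obj->refcount++;
    t->item = obj;
}

/* Like the 'N' tag: the container takes over the caller's reference. */
void toy_store_N(ToyTuple *t, ToyObject *obj)
{
    t->item = obj;
}

/* Destroying the container drops the one reference it holds and
   returns the count that remains. */
int toy_tuple_release(ToyTuple *t)
{
    t->item->refcount--;
    return t->item->refcount;
}
```

A freshly created object starts with a count of one. Stored with the 'O'-style function and never released by its creator, it still has a count of one after the container goes away, so it is never reclaimed. Stored with the 'N'-style function, its count drops to zero when the container is released.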
Good luck,
-Travis