
Hi there,

given the following (simplified) scenario:

    typedef struct {
        PyObject_HEAD
        float bar[10];
    } FooObject;

I want to be able to set and retrieve the elements of bar from Python using e.g.

    foo = Foo()
    foo.bar[4] = 1.23
    x = foo.bar[4]

I have chosen an approach using 'PyArray_FromDimsAndData'. In fact I programmed it after studying 'arrayobject.c', namely the part in 'array_getattr' where the 'flat' attribute is accessed.

    static PyObject *
    foo_getattr(FooObject *self, char *name)
    {
        if (!strcmp(name, "bar")) {
            int n = 10;
            /* Wraps self->bar without copying; the array does not own the data. */
            PyObject *bar = PyArray_FromDimsAndData(1, &n, PyArray_FLOAT, (char *)self->bar);
            if (bar == NULL)
                return NULL;
            return bar;
        }
        return Py_FindMethod(foo_methods, (PyObject *)self, name);
    }

And it works! :-)

BUT how about refcounts here? 'PyArray_FromDimsAndData' will return an array which only contains a reference to foo's original bar array; that's why I can both set and access the latter the way described. And no memory leak is created.

But what if I create a reference to foo.bar, and later delete foo, i.e.

    b = foo.bar
    del foo

Now the data pointer in b refers to freed data! In the mentioned 'array_getattr' this appears to be solved by increasing the refcount; in the above example this would mean 'Py_INCREF(self)' before returning 'bar'. Then if deleting 'foo', its memory is not freed because the refcount is not zero. But AFAICS in this case (as well as in the Numeric code) the INCREF prevents the object from EVER being freed. Who would DECREF the object? Or am I misunderstanding something here?

In my actual code I can perfectly live with the above solution because I only need to access foo's data using 'foo.bar[i]' and probably never need to create a reference to 'bar' which might survive the actual 'foo' object. However, I want to program it the 'clean' way; any hints on how to do it properly would therefore be highly welcome.

Cheers,
Joachim

Joachim Saul <list@jsaul.de> writes:
> But what if I create a reference to foo.bar, and later delete foo, i.e.
>
>     b = foo.bar
>     del foo
>
> Now the data pointer in b refers to freed data! In the mentioned
And that is why the condition for using PyArray_FromDimsAndData is that the data space passed is not freed before the end of the process.
> survive the actual 'foo' object. However, I want to program it the 'clean' way; any hints on how to do it properly would therefore be highly welcome.
I see only one clean solution: implement your own array-like object that represents foo.bar. This object would keep a reference to foo and release it when it is itself destroyed.

Konrad.
--
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
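
The ownership rule described above (the view keeps a reference to foo and releases it when the view itself is destroyed) can be modelled in a few lines of plain C. This is only a sketch with hand-rolled reference counts, not the Python or Numeric C API: the names Owner, View, owner_decref, view_decref and the counter owners_freed are all hypothetical and exist purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for FooObject: a reference count plus the embedded data. */
typedef struct {
    int refcount;
    float bar[10];
} Owner;

/* Toy stand-in for the array-like object that represents foo.bar. */
typedef struct {
    int refcount;
    Owner *owner;  /* strong reference that keeps the data alive */
    float *data;   /* points into owner->bar; never freed by the view */
    int length;
} View;

int owners_freed = 0;  /* demonstration counter, not part of the technique */

Owner *owner_new(void) {
    Owner *o = calloc(1, sizeof *o);
    if (o != NULL)
        o->refcount = 1;
    return o;
}

void owner_decref(Owner *o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        owners_freed++;
        free(o);
    }
}

/* Creating the view takes one extra reference to the owner. */
View *view_new(Owner *o) {
    View *v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;
    v->refcount = 1;
    v->owner = o;
    o->refcount++;
    v->data = o->bar;
    v->length = 10;
    return v;
}

/* Destroying the view releases that reference exactly once. */
void view_decref(View *v) {
    assert(v->refcount > 0);
    if (--v->refcount == 0) {
        owner_decref(v->owner);
        free(v);
    }
}
```

In this model, the Python sequence "b = foo.bar; del foo; del b" corresponds to view_new(foo), owner_decref(foo), view_decref(b): the owner survives the second call because the view still holds a reference, and is freed by the third.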

On 2 Dec 2002, Konrad Hinsen wrote:
> Joachim Saul <list@jsaul.de> writes:
> > But what if I create a reference to foo.bar, and later delete foo, i.e.
> >
> >     b = foo.bar
> >     del foo
> >
> > Now the data pointer in b refers to freed data! In the mentioned
Forgive me for jumping in. But why should the data be deleted when you do this? Shouldn't del foo merely decrease the reference count of foo.bar? Because there are still outstanding references to foo.bar (i.e. b) then the data itself shouldn't be freed.

Perhaps I don't understand the question well enough.

-Travis

On Mon, Dec 02, 2002 at 11:59:22AM +0100, Joachim Saul wrote:
> Now the data pointer in b refers to freed data! In the mentioned 'array_getattr' this appears to be solved by increasing the refcount; in the above example this would mean 'Py_INCREF(self)' before returning 'bar'. Then if deleting 'foo', its memory is not freed because the refcount is not zero. But AFAICS in this case (as well as in the Numeric code) the INCREF prevents the object from EVER being freed. Who would DECREF the object?
Something similar came up a few weeks ago: how do you pass data owned by something else as a Numeric array, while keeping track of when to delete the data?

It's so simple I almost kicked myself when I saw it, from the code at http://pobox.com/~kragen/sw/arrayfrombuffer/ which allows you to use memory-mapped files as arrays.

The idea is that a PyArrayObject has a member 'base', which is DECREF'd when the array is deallocated. The idea is for when arrays are slices of other arrays, deallocating the slice will decrease the reference count of the original. However, we can subvert this by using our own base, that knows how to deallocate our data.

In your case, the DECREF'ing is all you need, so you could use

    static PyObject *
    foo_getattr(FooObject *self, char *name)
    {
        if (!strcmp(name, "bar")) {
            int n = 10;
            PyObject *bar = PyArray_FromDimsAndData(1, &n, PyArray_FLOAT, (char *)self->bar);
            if (bar == NULL)
                return NULL;
            /***** new stuff here *******/
            Py_INCREF(self);  /* this reference is released when the array is deallocated */
            ((PyArrayObject *)bar)->base = (PyObject *)self;
            /***********/
            return bar;
        }
        return Py_FindMethod(foo_methods, (PyObject *)self, name);
    }

So, now with

    b = foo.bar
    del foo

b will still reference the original foo object. Now do 'del b', and foo will be DECREF'd, freeing it if b held the only remaining reference.
This can be extended: say you've allocated memory from some memory pool that has to be freed with, say, 'my_pool_free'. You can create a Numeric array from this without copying by

    PyArrayObject *A = (PyArrayObject *)PyArray_FromDimsAndData(1, dims, PyArray_DOUBLE, (char *)data);
    /* The CObject calls my_pool_free(data) when its last reference goes away. */
    A->base = PyCObject_FromVoidPtr(data, my_pool_free);

Then A will be a PyArrayObject, that, when the last reference is deleted, will DECREF A->base, which will free the memory.

Easy, huh?

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/cookedm/
|cookedm@physics.mcmaster.ca
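
The mechanism behind the pool example above can be shown in isolation with a plain C model: a small "base" object carries a pointer and a destructor callback, and the array releases its one reference to that base when it is deallocated. This is a sketch under stated assumptions and not Numeric's actual implementation: Base, Array, base_new, base_decref, array_from_data, array_free and the counter pool_frees are hypothetical names, and my_pool_free simply wraps free() here.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for the object stored in 'base': a reference count,
   the guarded pointer, and the function that releases it. */
typedef struct {
    int refcount;
    void *ptr;
    void (*destructor)(void *);
} Base;

/* Toy stand-in for the array: borrows 'data', owns one reference to 'base'. */
typedef struct {
    double *data;
    int length;
    Base *base;
} Array;

int pool_frees = 0;  /* demonstration counter, not part of the technique */

/* Hypothetical pool deallocator; in this sketch it just wraps free(). */
void my_pool_free(void *p) {
    pool_frees++;
    free(p);
}

Base *base_new(void *ptr, void (*destructor)(void *)) {
    Base *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->refcount = 1;
    b->ptr = ptr;
    b->destructor = destructor;
    return b;
}

/* Dropping the last reference runs the destructor on the guarded pointer. */
void base_decref(Base *b) {
    assert(b->refcount > 0);
    if (--b->refcount == 0) {
        b->destructor(b->ptr);
        free(b);
    }
}

/* Wraps existing memory without copying. The array takes over the
   caller's reference to 'base', as the A->base assignment does. */
Array *array_from_data(double *data, int length, Base *base) {
    Array *a = malloc(sizeof *a);
    if (a == NULL)
        return NULL;
    a->data = data;
    a->length = length;
    a->base = base;
    return a;
}

/* Deallocating the array releases the base, which frees the data. */
void array_free(Array *a) {
    base_decref(a->base);
    free(a);
}
```

The array never calls free() on its data directly; the only path to the destructor is through the base's reference count, so several arrays could share one base as long as each holds its own reference.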

cookedm@arbutus.physics.mcmaster.ca (David M. Cooke) writes:
> The idea is that a PyArrayObject has a member 'base', which is DECREF'd when the array is deallocated. The idea is for when arrays are slices of
Indeed, but this is an undocumented implementation feature. Use at your own risk.

Konrad.

On Tue, Dec 03, 2002 at 11:03:20AM +0100, Konrad Hinsen wrote:
> cookedm@arbutus.physics.mcmaster.ca (David M. Cooke) writes:
> > The idea is that a PyArrayObject has a member 'base', which is DECREF'd when the array is deallocated. The idea is for when arrays are slices of
>
> Indeed, but this is an undocumented implementation feature. Use at your own risk.
Nope, documented implementation feature. From the C API documentation,

    PyObject * base
        Used internally in arrays that are created as slices of other arrays. Since the new array shares its data area with the old one, the original array's reference count is incremented. When the subarray is garbage collected, the base array's reference count is decremented.

Looking through Numeric's code, nothing requires base to be an array object. Besides, Numeric isn't going to change substantially before Numarray replaces it (although I don't know the analogue of this trick in Numarray). The usefulness of this trick (IMHO) outweighs the small chance of the interface changing.

--
David M. Cooke
participants (4)
- cookedm@arbutus.physics.mcmaster.ca
- Joachim Saul
- Konrad Hinsen
- Travis Oliphant