Some questions about PyArray_As2D.
Hi,

I have two questions about the PyArray_As2D function:

1) Looking at the source code I realized that the prototype given in the manual is different from that of the source:

    PyArray_As2D(PyObject *op, char **ptr, int *m, int *n, int type)   (from manual)
    PyArray_As2D(PyObject **op, char ***ptr, int *d1, int *d2, int typecode)   (from source)

As far as I can understand the source version is correct while the manual's one is not. Am I wrong?

2) Next in the source code I've found the following memory allocation:

    data = (char **)malloc(n*sizeof(char *));

without checking whether malloc returns NULL. As far as I know this is not safe, even if it is very unlikely that this malloc would fail. In that case the following:

    ...
    data[i] = ap->data + i*ap->strides[0];
    ...

would cause the function to abort. Again, am I wrong?

Thanks in advance,
Andrea.

---
Andrea Riciputi
mailto:andrea.riciputi@libero.it

"Science is like sex: sometimes something useful comes out, but that is not the reason we are doing it" -- (Richard Feynman)
Andrea Riciputi
As far as I can understand the source version is correct while the manual's one is not. Am I wrong?
The source code always wins the argument :-)
2) Next in the source code I've found the following memory allocation:
data = (char **)malloc(n*sizeof(char *));
without checking whether malloc returns NULL. As far as I know this is not safe, even if it is very unlikely that this malloc would fail.
Right. Did you submit a bug report?

Konrad.
--
Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2 | Deutsch/Esperanto/English/
France | Nederlands/Francais
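The missing check discussed above can be sketched in plain C. This is only an illustration of the pattern, not the Numeric source: the Array2D struct and make_row_pointers are hypothetical stand-ins for the fields the quoted code reads (ap->data, ap->dimensions[0], ap->strides[0]), chosen so that the fragment compiles without the Python headers.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the parts of a contiguous 2-D array that the
   quoted code uses; this is NOT Numeric's PyArrayObject. */
typedef struct {
    char *data;        /* start of the contiguous buffer */
    int   nrows;       /* plays the role of ap->dimensions[0] */
    int   row_stride;  /* plays the role of ap->strides[0], in bytes */
} Array2D;

/* Build a table of row pointers into a->data.
   Returns NULL if the array is missing or empty, or if malloc fails, so
   the caller can report an error instead of writing through NULL. */
char **make_row_pointers(const Array2D *a)
{
    char **rows;
    int i;

    if (a == NULL || a->nrows <= 0)
        return NULL;

    rows = malloc((size_t)a->nrows * sizeof *rows);
    if (rows == NULL)          /* the check that the quoted code omits */
        return NULL;

    for (i = 0; i < a->nrows; i++)
        rows[i] = a->data + (size_t)i * (size_t)a->row_stride;

    return rows;               /* the caller releases it with free() */
}
```

In an actual extension function the failure branch would presumably also set a Python exception (for instance with PyErr_NoMemory()) and release the new array before returning -1; that part is left out here because it depends on the surrounding code.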
On Wednesday, Dec 11, 2002, at 14:33 Europe/Rome, Konrad Hinsen wrote:
2) Next in the source code I've found the following memory allocation:
data = (char **)malloc(n*sizeof(char *));
without checking whether malloc returns NULL. As far as I know this is not safe, even if it is very unlikely that this malloc would fail.
Right. Did you submit a bug report?
Just done. By the way, reading the code again and again I came up with another question. Here is the complete code fragment:

    extern int PyArray_As2D(PyObject **op, char ***ptr, int *d1, int *d2,
                            int typecode) {
        PyArrayObject *ap;
        int i, n;
        char **data;

        if ((ap = (PyArrayObject *)PyArray_ContiguousFromObject(*op,
                typecode, 2, 2)) == NULL)
            return -1;

        n = ap->dimensions[0];
        data = (char **)malloc(n*sizeof(char *));
        for(i=0; i<n; i++) {
            data[i] = ap->data + i*ap->strides[0];
        }
        *op = (PyObject *)ap;     <=== It doesn't sound good to me!!!
        *ptr = data;
        *d1 = ap->dimensions[0];
        *d2 = ap->dimensions[1];
        return 0;
    }

Looking at the marked line I started wondering about the fate of the object originally pointed to by op. Without explicitly deallocating it you lose any chance of reaching it. The result is a memory leak.
No. op is an input parameter and thus a "borrowed" reference. It might not be the best coding style to reuse that variable name for something unrelated later on, but it doesn't cause a memory leak.
I'm very interested in this topic because I'm writing some Python extensions and I'd like to understand how I have to handle all these objects correctly. So how "long" does a Python object live? How can I release correctly the allocated memory?
There is an explanation of this topic in chapter 1.10 of the Python "Extending and Embedding" manual.

Basically, you have to increase an object's reference counter if you want to keep a reference to it beyond the end of the currently running function, and to decrease the counter if you want to release that reference. Whenever the reference count goes to zero, the object is deleted.

With very few exceptions, functions that take an object as input to work on (but not store) don't increase the reference counter. They don't have to, because nothing can happen to the object until the function terminates. Special precautions need to be taken for multithreading, but these go much beyond object reference counting.

As a rule of thumb, you don't have to worry about reference counting at all as long as you only write C functions that act on existing objects. It becomes an issue when you define your own extension types in C.

Konrad.
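The counting rules described above can be illustrated with a toy model in plain C. Everything in it (Obj, obj_new, obj_incref, obj_decref, obj_read, obj_store, obj_release_stored, live_count) is a hypothetical name invented for this sketch; it is not the CPython API, only a small imitation of what its reference counters do.

```c
#include <stdlib.h>

/* Toy object with a reference counter; an illustrative model only. */
typedef struct {
    int refcnt;
    int value;
} Obj;

static int  live_objects = 0;   /* number of Obj currently allocated */
static Obj *stored = NULL;      /* a reference kept across calls */

/* Create an object; the caller owns the single initial reference. */
Obj *obj_new(int value)
{
    Obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refcnt = 1;
    o->value = value;
    live_objects++;
    return o;
}

void obj_incref(Obj *o) { o->refcnt++; }

/* Drop one reference; the object is deleted when the count hits zero. */
void obj_decref(Obj *o)
{
    if (--o->refcnt == 0) {
        live_objects--;
        free(o);
    }
}

/* Works on a borrowed reference: reads the object and keeps nothing,
   so the counter is left alone. */
int obj_read(const Obj *o) { return o->value; }

/* Keeps the object beyond the end of the call, so it takes its own
   reference (incref first, in case o is already the stored object). */
void obj_store(Obj *o)
{
    obj_incref(o);
    if (stored != NULL)
        obj_decref(stored);
    stored = o;
}

/* Release the reference taken by obj_store. */
void obj_release_stored(void)
{
    if (stored != NULL) {
        obj_decref(stored);
        stored = NULL;
    }
}

int live_count(void) { return live_objects; }
```

In this model a caller that does obj_new, obj_store and then obj_decref on its own reference leaves the object alive, because the stored reference still counts; the object goes away only once obj_release_stored drops the last reference.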
On Wednesday, Dec 11, 2002, at 19:43 Europe/Rome, Konrad Hinsen wrote:
No. op is an input parameter and thus a "borrowed" reference. It might not be the best coding style to reuse that variable name for something unrelated later on, but it doesn't cause a memory leak.
I'm sorry, but I don't agree. I've read your answer many times and re-read the code, but I still think I am right. The function prototype says:

    extern int PyArray_As2D(PyObject **op, char ***ptr, int *d1, int *d2, int typecode)

and a call to this function looks like this:

    PyObject *input;
    double **result;
    int nrows, ncols;

    PyArray_As2D(&input, (char ***)&result, &nrows, &ncols, PyArray_DOUBLE);

Now when you call the function in this way, op is a pointer to the pointer that points to your original array object. It allows you to change the memory address originally stored in input. And it is exactly what you do with the instruction:

    *op = (PyObject *)ap;

So you create a new PyArrayObject (allocating another memory area) by means of PyArray_ContiguousFromObject and name it ap, then you modify the memory address which op points to with the above instruction. Now *op (that is, input) points to another memory region, but you haven't deallocated the previously pointed-to memory, and it _is_ a memory leak!

These are my two cents, comments are welcome.

Cheers,
Andrea.
The function prototype says:
extern int PyArray_As2D(PyObject **op, char ***ptr, int *d1, int *d2, int typecode)
You are right that my first comment doesn't quite apply, I hadn't noticed the two stars... But the code is still OK, it is the calling routine that is responsible for avoiding memory leaks.
is exactly what you do with the instruction:
*op = (PyObject *)ap;
So you create a new PyArrayObject (allocating another memory area) by means of PyArray_ContiguousFromObject and names it ap, then you modify the memory address which op points to with the above instruction. Now
Right. This routine cannot know if that reference contains an "owned" or a "borrowed" reference. In the first case, the reference counter of the original array must be decreased, in the second case not. Assume for example that the calling routine got an array passed in as a borrowed reference:

    void foo(PyObject *array) {
        /* I want a 2D version! */
        PyArray_As2D(&array, ...)
    }

In that case, decreasing the reference count in PyArray_As2D would be disastrous.

In the case of an owned reference, the calling routine must keep a copy of the pointer, call PyArray_As2D, and then decrease the reference counter on the original pointer. This ought to be easier.

In fact, the interface of PyArray_As2D is pretty badly designed. It should not overwrite the original pointer, but return a new one. I also don't see the point in passing all the array information in additional arguments - they are easy to obtain from the new array object.

Konrad.
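The two calling patterns described above (borrowed versus owned reference) can be sketched with a toy model in plain C. All names here (Obj, obj_new, obj_decref, convert_in_place, use_borrowed, convert_owned, live_count) are hypothetical and stand in for the real objects and for a routine that, like PyArray_As2D, overwrites the caller's pointer with a new object. That the caller releases the new object when it is finished with it is an assumption of this sketch.

```c
#include <stdlib.h>

typedef struct { int refcnt; } Obj;   /* toy object, not a PyObject */

static int live_objects = 0;

Obj *obj_new(void)
{
    Obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refcnt = 1;
    live_objects++;
    return o;
}

void obj_decref(Obj *o)
{
    if (--o->refcnt == 0) {
        live_objects--;
        free(o);
    }
}

int live_count(void) { return live_objects; }

/* Mimics the interface under discussion: replaces *op with a new object
   and leaves the reference counter of the old one untouched. */
int convert_in_place(Obj **op)
{
    Obj *fresh = obj_new();
    if (fresh == NULL)
        return -1;
    *op = fresh;
    return 0;
}

/* Borrowed reference: the original belongs to our caller, so we must not
   decrease its counter. Only the local copy of the pointer is changed. */
int use_borrowed(Obj *array)
{
    if (convert_in_place(&array) != 0)
        return -1;
    /* ... work with the converted array here ... */
    obj_decref(array);      /* release the new object (sketch assumption) */
    return 0;
}

/* Owned reference: keep a copy of the pointer, convert, then release the
   original. Returns the new object, which the caller now owns. On failure
   returns NULL and the caller still owns the original. */
Obj *convert_owned(Obj *owned)
{
    Obj *original = owned;
    if (convert_in_place(&owned) != 0)
        return NULL;
    obj_decref(original);
    return owned;
}
```

A routine that returned the new object instead of overwriting the pointer would make the owned case a plain "convert, then release the old one" sequence, without the need for the extra copy.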
Participants (2): Andrea Riciputi, Konrad Hinsen