Mailman 3 Ticket review: #848, leak in PyArray_DescrFromType - NumPy-Discussion

Ticket review: #848, leak in PyArray_DescrFromType

older
arccosh for complex numbers, goofy...

Michael Abbott

15 Jul 2008 15 Jul '08

7:48 a.m.

Only half of my patch for this bug has gone into trunk, and without the rest of my patch there remains a leak. Furthermore, it remains necessary to perform an extra INCREF on typecode before calling PyArray_FromAny ... as otherwise there is the real possibility that typecode will have evaporated by the time it's passed to scalar_value later down in the routine. I attach the patch I believe remains necessary against this routine: --- numpy/core/src/scalartypes.inc.src (revision 5411) +++ numpy/core/src/scalartypes.inc.src (working copy) @@ -1925,19 +1925,30 @@ goto finish; } + Py_XINCREF(typecode); arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL); - if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr; + if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) { + Py_XDECREF(typecode); + return arr; + } robj = PyArray_Return((PyArrayObject *)arr); finish: - if ((robj==NULL) || (robj->ob_type == type)) return robj; + if ((robj==NULL) || (robj->ob_type == type)) { + Py_XDECREF(typecode); + return robj; + } /* Need to allocate new type and copy data-area over */ if (type->tp_itemsize) { itemsize = PyString_GET_SIZE(robj); } else itemsize = 0; obj = type->tp_alloc(type, itemsize); - if (obj == NULL) {Py_DECREF(robj); return NULL;} + if (obj == NULL) { + Py_XDECREF(typecode); + Py_DECREF(robj); + return NULL; + } if (typecode==NULL) typecode = PyArray_DescrFromType(PyArray_@TYPE@); dest = scalar_value(obj, typecode); I can see that there might be an argument that PyArray_FromAny has the semantics that it retains a reference to typecode unless it returns NULL ... but I don't want to go there. That would not be a good thing to rely on -- and even with those semantics the existing code still needs fixing.

Show replies by date

Michael Abbott

15 Jul 15 Jul

3:28 p.m.

On Tue, 15 Jul 2008, Michael Abbott wrote:

...

Only half of my patch for this bug has gone into trunk, and without the rest of my patch there remains a leak.

I think I might need to explain a little more about the reason for this patch, because obviously the bug it fixes was missed the last time I posted on this bug. So here is the missing part of the patch:

...

--- numpy/core/src/scalartypes.inc.src (revision 5411) +++ numpy/core/src/scalartypes.inc.src (working copy) @@ -1925,19 +1925,30 @@ goto finish; }

+ Py_XINCREF(typecode); arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL); - if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr; + if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) { + Py_XDECREF(typecode); + return arr; + } robj = PyArray_Return((PyArrayObject *)arr);

finish: - if ((robj==NULL) || (robj->ob_type == type)) return robj; + if ((robj==NULL) || (robj->ob_type == type)) { + Py_XDECREF(typecode); + return robj; + } /* Need to allocate new type and copy data-area over */ if (type->tp_itemsize) { itemsize = PyString_GET_SIZE(robj); } else itemsize = 0; obj = type->tp_alloc(type, itemsize); - if (obj == NULL) {Py_DECREF(robj); return NULL;} + if (obj == NULL) { + Py_XDECREF(typecode); + Py_DECREF(robj); + return NULL; + } if (typecode==NULL) typecode = PyArray_DescrFromType(PyArray_@TYPE@); dest = scalar_value(obj, typecode);

On the face of it it might appear that all the DECREFs are cancelling out the first INCREF, but not so. Let's see two more lines of context:

...

src = scalar_value(robj, typecode); Py_DECREF(typecode);

Ahah. That DECREF balances the original PyArray_DescrFromType, or maybe the later call ... and of course this has to happen on *ALL* return paths. If we now take a closer look at the patch we can see that it's doing two separate things: 1. There's an extra Py_XINCREF to balance the ref count lost to PyArray_FromAny and ensure that typecode survives long enough; 2. Every early return path has an extra Py_XDECREF to balance the creation of typecode. I rest my case for this patch.

Charles R Harris

16 Jul 16 Jul

7:40 p.m.

On Tue, Jul 15, 2008 at 9:28 AM, Michael Abbott wrote:

...

On Tue, 15 Jul 2008, Michael Abbott wrote:

...
Only half of my patch for this bug has gone into trunk, and without the rest of my patch there remains a leak.

I think I might need to explain a little more about the reason for this patch, because obviously the bug it fixes was missed the last time I posted on this bug.

So here is the missing part of the patch:

...
--- numpy/core/src/scalartypes.inc.src (revision 5411) +++ numpy/core/src/scalartypes.inc.src (working copy) @@ -1925,19 +1925,30 @@ goto finish; }

+ Py_XINCREF(typecode); arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL); - if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr; + if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) { + Py_XDECREF(typecode); + return arr; + } robj = PyArray_Return((PyArrayObject *)arr);

finish: - if ((robj==NULL) || (robj->ob_type == type)) return robj; + if ((robj==NULL) || (robj->ob_type == type)) { + Py_XDECREF(typecode); + return robj; + } /* Need to allocate new type and copy data-area over */ if (type->tp_itemsize) { itemsize = PyString_GET_SIZE(robj); } else itemsize = 0; obj = type->tp_alloc(type, itemsize); - if (obj == NULL) {Py_DECREF(robj); return NULL;} + if (obj == NULL) { + Py_XDECREF(typecode); + Py_DECREF(robj); + return NULL; + } if (typecode==NULL) typecode = PyArray_DescrFromType(PyArray_@TYPE@); dest = scalar_value(obj, typecode);

On the face of it it might appear that all the DECREFs are cancelling out the first INCREF, but not so. Let's see two more lines of context:

...
src = scalar_value(robj, typecode); Py_DECREF(typecode);

Ahah. That DECREF balances the original PyArray_DescrFromType, or maybe the later call ... and of course this has to happen on *ALL* return paths. If we now take a closer look at the patch we can see that it's doing two separate things:

1. There's an extra Py_XINCREF to balance the ref count lost to PyArray_FromAny and ensure that typecode survives long enough;

2. Every early return path has an extra Py_XDECREF to balance the creation of typecode.

I rest my case for this patch.

Yes, there does look to be a memory leak here. Not to mention a missing NULL check since PyArray_Scalar not only doesn't swallow a reference, it can't take a Null value for desc. But the whole function is such a mess I want to see if we can rewrite it to have a better flow of logic <puts on todo list> Chuck

Stéfan van der Walt

9:41 p.m.

2008/7/16 Charles R Harris :

...

Yes, there does look to be a memory leak here. Not to mention a missing NULL check since PyArray_Scalar not only doesn't swallow a reference, it can't take a Null value for desc. But the whole function is such a mess I want to see if we can rewrite it to have a better flow of logic <puts on todo list>

Can we apply the patch in the meantime? (My) TODO lists tend to get very long... Stéfan

Charles R Harris

17 Jul 17 Jul

9:13 p.m.

On Tue, Jul 15, 2008 at 9:28 AM, Michael Abbott wrote:

...

On Tue, 15 Jul 2008, Michael Abbott wrote:

...
Only half of my patch for this bug has gone into trunk, and without the rest of my patch there remains a leak.

I think I might need to explain a little more about the reason for this patch, because obviously the bug it fixes was missed the last time I posted on this bug.

So here is the missing part of the patch:

...
--- numpy/core/src/scalartypes.inc.src (revision 5411) +++ numpy/core/src/scalartypes.inc.src (working copy) @@ -1925,19 +1925,30 @@ goto finish; }

+ Py_XINCREF(typecode); arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL); - if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr; + if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) { + Py_XDECREF(typecode); + return arr; + } robj = PyArray_Return((PyArrayObject *)arr);

finish: - if ((robj==NULL) || (robj->ob_type == type)) return robj; + if ((robj==NULL) || (robj->ob_type == type)) { + Py_XDECREF(typecode); + return robj; + } /* Need to allocate new type and copy data-area over */ if (type->tp_itemsize) { itemsize = PyString_GET_SIZE(robj); } else itemsize = 0; obj = type->tp_alloc(type, itemsize); - if (obj == NULL) {Py_DECREF(robj); return NULL;} + if (obj == NULL) { + Py_XDECREF(typecode); + Py_DECREF(robj); + return NULL; + } if (typecode==NULL) typecode = PyArray_DescrFromType(PyArray_@TYPE@); dest = scalar_value(obj, typecode);

On the face of it it might appear that all the DECREFs are cancelling out the first INCREF, but not so. Let's see two more lines of context:

...
src = scalar_value(robj, typecode); Py_DECREF(typecode);

Ahah. That DECREF balances the original PyArray_DescrFromType, or maybe the later call ... and of course this has to happen on *ALL* return paths. If we now take a closer look at the patch we can see that it's doing two separate things:

1. There's an extra Py_XINCREF to balance the ref count lost to PyArray_FromAny and ensure that typecode survives long enough;

2. Every early return path has an extra Py_XDECREF to balance the creation of typecode.

I rest my case for this patch. __

I still haven't convinced myself of this. By the time we hit finish, robj is NULL or holds a reference to typecode and the NULL case is taken care of up front. Later on, the reference to typecode might be decremented, perhaps leaving robj crippled, but in that case robj itself is marked for deletion upon exit. If the garbage collector can handle zero reference counts I think we are alright. I admit I haven't quite followed all the subroutines and macros, which descend into the hazy depths without the slightest bit of documentation, but at this point I'm inclined to leave things alone unless you have a test that shows a leak from this source. Chuck

...

_____________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion

Michael Abbott

18 Jul 18 Jul

11:15 a.m.

I'm afraid this is going to be a long one, and I don't see any good way to cut down the quoted text either... Charles, I'm going to plea with you to read what I've just written and think about it. I'm trying to make the case as clear as I can. I think the case is actually extremely simple: the existing @name@_arrtype_name code is broken. On Thu, 17 Jul 2008, Charles R Harris wrote:

...

On Tue, Jul 15, 2008 at 9:28 AM, Michael Abbott wrote:

...
...
--- numpy/core/src/scalartypes.inc.src (revision 5411) +++ numpy/core/src/scalartypes.inc.src (working copy) @@ -1925,19 +1925,30 @@ goto finish; }

+ Py_XINCREF(typecode); arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL); - if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr; + if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) { + Py_XDECREF(typecode); + return arr; + } robj = PyArray_Return((PyArrayObject *)arr);

finish: - if ((robj==NULL) || (robj->ob_type == type)) return robj; + if ((robj==NULL) || (robj->ob_type == type)) { + Py_XDECREF(typecode); + return robj; + } /* Need to allocate new type and copy data-area over */ if (type->tp_itemsize) { itemsize = PyString_GET_SIZE(robj); } else itemsize = 0; obj = type->tp_alloc(type, itemsize); - if (obj == NULL) {Py_DECREF(robj); return NULL;} + if (obj == NULL) { + Py_XDECREF(typecode); + Py_DECREF(robj); + return NULL; + } if (typecode==NULL) typecode = PyArray_DescrFromType(PyArray_@TYPE@); dest = scalar_value(obj, typecode); src = scalar_value(robj, typecode); Py_DECREF(typecode);

Ahah. That DECREF balances the original PyArray_DescrFromType, or maybe the later call ... and of course this has to happen on *ALL* return paths. If we now take a closer look at the patch we can see that it's doing two separate things:

1. There's an extra Py_XINCREF to balance the ref count lost to PyArray_FromAny and ensure that typecode survives long enough;

2. Every early return path has an extra Py_XDECREF to balance the creation of typecode.

I rest my case for this patch.

I still haven't convinced myself of this.

Seriously? Please bear with me for a bit then. Let me try and go back to basics. Reference counting discipline should be very simple (I did post up some links on my Wed, 9 Jul 2008 08:36:27 posting on this list). Let me try and capture this in the following principles: 1. Each routine is wholly responsible for the reference counts it creates. This responsibility can be fulfilled by: 1.a Decrementing the reference count. 1.b Returning the object to the caller (which accounts for exactly one reference count). 1.c Explicitly placing the object in another reference counted structure (similarly this accounts for another reference count, and of course creates some extra obligations which I won't discuss) 1.d Pass the object to a routine which indulges in "reference count theft". In some sense this can be thought of as an example of 1.c, and this is how PyList_SetItem and PyTuple_SetItem behave. 2. Reference counts are created by: 2.a Calling a routine which returns an object. 2.b Explicitly incrementing the reference count. 3. When the known reference count on an object reaches zero the object must be treated as lost (and any further reference to it will be undefined). In particular, a routine cannot make any assumptions about reference counts it has not created. I need to make a bit of a digression on the subject of reference count theft. I am only aware of three routines that do this: PyList_SetItem, PyTuple_SetItem and PyArray_FromAny The first two are effectively assigning the reference into a preexisting structure, so can be regarded as instances of principle 1.c. The last is a disaster area. I know (by inspecting the code) that PyArray_FromAny may choose to erase its typecode argument (by decrementing the reference), but I can't tell (without digging deeper than I care to go) whether this only occurs on the NULL return branch. However, this is a bit of a red herring. If we recognise that PyArray_FromAny steals the reference count, the remaining analysis will go through. A further note on 1.c: when an object is placed in another reference counted structure its reference count should normally be incremented -- case 1.c is really just an elision of this increment with the decrement under obligation 1.a. This normal case can be seen half way through PyArray_Scalar. Ok, preliminaries over: now to look at the code. Here is a caricature of the routine @name@_arrtype_new with the important features isolated: 1 PyArray_Descr *typecode = NULL; 2 if (...) goto finish; // _WORK@work@ 3 typecode = PyArray_DescrFromType(...); 4 if (...) { ...typecode...; goto finish; } 5 arr = PyArray_FromAny(..., typecode, ...); 6 if (...) return arr; 7 finish: 8 if (...) return robj; 9 if (...) return NULL; 10 if (typecode == NULL) typecode = PyArray_DescrFromType(...); 11 ... scalar_value(..., typecode);... 12 Py_DECREF(typecode); So let's go through this line by line. 1. We start with NULL typecode. Fine. 2. A hidden goto in a macro. Sweet. Let's have more code like this. 3. Here's the reference count we're responsible for. 4. If obj is NULL we use the typecode 5. otherwise we pass it to PyArray_FromAny. 6. The first early return 7. All paths (apart from 6) come together here. So at this point let's take stock. typecode is in one of three states: NULL (path 2, or if creation failed), allocated with a single reference count (path 4), or lost (path 5). This is not good. LET ME EMPHASISE THIS: the state of the code at the finish label is dangerous and simply broken. The original state at the finish label is indeterminate: typecode has either been lost by passing it to PyArray_FromAny (in which case we're not allowed to touch it again), or else it has reference count that we're still responsible for. There seems to be a fantasy expressed in a comment in a recent update to this routine that PyArray_Scalar has stolen a reference, but fortunately a quick code inspection (of arrayobject.c) quickly refutes this frightening possibility. So, the only way to fix the problem at (7) is to unify the two non-NULL cases. One answer is to add a DECREF at (4), but we see at (11) that we still need typecode at (7) -- so the only solution is to add an extra ADDREF just before (5). This then of course sadly means that we also need an extra DECREF at (6). PLEASE don't suggest moving the ADDREF until after (6) -- at this point typecode is lost and may have been destroyed, and relying on any possibility to the contrary is a recipe for continued screw ups. The rest is easy. Once we've established the invariant that typecode is either NULL or has a single reference count at (7) then the two early returns at (8) and (9) unfortunately need to be augmented with DECREFs. And we're done. Responses to your original comments:

...

By the time we hit finish, robj is NULL or holds a reference to typecode and the NULL case is taken care of up front. robj has nothing to do with the lifetime management of typecode, the only issue is the early return. After the finish label typecode is either NULL (no problem) or else has a single reference count that needs to be accounted for.

...

Later on, the reference to typecode might be decremented, That *might* is at the heart of the problem. You can't be so cavalier about managing references.

...

perhaps leaving robj crippled, but in that case robj itself is marked for deletion upon exit. Please ignore robj in ths discussion, it's beside the point.

...

If the garbage collector can handle zero reference counts I think we are alright. No, no, no. This is nothing to do with the garbage collector. If we screw up our reference counts here then the garbage collector isn't going to dig us out of the hole.

...

I admit I haven't quite followed all the subroutines and macros, which descend into the hazy depths without the slightest bit of documentation, but at this point I'm inclined to leave things alone unless you have a test that shows a leak from this source. Part of my point is that proper reference count discipline should not require any descent into subroutines (except for the very nasty case of reference theft, which I think is generally agreed to be a bad thing).

As for the test case, try this one (you'll need a debug build): import numpy import sys refs = 0 r = range(100) refs = sys.gettotalrefcount() for i in r: float32() print sys.gettotalrefcount() - refs You should get one leak per float32() call. I've just noticed, looking at the latest revision of the code, that somebody seems under the misapprehension that PyArray_Scalar steals reference counts. Fortunately this is not true. Maybe this is part of the confusion?

Charles R Harris

6:03 p.m.

On Fri, Jul 18, 2008 at 5:15 AM, Michael Abbott wrote:

...

I'm afraid this is going to be a long one, and I don't see any good way to cut down the quoted text either...

Charles, I'm going to plea with you to read what I've just written and think about it. I'm trying to make the case as clear as I can. I think the case is actually extremely simple: the existing @name@_arrtype_name code is broken.

<snip>

Heh. As the macro is undefined after the code is generated, it should probably be moved into the code. I would actually like to get rid of the ifdef's (almost everywhere), but that is a later stage of cleanup.

...

3. Here's the reference count we're responsible for.

Yep.

...

4. If obj is NULL we use the typecode 5. otherwise we pass it to PyArray_FromAny. 6. The first early return 7. All paths (apart from 6) come together here.

So at this point let's take stock. typecode is in one of three states: NULL (path 2, or if creation failed), allocated with a single reference count (path 4), or lost (path 5). This is not good.

It still has a single reference after 5 if PyArray_FromAny succeeded, that reference is held by arr and transferred to robj. If the transfer fails, the reference to arr is decremented and NULL returned by PyArray_Return. When arr is garbage collected the reference to typecode will be decremented.

...

LET ME EMPHASISE THIS: the state of the code at the finish label is dangerous and simply broken.

The original state at the finish label is indeterminate: typecode has either been lost by passing it to PyArray_FromAny (in which case we're not allowed to touch it again), or else it has reference count that we're still responsible for.

There seems to be a fantasy expressed in a comment in a recent update to this routine that PyArray_Scalar has stolen a reference, but fortunately a quick code inspection (of arrayobject.c) quickly refutes this frightening possibility.

No, no, Pyarray_Scalar doesn't do anything to the reference count. Where did you see otherwise?

...

So, the only way to fix the problem at (7) is to unify the two non-NULL cases. One answer is to add a DECREF at (4), but we see at (11) that we still need typecode at (7) -- so the only solution is to add an extra ADDREF just before (5). This then of course sadly means that we also need an extra DECREF at (6).

PLEASE don't suggest moving the ADDREF until after (6) -- at this point typecode is lost and may have been destroyed, and relying on any possibility to the contrary is a recipe for continued screw ups.

The rest is easy. Once we've established the invariant that typecode is either NULL or has a single reference count at (7) then the two early returns at (8) and (9) unfortunately need to be augmented with DECREFs.

And we're done.

Responses to your original comments:

...
By the time we hit finish, robj is NULL or holds a reference to typecode and the NULL case is taken care of up front.

...

robj has nothing to do with the lifetime management of typecode, the only issue is the early return. After the finish label typecode is either NULL (no problem) or else has a single reference count that needs to be accounted for.

...
Later on, the reference to typecode might be decremented, That *might* is at the heart of the problem. You can't be so cavalier about managing references.

...
perhaps leaving robj crippled, but in that case robj itself is marked for deletion upon exit. Please ignore robj in ths discussion, it's beside the point.

...
If the garbage collector can handle zero reference counts I think we are alright. No, no, no. This is nothing to do with the garbage collector. If we screw up our reference counts here then the garbage collector isn't going to dig us out of the hole.

The garbage collector destroys the object and should decrement all the references it holds. If that is not the case then there are bigger problems afoot. The finalizer for the object should hold the knowledge of what needs to be decremented.

...

...
I admit I haven't quite followed all the subroutines and macros, which descend into the hazy depths without the slightest bit of documentation, but at this point I'm inclined to leave things alone unless you have a test that shows a leak from this source.

...

Part of my point is that proper reference count discipline should not require any descent into subroutines (except for the very nasty case of reference theft, which I think is generally agreed to be a bad thing).

Agreed. But that is not the code we are looking at. My personal schedule for this sort of cleanup/refactoring looks like this. 1) Format the code into readable C. (ongoing) 2) Document the functions so we know what they do. 3) Understand the code. 4) Fix up functions starting from the bottom layers. 5) Flatten the code -- the calls go too deep for my taste and make understanding difficult. My attempts to generate a call graph have all run into problems. At that point consider the reference counting model. There aren't many people working with the C code apart from myself and Travis (who wrote the original), so I want to encourage you to keep looking at the code and working with it. However, we can't just start from scratch and we have to be pretty conservative to keep from breaking things.

...

As for the test case, try this one (you'll need a debug build):

import numpy import sys refs = 0 r = range(100) refs = sys.gettotalrefcount() for i in r: float32() print sys.gettotalrefcount() - refs

Simpler test case: import sys, gc import numpy as np def main() : t = np.dtype(np.float32) print sys.getrefcount(t) for i in range(100) : np.float32() gc.collect() print sys.getrefcount(t) if __name__ == "__main__" : main() Result $[charris@f8 ~]$ python debug.py 5 105 So there is a leak. The question is the proper fix. I want to take a closer look at PyArray_Return and also float32() and relations. Chuck

Charles R Harris

8:41 p.m.

On Fri, Jul 18, 2008 at 12:03 PM, Charles R Harris < charlesr.harris@gmail.com> wrote: <snip>

...

Simpler test case:

import sys, gc import numpy as np

def main() : t = np.dtype(np.float32) print sys.getrefcount(t) for i in range(100) : np.float32() gc.collect() print sys.getrefcount(t)

if __name__ == "__main__" : main()

Result

$[charris@f8 ~]$ python debug.py 5 105

So there is a leak. The question is the proper fix. I want to take a closer look at PyArray_Return and also float32() and relations.

The reference leak seems specific to the float32 and complex64 types called with default arguments. In [1]: import sys, gc In [2]: t = float32 In [3]: sys.getrefcount(dtype(t)) Out[3]: 4 In [4]: for i in range(10) : t(); ...: In [5]: sys.getrefcount(dtype(t)) Out[5]: 14 In [6]: for i in range(10) : t(0); ...: In [7]: sys.getrefcount(dtype(t)) Out[7]: 14 In [8]: t = complex64 In [9]: sys.getrefcount(dtype(t)) Out[9]: 4 In [10]: for i in range(10) : t(); ....: In [11]: sys.getrefcount(dtype(t)) Out[11]: 14 In [12]: t = float64 In [13]: sys.getrefcount(dtype(t)) Out[13]: 19 In [14]: for i in range(10) : t(); ....: In [15]: sys.getrefcount(dtype(t)) Out[15]: 19 This shouldn't actually leak any memory as these types are singletons, but it points up a logic flaw somewhere. Chuck

Charles R Harris

10:11 p.m.

On Fri, Jul 18, 2008 at 2:41 PM, Charles R Harris wrote: <snip> There are actually two bugs here which is confusing things. Bug 1) Deleting a numpy scalar doesn't decrement the descr reference count. Bug 2) PyArray_Return decrements the reference to arr, which in turn correctly decrements the descr on gc. So calls that go through the default (obj == NULL) correctly leave typecode with a reference, it just never gets decremented when the return object is deleted. On the other hand, if the function is called with something like float32(0), then arr is allocated, stealing a reference to typecode. PyArray_Return then deletes arr (which decrements the typecode reference), but that doesn't matter since typecode is a singleton. In this case there is no follow on stack dump when the returned object is deleted because the descr reference is not correctly decremented. BTW, both cases get returned in the first if statement after the finish label, i.e., robj is returned. Oy, what a mess. Here is a short program to follow all the reference counts. import sys, gc import numpy as np def main() : typecodes = "?bBhHiIlLqQfdgFDG" for typecode in typecodes : t = np.dtype(typecode) refcnt = sys.getrefcount(t) t.type() gc.collect() print typecode, t.name, sys.getrefcount(t) - refcnt if __name__ == "__main__" : main() Which gives the following output: $[charris@f8 ~]$ python debug.py ? bool 0 b int8 1 B uint8 1 h int16 1 H uint16 1 i int32 0 I uint32 1 l int32 0 L uint32 1 q int64 1 Q uint64 1 f float32 1 d float64 0 g float96 1 F complex64 1 D complex128 1 G complex192 1 Note that the python types, which the macro handles, don't have the reference leak problem. Chuck

Travis E. Oliphant

19 Jul 19 Jul

3:30 a.m.

Charles R Harris wrote:

...

The reference leak seems specific to the float32 and complex64 types called with default arguments.

In [1]: import sys, gc

In [2]: t = float32

In [3]: sys.getrefcount(dtype(t)) Out[3]: 4

In [4]: for i in range(10) : t(); ...:

In [5]: sys.getrefcount(dtype(t)) Out[5]: 14

In [6]: for i in range(10) : t(0); ...:

In [7]: sys.getrefcount(dtype(t)) Out[7]: 14

In [8]: t = complex64

In [9]: sys.getrefcount(dtype(t)) Out[9]: 4

In [10]: for i in range(10) : t(); ....:

In [11]: sys.getrefcount(dtype(t)) Out[11]: 14

In [12]: t = float64

In [13]: sys.getrefcount(dtype(t)) Out[13]: 19

In [14]: for i in range(10) : t(); ....:

In [15]: sys.getrefcount(dtype(t)) Out[15]: 19

This shouldn't actually leak any memory as these types are singletons, but it points up a logic flaw somewhere.

That is correct. There is no memory leak, but we do need to get it right. I appreciate the extra eyes on some of these intimate details. What can happen (after a lot of calls) is that the reference count can wrap around to 0 and then cause a funny dealloc (actually, the dealloc won't happen, but a warning will be printed). Fixing the reference counting issues has been the single biggest difficulty in converting PyArray_Descr from a C-structure to a regular Python object. It was a very large pain to begin with, and then there has been more code added since the original conversion some of which does not do reference counting correctly (mostly my fault). I've looked over ticket #848 quite a bit and would like to determine the true cause of the growing reference count which I don't believe is fixed by the rest of the patch that is submitted there. -Travis

Charles R Harris

3:49 a.m.

On Fri, Jul 18, 2008 at 9:30 PM, Travis E. Oliphant wrote:

...

Charles R Harris wrote:

...
The reference leak seems specific to the float32 and complex64 types called with default arguments.

In [1]: import sys, gc

In [2]: t = float32

In [3]: sys.getrefcount(dtype(t)) Out[3]: 4

In [4]: for i in range(10) : t(); ...:

In [5]: sys.getrefcount(dtype(t)) Out[5]: 14

In [6]: for i in range(10) : t(0); ...:

In [7]: sys.getrefcount(dtype(t)) Out[7]: 14

In [8]: t = complex64

In [9]: sys.getrefcount(dtype(t)) Out[9]: 4

In [10]: for i in range(10) : t(); ....:

In [11]: sys.getrefcount(dtype(t)) Out[11]: 14

In [12]: t = float64

In [13]: sys.getrefcount(dtype(t)) Out[13]: 19

In [14]: for i in range(10) : t(); ....:

In [15]: sys.getrefcount(dtype(t)) Out[15]: 19

This shouldn't actually leak any memory as these types are singletons, but it points up a logic flaw somewhere.

That is correct. There is no memory leak, but we do need to get it right. I appreciate the extra eyes on some of these intimate details. What can happen (after a lot of calls) is that the reference count can wrap around to 0 and then cause a funny dealloc (actually, the dealloc won't happen, but a warning will be printed).

Fixing the reference counting issues has been the single biggest difficulty in converting PyArray_Descr from a C-structure to a regular Python object. It was a very large pain to begin with, and then there has been more code added since the original conversion some of which does not do reference counting correctly (mostly my fault).

I've looked over ticket #848 quite a bit and would like to determine the true cause of the growing reference count which I don't believe is fixed by the rest of the patch that is submitted there.

I've attached a test script. Chuck

Travis E. Oliphant

4:35 a.m.

...

I've attached a test script. Thank you! It looks like with that added DECREF, the reference count leak is gone. While it was a minor issue (it should be noted that reference counting errors on the built-in data-types won't cause issues), it is nice to clean these things up when we can.

I agree that the arrtype_new function is hairy, and I apologize for that. The scalartypes.inc.src was written very quickly. I added a few more comments in the change to the function (and removed a hard-coded hackish multiply with one that takes into account the actual size of Py_UNICODE). -Travis

Gael Varoquaux

4:38 a.m.

New subject: Ticket review: #848, leak in PyArray_DescrFromType

On Fri, Jul 18, 2008 at 11:35:50PM -0500, Travis E. Oliphant wrote:

...

...
I've attached a test script. Thank you! It looks like with that added DECREF, the reference count leak is gone. While it was a minor issue (it should be noted that reference counting errors on the built-in data-types won't cause issues), it is nice to clean these things up when we can.

Yes. I think it is worth thanking all of you who are currently putting a large effort on QA. This effort is very valuable to all of us, as having a robust underlying library on which you can unquestionably rely is priceless. Gaël

Michael Abbott

2:57 p.m.

On Fri, 18 Jul 2008, Travis E. Oliphant wrote:

...

It looks like with that added DECREF, the reference count leak is gone.

I've looked at the latest head, and I agree that the problem is now solved. There is an important difference from my original solution: typecode is no longer reused after the finish label (instead it is always created anew). This makes all the difference in the world. I'm not actually convinced by the comment that's there now, which says /* typecode will be NULL */ but in truth it doesn't matter -- because of the correcly placed DECREF after the PyArray_Scalar calls the routine no longer owns typecode. If I can refer to my last message, I made the point that there wasn't a good invariant at the finish label -- we didn't know how many references to typecode we were responsible for at that point -- and I offered the solution to keep typecode. Instead you have chosen to recreate typecode, which I hadn't realised was just as good. This code is still horrible, but I don't think I want to try to understand it anymore. It'd be really nice (it'd make me feel a lot better) if you'd agree that my original patch was in fact correct. I'm not disputing the correcness of the current solution (except I think that typecode can end up being created twice, but who really cares?) but I've put a lot of effort into arguing my case, and the fact is my original patch was not wrong. Thank you.

Charles R Harris

5:44 p.m.

On Sat, Jul 19, 2008 at 8:57 AM, Michael Abbott wrote:

...

On Fri, 18 Jul 2008, Travis E. Oliphant wrote:

...
It looks like with that added DECREF, the reference count leak is gone.

I've looked at the latest head, and I agree that the problem is now solved.

There is an important difference from my original solution: typecode is no longer reused after the finish label (instead it is always created anew). This makes all the difference in the world.

I'm not actually convinced by the comment that's there now, which says /* typecode will be NULL */ but in truth it doesn't matter -- because of the correcly placed DECREF after the PyArray_Scalar calls the routine no longer owns typecode.

If I can refer to my last message, I made the point that there wasn't a good invariant at the finish label -- we didn't know how many references to typecode we were responsible for at that point -- and I offered the solution to keep typecode. Instead you have chosen to recreate typecode, which I hadn't realised was just as good.

This code is still horrible, but I don't think I want to try to understand it anymore. It'd be really nice (it'd make me feel a lot better) if you'd agree that my original patch was in fact correct. I'm not disputing the correcness of the current solution (except I think that typecode can end up being created twice, but who really cares?) but I've put a lot of effort into arguing my case, and the fact is my original patch was not wrong.

Yep, the original patch looks good now.

Travis E. Oliphant

6:27 p.m.

Michael Abbott wrote:

...

I'm not actually convinced by the comment that's there now, which says /* typecode will be NULL */ but in truth it doesn't matter -- because of the correcly placed DECREF after the PyArray_Scalar calls the routine no longer owns typecode.

I'm pretty sure that it's fine.

...

If I can refer to my last message, I made the point that there wasn't a good invariant at the finish label -- we didn't know how many references to typecode we were responsible for at that point -- and I offered the solution to keep typecode. Instead you have chosen to recreate typecode, which I hadn't realised was just as good.

I agree that this routine needs aesthetic improvement. I had hoped that someone would have improved the array scalars routines by now. I think a core issue is that to save a couple of lines of code, an inappropriate goto finish in the macro was used. This complicated the code more than the savings of a couple lines of code would justify. Really this code "grew" from a simple thing into a complicated thing as more "features" were added. This is a common issue that happens all over the place.

...

This code is still horrible, but I don't think I want to try to understand it anymore. It'd be really nice (it'd make me feel a lot better) if you'd agree that my original patch was in fact correct. I'm not disputing the correcness of the current solution (except I think that typecode can end up being created twice, but who really cares?) but I've put a lot of effort into arguing my case, and the fact is my original patch was not wrong.

From what I saw, I'm still not quite sure. Your description of reference counting was correct and it is clear you've studied the issue which is great, because there aren't that many people who understand reference counting on the C-level in Python anymore and it is still a useful skill. I'm hopeful that your description of reference counting will be something others can find and learn from. The reason I say I'm not sure is that I don't remember seeing a DECREF after the PyArray_Scalar in the obj = NULL part of the code in my looking at your patches. But, I could have just missed it. Regardless, that core piece was lost in my trying to figure out the other changes you were making to the code. From a "generic" reference counting point of view you did correctly emphasize the problem of having a reference count creation occur in an if-statement but a DECREF occur all the time in the finish: section of the code. It was really that statement: "the fantasy that PyArray_Scalar steals a reference" that tipped me off to what I consider one of the real problems to be. The fact that it was masked at the end of a long discussion about other reference counting and a "stealing" discussion that were not the core problem was distracting and ultimately not very helpful. I'm very impressed with your ability to follow these reference count issues. Especially given that you only started learning about the Python C-API a few months ago (if I remember correctly). I'm also very glad you are checking these corner cases which have not received the testing that they deserve. I hope we have not discouraged you too much from continuing to help. Your input is highly valued. -Travis

Michael Abbott

20 Jul 20 Jul

9:12 a.m.

Travis, thank you for your encouraging words. On Sat, 19 Jul 2008, Travis E. Oliphant wrote:

...

Really this code "grew" from a simple thing into a complicated thing as more "features" were added. This is a common issue that happens all over the place. Aye.

...

The reason I say I'm not sure is that I don't remember seeing a DECREF after the PyArray_Scalar in the obj = NULL part of the code in my looking at your patches. But, I could have just missed it. There wasn't -- instead, I was trying to retain the typecode (and paying the price of releasing it on all the early returns!)

...

From a "generic" reference counting point of view you did correctly emphasize the problem of having a reference count creation occur in an if-statement but a DECREF occur all the time in the finish: section of the code. Yah -- I think the core idea I was trying to get over is that of an "invariant" property at each point in the code to capture what needs to be true for the code to be correct.

...

It was really that statement: "the fantasy that PyArray_Scalar steals a reference" that tipped me off to what I consider one of the real problems to be. The fact that it was masked at the end of a long discussion about other reference counting and a "stealing" discussion that were not the core problem was distracting and ultimately not very helpful. That was the really hard bit for me. To me the issue was actually very obvious (though I didn't realise that typecode could be regenerated, which simplifies things enormously), so the problem was trying to figure out what you and Charles were not seeing. I think that's why I ended up throwing everything into the pot!

...

I'm very impressed with your ability to follow these reference count issues. Especially given that you only started learning about the Python C-API a few months ago (if I remember correctly). Alas no. I'm a bit of an old lag really, I did dabble with the Python C API quite a few years ago (2001ish maybe?). Myy roots are in computer science and then assembler (graduated 1980) before Pascal (seriously) then C, then C++ (which I now regard as a serious mistake) and finally shell script plus Python, all largely on embedded applications. I'd love the opportunity to learn and use Haskell now.

...

I'm also very glad you are checking these corner cases which have not received the testing that they deserve. I hope we have not discouraged you too much from continuing to help. Your input is highly valued.

Maybe I'll have a further play. The memory leak issue was a direct consequence of using numpy in an embedded application, and that issue's closed now, but I ought to see if this painful code can be revisited. I'm learning my way around git and have just used `git svn` to grab (and update) the numpy repository. I'm hugely impressed by it, though it is very expensive to run the first time -- it fetches every single svn revision! Hopefully that didn't overload the web server... This will make working on patches much easier.

Stéfan van der Walt

11 a.m.

2008/7/20 Michael Abbott :

...

...
I'm very impressed with your ability to follow these reference count issues. Especially given that you only started learning about the Python C-API a few months ago (if I remember correctly). Alas no. I'm a bit of an old lag really, I did dabble with the Python C API quite a few years ago (2001ish maybe?). Myy roots are in computer science and then assembler (graduated 1980) before Pascal (seriously) then C, then C++ (which I now regard as a serious mistake) and finally shell

It's scary how many of us were scarred for life by C++. If you have ten minutes free sometime, would you please consider writing up your reference counting explanation on the wiki (even if you just copy and paste that part out of your email and annotate it)? Having more eyes on the NumPy code is imperative; we need to teach more people to understand how to find these sorts of problems.

...

I'm learning my way around git and have just used `git svn` to grab (and update) the numpy repository. I'm hugely impressed by it, though it is very expensive to run the first time -- it fetches every single svn revision! Hopefully that didn't overload the web server... This will make working on patches much easier.

I hope that we can move over to a distributed revision control system sometime in the foreseeable future. From what I've seen, its model strongly encourages community interaction. Cheers Stéfan

Michael Abbott

12:27 p.m.

On Sun, 20 Jul 2008, Stéfan van der Walt wrote:

...

2008/7/20 Michael Abbott :

...
C, then C++ (which I now regard as a serious mistake) and finally shell It's scary how many of us were scarred for life by C++.

What's really annoying for me is that my most recent big project (http://sourceforge.net/projects/libera-epics) was written in C++, entirely at my own choice. It seemed like a good idea at the time.

...

If you have ten minutes free sometime, would you please consider writing up your reference counting explanation on the wiki

Good idea. Do you mean at http://scipy.org/scipy/numpy ? Somewhere under CodingStyleGuidelines?

Matthieu Brucher

12:32 p.m.

2008/7/20 Michael Abbott :

...

On Sun, 20 Jul 2008, Stéfan van der Walt wrote:

...
2008/7/20 Michael Abbott :

...
C, then C++ (which I now regard as a serious mistake) and finally shell It's scary how many of us were scarred for life by C++.

What's really annoying for me is that my most recent big project (http://sourceforge.net/projects/libera-epics) was written in C++, entirely at my own choice. It seemed like a good idea at the time.

Other are scarred for life with C and are more than happy with C++... Matthieu, who really hopes he will not ever write one more line of C code -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher

Stéfan van der Walt

4 p.m.

2008/7/20 Michael Abbott :

...

On Sun, 20 Jul 2008, Stéfan van der Walt wrote:

...
2008/7/20 Michael Abbott :

...
C, then C++ (which I now regard as a serious mistake) and finally shell It's scary how many of us were scarred for life by C++.

What's really annoying for me is that my most recent big project (http://sourceforge.net/projects/libera-epics) was written in C++, entirely at my own choice. It seemed like a good idea at the time.

...
If you have ten minutes free sometime, would you please consider writing up your reference counting explanation on the wiki

Good idea. Do you mean at http://scipy.org/scipy/numpy ? Somewhere under CodingStyleGuidelines?

Or here: http://projects.scipy.org/scipy/numpy/ under `guidelines`. Thanks! Stéfan

Charles R Harris

5:06 p.m.

On Sun, Jul 20, 2008 at 5:00 AM, Stéfan van der Walt wrote:

...

...
...
I'm very impressed with your ability to follow these reference count issues. Especially given that you only started learning about the Python C-API a few months ago (if I remember correctly). Alas no. I'm a bit of an old lag really, I did dabble with the Python C API quite a few years ago (2001ish maybe?). Myy roots are in computer science and then assembler (graduated 1980) before Pascal (seriously)

2008/7/20 Michael Abbott : then

...
C, then C++ (which I now regard as a serious mistake) and finally shell

It's scary how many of us were scarred for life by C++.

I rather like C++, especially the templates and (BOOST) smart pointers. One just has to avoid using the more exotic features, think ten or twenty times before using inheritance, and be very suspicious of operator overloading. And <cstdio> is your friend. But If you need to ration memory and worry about dynamic allocation, forget it. I wouldn't use it for drivers. Chuck

Travis E. Oliphant

19 Jul 19 Jul

3:15 a.m.

Michael Abbott wrote:

...

Only half of my patch for this bug has gone into trunk, and without the rest of my patch there remains a leak.

Thanks for your work Michael. I've been so grateful to have you and Chuck and others looking carefully at the code to fix its problems. In this particular case, I'm not sure I see how (the rest of) your patch fixes any remaining leak. We do seem to be having a disagreement about whether or not the reference to typecode can be pre-maturely destroyed, but this doesn't fit what I usually call a "memory leak." I think there may be some other cause for remaining leaks.

...

I can see that there might be an argument that PyArray_FromAny has the semantics that it retains a reference to typecode unless it returns NULL ... but I don't want to go there. That would not be a good thing to rely on -- and even with those semantics the existing code still needs fixing.

Good, that is the argument I'm making. Why don't you want to "rely on it?" Thanks for all your help. -Travis

Charles R Harris

3:31 a.m.

On Fri, Jul 18, 2008 at 9:15 PM, Travis E. Oliphant wrote:

...

Michael Abbott wrote:

...
Only half of my patch for this bug has gone into trunk, and without the rest of my patch there remains a leak.

Thanks for your work Michael. I've been so grateful to have you and Chuck and others looking carefully at the code to fix its problems.

In this particular case, I'm not sure I see how (the rest of) your patch fixes any remaining leak. We do seem to be having a disagreement about whether or not the reference to typecode can be pre-maturely destroyed, but this doesn't fit what I usually call a "memory leak." I think there may be some other cause for remaining leaks.

Travis, There really is (at least) one reference counting error in PyArray_FromAny. In particular, the obj == NULL case leaves a reference to typecode, then exits through the first return after finish. In this case robj doesn't steal a reference to typecode and the result can be seen in the python program above or by printing out the typecode->ob_refcnt from the code itself. So that needs fixing. I would suggest a DECREF in that section and a direct return of robj. The next section before finish is also a bit odd. The direct return of an array works fine, but if that isn't the branch taken, then PyArray_Return decrements the refcnt of arr, which in turn decrements the refcnt of typecode. I don't know if the resulting scalar holds a reference to typecode, but in any case the situation there should also be clarified. Chuck

Travis E. Oliphant

3:34 a.m.

Charles R Harris wrote:

...

On Fri, Jul 18, 2008 at 9:15 PM, Travis E. Oliphant mailto:oliphant@enthought.com> wrote:

Michael Abbott wrote: > Only half of my patch for this bug has gone into trunk, and without the > rest of my patch there remains a leak. > Thanks for your work Michael. I've been so grateful to have you and Chuck and others looking carefully at the code to fix its problems.

In this particular case, I'm not sure I see how (the rest of) your patch fixes any remaining leak. We do seem to be having a disagreement about whether or not the reference to typecode can be pre-maturely destroyed, but this doesn't fit what I usually call a "memory leak." I think there may be some other cause for remaining leaks.

Travis,

There really is (at least) one reference counting error in PyArray_FromAny. In particular, the obj == NULL case leaves a reference to typecode, then exits through the first return after finish. In this case robj doesn't steal a reference to typecode and the result can be seen in the python program above or by printing out the typecode->ob_refcnt from the code itself. So that needs fixing. I would suggest a DECREF in that section and a direct return of robj.

The next section before finish is also a bit odd. The direct return of an array works fine, but if that isn't the branch taken, then PyArray_Return decrements the refcnt of arr, which in turn decrements the refcnt of typecode. I don't know if the resulting scalar holds a reference to typecode, but in any case the situation there should also be clarified.

Thank you. I will direct attention there and try to clear this up tonight. I know Michael is finding problems that do need fixing. -Travis

Travis E. Oliphant

4:04 a.m.

Charles R Harris wrote:

...

On Fri, Jul 18, 2008 at 9:15 PM, Travis E. Oliphant mailto:oliphant@enthought.com> wrote:

Michael Abbott wrote: > Only half of my patch for this bug has gone into trunk, and without the > rest of my patch there remains a leak. > Thanks for your work Michael. I've been so grateful to have you and Chuck and others looking carefully at the code to fix its problems.

In this particular case, I'm not sure I see how (the rest of) your patch fixes any remaining leak. We do seem to be having a disagreement about whether or not the reference to typecode can be pre-maturely destroyed, but this doesn't fit what I usually call a "memory leak." I think there may be some other cause for remaining leaks.

Travis,

There really is (at least) one reference counting error in PyArray_FromAny. In particular, the obj == NULL case leaves a reference to typecode, then exits through the first return after finish. In this case robj doesn't steal a reference to typecode and the result can be seen in the python program above or by printing out the typecode->ob_refcnt from the code itself. So that needs fixing. I would suggest a DECREF in that section and a direct return of robj.

agreed! I'll commit the change.

...

The next section before finish is also a bit odd. The direct return of an array works fine, but if that isn't the branch taken, then PyArray_Return decrements the refcnt of arr, which in turn decrements the refcnt of typecode. I don't know if the resulting scalar holds a reference to typecode, but in any case the situation there should also be clarified.

This looks fine to me. At the PyArray_Return call, the typecode reference is held by the array. When it is decref'd the typecode is decref'd appropriately as well. The resulting scalar does *not* contain a reference to typecode. The scalar C-structure has no place to put it (it's just a PyObject_HEAD and the memory for the scalar value). Michael is correct that PyArray_Scalar does not change the reference count of typecode (as the comments above that function indicates). I tried to be careful to put comments near the functions that deal with PyArray_Descr objects to describe how they affect reference counting. I also thought I put that in my book. -Travis -Travis

Charles R Harris

4:50 a.m.

On Fri, Jul 18, 2008 at 10:04 PM, Travis E. Oliphant wrote:

...

Charles R Harris wrote:

...
On Fri, Jul 18, 2008 at 9:15 PM, Travis E. Oliphant mailto:oliphant@enthought.com> wrote:

Michael Abbott wrote: > Only half of my patch for this bug has gone into trunk, and without the > rest of my patch there remains a leak. > Thanks for your work Michael. I've been so grateful to have you and Chuck and others looking carefully at the code to fix its problems.

In this particular case, I'm not sure I see how (the rest of) your patch fixes any remaining leak. We do seem to be having a disagreement about whether or not the reference to typecode can be pre-maturely destroyed, but this doesn't fit what I usually call a "memory leak." I think there may be some other cause for remaining leaks.

Travis,

There really is (at least) one reference counting error in PyArray_FromAny. In particular, the obj == NULL case leaves a reference to typecode, then exits through the first return after finish. In this case robj doesn't steal a reference to typecode and the result can be seen in the python program above or by printing out the typecode->ob_refcnt from the code itself. So that needs fixing. I would suggest a DECREF in that section and a direct return of robj.

agreed! I'll commit the change.

...
The next section before finish is also a bit odd. The direct return of an array works fine, but if that isn't the branch taken, then PyArray_Return decrements the refcnt of arr, which in turn decrements the refcnt of typecode. I don't know if the resulting scalar holds a reference to typecode, but in any case the situation there should also be clarified.

This looks fine to me. At the PyArray_Return call, the typecode reference is held by the array. When it is decref'd the typecode is decref'd appropriately as well. The resulting scalar does *not* contain a reference to typecode. The scalar C-structure has no place to put it (it's just a PyObject_HEAD and the memory for the scalar value).

I was thinking of just pulling the relevant part out of PyArray_Return and including it in the function, which would make what was going on quite explicit to anyone reading the code. Then maybe a direct return of robj as I think it is always going to be a scalar at that point.

...

Michael is correct that PyArray_Scalar does not change the reference count of typecode (as the comments above that function indicates). I tried to be careful to put comments near the functions that deal with PyArray_Descr objects to describe how they affect reference counting. I also thought I put that in my book.

Yep, it was a brain fart on my part. Chuck

Travis E. Oliphant

4:05 a.m.

Charles R Harris wrote:

...

On Fri, Jul 18, 2008 at 9:15 PM, Travis E. Oliphant mailto:oliphant@enthought.com> wrote:

Michael Abbott wrote: > Only half of my patch for this bug has gone into trunk, and without the > rest of my patch there remains a leak. > Thanks for your work Michael. I've been so grateful to have you and Chuck and others looking carefully at the code to fix its problems.

In this particular case, I'm not sure I see how (the rest of) your patch fixes any remaining leak. We do seem to be having a disagreement about whether or not the reference to typecode can be pre-maturely destroyed, but this doesn't fit what I usually call a "memory leak." I think there may be some other cause for remaining leaks.

Travis,

There really is (at least) one reference counting error in PyArray_FromAny. In particular, the obj == NULL case leaves a reference to typecode, then exits through the first return after finish. In this case robj doesn't steal a reference to typecode and the result can be seen in the python program above or by printing out the typecode->ob_refcnt from the code itself. So that needs fixing. I would suggest a DECREF in that section and a direct return of robj.

agreed! I'll commit the change.

...

The next section before finish is also a bit odd. The direct return of an array works fine, but if that isn't the branch taken, then PyArray_Return decrements the refcnt of arr, which in turn decrements the refcnt of typecode. I don't know if the resulting scalar holds a reference to typecode, but in any case the situation there should also be clarified.

5758

Age (days ago)

5763

Last active (days ago)

List overview

Download

27 comments

6 participants

participants (6)

Charles R Harris
Gael Varoquaux
Matthieu Brucher
Michael Abbott
Stéfan van der Walt
Travis E. Oliphant

Ticket review: #848, leak in PyArray_DescrFromType

Michael Abbott

Michael Abbott

Michael Abbott

Michael Abbott

Michael Abbott

Michael Abbott

tags

participants (6)