I'm afraid this is going to be a long one, and I don't see any good way to cut down the quoted text either... Charles, I'm going to plead with you to read what I've just written and think about it. I'm trying to make the case as clear as I can. I think the case is actually extremely simple: the existing @name@_arrtype_new code is broken. On Thu, 17 Jul 2008, Charles R Harris wrote:
On Tue, Jul 15, 2008 at 9:28 AM, Michael Abbott
wrote:

--- numpy/core/src/scalartypes.inc.src   (revision 5411)
+++ numpy/core/src/scalartypes.inc.src   (working copy)
@@ -1925,19 +1925,30 @@
         goto finish;
     }
+    Py_XINCREF(typecode);
     arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL);
-    if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr;
+    if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) {
+        Py_XDECREF(typecode);
+        return arr;
+    }
     robj = PyArray_Return((PyArrayObject *)arr);

 finish:
-    if ((robj==NULL) || (robj->ob_type == type)) return robj;
+    if ((robj==NULL) || (robj->ob_type == type)) {
+        Py_XDECREF(typecode);
+        return robj;
+    }
     /* Need to allocate new type and copy data-area over */
     if (type->tp_itemsize) {
         itemsize = PyString_GET_SIZE(robj);
     }
     else itemsize = 0;
     obj = type->tp_alloc(type, itemsize);
-    if (obj == NULL) {Py_DECREF(robj); return NULL;}
+    if (obj == NULL) {
+        Py_XDECREF(typecode);
+        Py_DECREF(robj);
+        return NULL;
+    }
     if (typecode==NULL) typecode = PyArray_DescrFromType(PyArray_@TYPE@);
     dest = scalar_value(obj, typecode);
     src = scalar_value(robj, typecode);
     Py_DECREF(typecode);
Ahah. That DECREF balances the original PyArray_DescrFromType call, or maybe the later one ... and of course this has to happen on *ALL* return paths. If we now take a closer look at the patch, we can see that it's doing two separate things:
1. There's an extra Py_XINCREF to balance the ref count lost to PyArray_FromAny and ensure that typecode survives long enough;
2. Every early return path has an extra Py_XDECREF to balance the creation of typecode.
I rest my case for this patch.
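The shape of those two changes can be seen in miniature with a toy ownership model (pure Python, illustration only -- the dict below stands in for CPython's refcount field, and steals_ref plays the role of PyArray_FromAny):

```python
# Toy model of reference ownership (not CPython internals):
# counts[obj] stands in for an object's refcount field.
counts = {}

def new_ref():
    # Plays the role of PyArray_DescrFromType: caller now owns one reference.
    obj = object()
    counts[obj] = 1
    return obj

def incref(obj):
    counts[obj] += 1

def decref(obj):
    counts[obj] -= 1
    if counts[obj] == 0:
        del counts[obj]          # the object is destroyed

def steals_ref(obj):
    # Plays the role of PyArray_FromAny: consumes one caller-owned reference.
    decref(obj)

def routine(early_return):
    typecode = new_ref()         # we are responsible for one reference
    incref(typecode)             # the patch's Py_XINCREF: survive the theft
    steals_ref(typecode)         # the reference-stealing call
    if early_return:
        decref(typecode)         # the patch's Py_XDECREF on the early path
        return
    # ... typecode may be used safely here ...
    decref(typecode)             # the routine's final Py_DECREF

routine(early_return=True)
routine(early_return=False)
assert not counts                # every path balances: nothing leaked
```

Whichever path the routine takes, the single reference created at the top is discharged exactly once, which is the whole point of the patch.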
> I still haven't convinced myself of this.
Seriously? Please bear with me for a bit, then. Let me try to go back to basics. Reference counting discipline should be very simple (I posted some links in my Wed, 9 Jul 2008 08:36:27 message to this list). Let me try to capture it in the following principles:

1. Each routine is wholly responsible for the reference counts it creates. This responsibility can be fulfilled by:
   1.a Decrementing the reference count.
   1.b Returning the object to the caller (which accounts for exactly one reference count).
   1.c Explicitly placing the object in another reference counted structure (similarly, this accounts for one reference count, and of course creates some extra obligations which I won't discuss).
   1.d Passing the object to a routine which indulges in "reference count theft". In some sense this can be thought of as an instance of 1.c, and it is how PyList_SetItem and PyTuple_SetItem behave.

2. Reference counts are created by:
   2.a Calling a routine which returns an object.
   2.b Explicitly incrementing the reference count.

3. When the known reference count on an object reaches zero, the object must be treated as lost (and any further reference to it is undefined). In particular, a routine cannot make any assumptions about reference counts it has not created.

I need to make a bit of a digression on the subject of reference count theft. I am only aware of three routines that do this: PyList_SetItem, PyTuple_SetItem and PyArray_FromAny. The first two effectively assign the reference into a preexisting structure, so they can be regarded as instances of principle 1.c. The last is a disaster area. I know (by inspecting the code) that PyArray_FromAny may choose to erase its typecode argument (by decrementing the reference), but I can't tell (without digging deeper than I care to go) whether this only occurs on the NULL return branch. However, this is a bit of a red herring.
If we recognise that PyArray_FromAny steals the reference count, the remaining analysis goes through.

A further note on 1.c: when an object is placed in another reference counted structure, its reference count should normally be incremented -- case 1.c is really just an elision of this increment with the decrement required by obligation 1.a. This normal case can be seen halfway through PyArray_Scalar.

Ok, preliminaries over: now to look at the code. Here is a caricature of the routine @name@_arrtype_new with the important features isolated:

     1  PyArray_Descr *typecode = NULL;
     2  if (...) goto finish;                 // _WORK@work@
     3  typecode = PyArray_DescrFromType(...);
     4  if (...) { ...typecode...; goto finish; }
     5  arr = PyArray_FromAny(..., typecode, ...);
     6  if (...) return arr;
     7  finish:
     8  if (...) return robj;
     9  if (...) return NULL;
    10  if (typecode == NULL) typecode = PyArray_DescrFromType(...);
    11  ... scalar_value(..., typecode); ...
    12  Py_DECREF(typecode);

So let's go through this line by line.

1. We start with a NULL typecode. Fine.
2. A hidden goto in a macro. Sweet. Let's have more code like this.
3. Here's the reference count we're responsible for.
4. If obj is NULL we use the typecode ...
5. ... otherwise we pass it to PyArray_FromAny.
6. The first early return.
7. All paths (apart from 6) come together here.

So at this point let's take stock. typecode is in one of three states: NULL (path 2, or if creation at 3 failed), allocated with a single reference count (path 4), or lost (path 5). This is not good. LET ME EMPHASISE THIS: the state of the code at the finish label is dangerous and simply broken. The state at the finish label is indeterminate: typecode has either been lost by passing it to PyArray_FromAny (in which case we're not allowed to touch it again), or else it carries a reference count that we're still responsible for.
There seems to be a fantasy, expressed in a comment in a recent update to this routine, that PyArray_Scalar has stolen a reference, but fortunately a quick code inspection (of arrayobject.c) quickly refutes this frightening possibility.

So the only way to fix the problem at (7) is to unify the two non-NULL cases. One answer is to add a DECREF at (4), but we see at (11) that we still need typecode at (7) -- so the only solution is to add an extra INCREF just before (5). This then, sadly, means that we also need an extra DECREF at (6). PLEASE don't suggest moving the INCREF until after (6) -- at that point typecode is lost and may have been destroyed, and relying on any possibility to the contrary is a recipe for continued screw-ups.

The rest is easy. Once we've established the invariant that typecode is either NULL or holds exactly one reference count at (7), the two early returns at (8) and (9) unfortunately need to be augmented with DECREFs. And we're done.

Responses to your original comments:
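To make the invariant concrete, here is the caricature's ownership of typecode modelled in Python (a sketch, not the real C code; the line numbers in the comments refer to the caricature, and count stands for the references we own):

```python
# Sketch: how many references to typecode the routine owns at each point,
# along the two non-NULL paths of the caricature, with and without the fix.

def refs_owned_after_decref(reaches_fromany, patched):
    count = 1                      # (3) PyArray_DescrFromType: one reference
    if reaches_fromany:
        if patched:
            count += 1             # the fix: INCREF just before (5)
        count -= 1                 # (5) PyArray_FromAny steals one reference
    # (7) the finish label
    count -= 1                     # (12) the unconditional Py_DECREF
    return count                   # 0 = balanced; -1 = decremented a ref we
                                   #     did not own (undefined behaviour)

# Unpatched: the two paths disagree at (7), so (12) cannot be right for both.
assert refs_owned_after_decref(reaches_fromany=False, patched=False) == 0
assert refs_owned_after_decref(reaches_fromany=True,  patched=False) == -1

# Patched: both paths reach (7) owning exactly one reference, and (12) balances.
assert refs_owned_after_decref(reaches_fromany=False, patched=True) == 0
assert refs_owned_after_decref(reaches_fromany=True,  patched=True) == 0
```

(The model deliberately omits the early returns at (6), (8) and (9); in the patched code each of those carries its own Py_XDECREF, as the patch shows.)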
> By the time we hit finish, robj is NULL or holds a reference to typecode and the NULL case is taken care of up front.

robj has nothing to do with the lifetime management of typecode; the only issue is the early return. After the finish label, typecode is either NULL (no problem) or else holds a single reference count that needs to be accounted for.
> Later on, the reference to typecode might be decremented,

That *might* is at the heart of the problem. You can't be so cavalier about managing references.
> perhaps leaving robj crippled, but in that case robj itself is marked for deletion upon exit.

Please ignore robj in this discussion; it's beside the point.
> If the garbage collector can handle zero reference counts I think we are alright.

No, no, no. This has nothing to do with the garbage collector. If we screw up our reference counts here, the garbage collector isn't going to dig us out of the hole.
> I admit I haven't quite followed all the subroutines and macros, which descend into the hazy depths without the slightest bit of documentation, but at this point I'm inclined to leave things alone unless you have a test that shows a leak from this source.

Part of my point is that proper reference count discipline should not require any descent into subroutines (except for the very nasty case of reference theft, which I think is generally agreed to be a bad thing).
As for the test case, try this one (you'll need a debug build):

    import numpy
    import sys

    r = range(100)
    refs = sys.gettotalrefcount()
    for i in r:
        numpy.float32()
    print sys.gettotalrefcount() - refs

You should get one leak per float32() call.

I've just noticed, looking at the latest revision of the code, that somebody seems to be under the misapprehension that PyArray_Scalar steals reference counts. Fortunately this is not true. Maybe this is part of the confusion?
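For regression testing, the same check can be wrapped up so that it degrades gracefully on a non-debug interpreter (a sketch; the helper name is mine, and sys.gettotalrefcount exists only on interpreters built with --with-pydebug):

```python
import sys

def leaked_refs_per_call(fn, iterations=100):
    """Rough number of references leaked per call to fn, or None if this
    interpreter was not built with --with-pydebug."""
    if not hasattr(sys, "gettotalrefcount"):
        return None                       # normal build: cannot measure
    fn()                                  # warm up one-time caches first
    before = sys.gettotalrefcount()
    for _ in range(iterations):
        fn()
    after = sys.gettotalrefcount()
    return (after - before) / iterations

# A well-behaved callable should report roughly zero on a debug build,
# and None on an ordinary build.
result = leaked_refs_per_call(lambda: [])
assert result is None or abs(result) < 1
```

Running it with `lambda: numpy.float32()` against the unpatched code should report about one leaked reference per call, matching the test above.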