[Python-bugs-list] [ python-Bugs-782369 ] Massive memory leak in array module

SourceForge.net noreply@sourceforge.net
Tue, 05 Aug 2003 04:25:43 -0700


Bugs item #782369, was opened at 2003-08-03 11:15
Message generated for change (Comment added) made by rhettinger
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=782369&group_id=5470

Category: Extension Modules
Group: Python 2.3
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Tim Peters (tim_one)
Assigned to: Raymond Hettinger (rhettinger)
Summary: Massive memory leak in array module

Initial Comment:
Simplified variant of a test program from c.l.py:

.import array
.
.def test():    
.    for x in xrange(1000000):
.        b = array.array('B', range(120, 120+64))
.
.test()

On Win98 under 2.3, it zooms to 200MB in a few 
seconds.  No growth under 2.2.3 (it stays under 2MB 
until it finishes).

Problem goes away if

.        b = range(120, 120+64)

instead.  Problem also goes away if

.        b = array.array('B', range(64))

instead.  I expect (but ave  not confirmed) that in the 
latter cases it's leaking refcounts (to the immortal "little 
integers").

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-08-05 06:25

Message:
Logged In: YES 
user_id=80475

Committed as Modules/arraymodule.c 2.91

I don't recall any other changes from Py_ListGetItem to 
PySequence_GetItem but will scan for them this weekend.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-08-04 16:34

Message:
Logged In: YES 
user_id=33168

Patch looks good to me.  Do you know if there were any other
changes from Py_ListGetItem to Py_SequenceGetItem?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-08-04 01:01

Message:
Logged In: YES 
user_id=80475

Attaching a patch for a second review.

The refcnt difference between Py_SequenceGetItem and 
Py_ListGetItem bit me in the behind.  Added the DECREF 
where Neal suggested but used a regular DECREF instead of 
an XDECREF because it is preceded by a guard.

The question then arose as to where the decreffing would 
affect the subsequent code, so I took another look at the two 
following sections.  It appears to me (and I really could use a 
second confirmation on this) that the String and UniCode 
sections were unnecessarily recopying the initializer when 
the length was greater than zero.  So, I changed the "if" to 
an "else if" to preclude the double copy and to resolve the 
question about the validity of the decref.

Also, I added the NULL check for the getitem as Tim 
suggested.

With the patch, all test cases (including Skip's test) and the 
OP's code run fine.  

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2003-08-03 18:32

Message:
Logged In: YES 
user_id=44345

Sorry - thought I was saving Raymond about 30 seconds of
work.  I'll remember in the future.  Also, I checked in a version
with the required hasattr() check.


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-08-03 18:21

Message:
Logged In: YES 
user_id=31435

Neal, yup, that's the leak.  PySequence_GetItem() either 
returns NULL or a new reference, and in the latter case the 
result must always be decrefed; PyList_GetItem() returned a 
borrowed reference (or NULL).  The code should also check 
PySequence_GetItem() for an error (NULL) return, although 
the laziness of not checking for that was probably inherited 
from earlier code.

Skip, tests that use a refcount-revealing function should be 
conditionalized on hasattr(sys, 'name_of_function'), to avoid 
creating headaches for Jython testing.  You shouldn't check 
in failing tests independent of fixes:  the intent is that 
someone getting a fresh checkout should never see a test 
failure.

Leaving assigned to Raymond for now.

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2003-08-03 18:02

Message:
Logged In: YES 
user_id=44345

Added a test case to test_array.py (1.25)


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-08-03 12:50

Message:
Logged In: YES 
user_id=33168

I think the problem may be due to the change on line 1760 of
arraymodule from PyList_GetItem (2.2) to PySequence_GetItem
(2.3) in 2.88 by Raymond.  

Adding a Py_XDECREF(v) at line 1765 "fixes" the problem, but
I don't think that's necessarily the correct solution for
all types.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=782369&group_id=5470