[Numpy-discussion] Memory leak found in ndarray (I think)?

Wes McKinney wesmckinn at gmail.com
Mon Jul 12 14:22:17 EDT 2010


This one was quite a bear to track down, starting from the very
high-level observation of "why is my application leaking memory?".
I've reproduced it on Windows XP using NumPy 1.3.0 on Python 2.5 and
NumPy 1.4.1 on Python 2.6 (EPD). Basically, it seems that calling
.astype(bool) on an ndarray slice with object dtype leaves a hanging
reference count behind. It should be pretty obvious to see:

from datetime import datetime
import numpy as np
import sys

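def foo(verbose=True):
    # Object-dtype array of datetimes, reshaped so a column slice is non-contiguous
    arr = np.array([datetime.today() for _ in xrange(1000)])
    arr = arr.reshape((500, 2))
    sl = arr[:, 0]

    if verbose: print 'Ref ct of index 0: %d' % sys.getrefcount(sl[0])

    # The result is discarded each time, so the ref count should not grow
    for _ in xrange(10):
        result = sl.astype(bool)

    if verbose: print 'Ref ct of index 0: %d' % sys.getrefcount(sl[0])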

if __name__ == '__main__':
    foo()
    for i in xrange(10000):
        if not i % 1000: print i
        foo(verbose=False)

On my machine this leaks about 100 MB of memory that you don't get
back. Let me know if I've misinterpreted the results. I'll happily
create a ticket on the Trac page.
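
To check a single object rather than watching overall process memory,
a minimal sketch along these lines may help. Note that refcount_delta
is a hypothetical helper name, not part of NumPy, and the numbers it
reports rely on CPython's reference counting. The NumPy part at the
bottom is only an illustration of the case described above and is
skipped if NumPy is not installed:

```python
import sys


def refcount_delta(obj, operation, repeats=10):
    """Return the change in sys.getrefcount(obj) after calling operation() repeatedly.

    Hypothetical helper for illustration. The return value of each call is
    discarded, so a nonzero delta suggests the operation left references behind.
    """
    before = sys.getrefcount(obj)
    for _ in range(repeats):
        operation()
    after = sys.getrefcount(obj)
    return after - before


if __name__ == "__main__":
    try:
        import numpy as np
    except ImportError:
        np = None

    if np is not None:
        from datetime import datetime

        stamp = datetime.today()
        # Same shape of data as in the report: object dtype, column slice
        arr = np.array([stamp] * 1000, dtype=object).reshape((500, 2))
        sl = arr[:, 0]
        # A nonzero value here would point at references not being released
        print(refcount_delta(stamp, lambda: sl.astype(bool)))
```

The helper takes the object to watch and a zero-argument callable, so
the same check can be pointed at other conversions or slices without
changing the measurement itself.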

Thanks,
Wes


