
Attached is a ~30-line demo file that unexpectedly eats more memory on each iteration. I have tried it with Python on Windows and on the Mac, and in both cases I typically run out of memory before the loop completes. As best as I can tell, the final scipy.fftpack.ifft2 call in the loop mallocs a 256 MB block of memory internally that is never referenced in Python and is never freed.

With 32-bit EPD 7.1-2 (numpy 1.6.1; scipy 0.9.0) on the Mac, I can see 256 MB blocks that seem to be created during each call to scipy.fftpack.ifft2. With EPD 7.2-2 (numpy 1.6.1; scipy 0.10.0), there are fewer, but larger, allocated blocks of memory and the test does complete, but the mallocs grow to a total of 2.2 GB.

Could someone confirm for me that this is a real scipy bug and not user error? Is there a ticket mechanism?

Brian

I'm assuming that you are expecting the address of CC to remain constant. As written, it will not: fft2 returns a new array, as do ifftshift and ifft2. You can fill an existing array with the answer by creating ffta and CC outside the loop and then filling them with the CC[:, :] = blah() syntax (sketched below the quoted message). The modified script is attached.

With these changes, the script still uses a lot of memory (high-water mark of 2.4 GB), but the memory usage does not grow without bound, at least for me on the same platform (mac, epd 7.2).

-matt

On Sat, Mar 24, 2012 at 12:35 PM, Brian Toby <brian.toby@anl.gov> wrote:
Attached is a ~30-line demo file that unexpectedly eats more memory on each iteration. I have tried it with Python on Windows and on the Mac, and in both cases I typically run out of memory before the loop completes. As best as I can tell, the final scipy.fftpack.ifft2 call in the loop mallocs a 256 MB block of memory internally that is never referenced in Python and is never freed.
With 32-bit EPD 7.1-2 (numpy 1.6.1; scipy 0.9.0) on the Mac, I can see 256 MB blocks that seem to be created during each call to scipy.fftpack.ifft2. With EPD 7.2-2 (numpy 1.6.1; scipy 0.10.0), there are fewer, but larger, allocated blocks of memory and the test does complete, but the mallocs grow to a total of 2.2 GB.
Could someone confirm for me that this is a real scipy bug and not user error? Is there a ticket mechanism?
Brian
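A minimal sketch of the preallocate-and-fill pattern described above; the array names ref, ffta, and CC and the fft2/ifftshift/ifft2 sequence are assumptions about the attached demo, which is not reproduced here:

import numpy as np
import scipy.fftpack as sf

ref = np.random.rand(4096, 4096)

# Allocate the work arrays once, outside the loop.
ffta = np.zeros(ref.shape, dtype=np.complex128)
CC = np.zeros(ref.shape, dtype=np.complex128)

for ii in range(20):
    # Slice assignment copies each result into the existing arrays,
    # so ffta and CC keep the same addresses on every iteration.
    ffta[:, :] = sf.fft2(ref)
    CC[:, :] = sf.ifft2(sf.ifftshift(ffta))

Note that fft2, ifftshift, and ifft2 still allocate new arrays for their return values; the slice assignment only pins ffta and CC to fixed memory, a point taken up later in the thread.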

Matt Terry <matt.terry <at> gmail.com> writes:
I'm assuming that you are expecting the address of CC to remain constant. As written, it will not: fft2 returns a new array, as do ifftshift and ifft2. You can fill an existing array with the answer by creating ffta and CC outside the loop and then filling them with the CC[:, :] = blah() syntax. The modified script is attached.
With these changes, the script still uses a lot of memory (high-water mark of 2.4 GB), but the memory usage does not grow without bound, at least for me on the same platform (mac, epd 7.2).
-matt
That is a very nice trick to force reuse of memory, but it makes it even clearer that there is a memory leak in scipy.fftpack. In my previous code I was expecting Python to garbage collect and delete the unreferenced objects, but with your change the arrays are reused, so even that is not needed. Either way, this code should not require any additional memory after the first iteration, particularly now that ffta and CC are reused. However, this is clearly not what I see. Below is a map of the major allocated memory blocks; note that the total grows by 320 MB after each iteration.

I have found a work-around: using the numpy.fft routines in place of scipy.fftpack. When I make this change, the memory use stays constant at 432 MB after every iteration.

Brian

BHT3:~ toby$ vmmap 65978 | grep MALLOC | grep "[GM]]"    # (after 1st iteration)
MALLOC_LARGE   0492c000-2296c000   [480.2M]
MALLOC_LARGE   2a96c000-3e96c000   [320.0M]
MALLOC                             [815.8M]

BHT3:~ toby$ vmmap 65978 | grep MALLOC | grep "[GM]]"    # (after 2nd iteration)
MALLOC_LARGE   0492c000-2296c000   [480.2M]
MALLOC_LARGE   2a96c000-3e96c000   [320.0M]
MALLOC_LARGE   4296c000-5696c000   [320.0M]
MALLOC                             [1.1G]

BHT3:~ toby$ vmmap 65978 | grep MALLOC | grep "[GM]]"    # (after 3rd iteration)
MALLOC_LARGE   0492c000-2296c000   [480.2M]
MALLOC_LARGE   2a96c000-3e96c000   [320.0M]
MALLOC_LARGE   4296c000-6a96c000   [640.0M]
MALLOC                             [1.4G]

BHT3:~ toby$ vmmap 65978 | grep MALLOC | grep "[GM]]"    # (after 4th iteration)
MALLOC_LARGE   0492c000-2296c000   [480.2M]
MALLOC_LARGE   2a96c000-3e96c000   [320.0M]
MALLOC_LARGE   4296c000-7e96c000   [960.0M]
MALLOC                             [1.7G]

BHT3:~ toby$ vmmap 65978 | grep MALLOC | grep "[GM]]"    # (after 5th iteration)
MALLOC_LARGE   0492c000-0e96c000   [160.2M]
MALLOC_LARGE   1296c000-2296c000   [256.0M]
MALLOC_LARGE   2a96c000-3e96c000   [320.0M]
MALLOC_LARGE   4296c000-8296c000   [1.0G]
MALLOC_LARGE   c0000000-d0000000   [256.0M]
MALLOC                             [2.0G]
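A sketch of that work-around, under the same assumptions about the demo's array names; numpy.fft provides fft2, ifft2, and ifftshift with the call pattern used here:

import numpy as np
import numpy.fft as nf   # used in place of scipy.fftpack

ref = np.random.rand(4096, 4096)
for ii in range(20):
    ffta = nf.fft2(ref)                    # numpy.fft.fft2 instead of scipy.fftpack.fft2
    CC = nf.ifft2(nf.ifftshift(ffta))      # per the report above, memory use stays flat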

On Sat, Mar 24, 2012 at 10:22 PM, Brian Toby <brian.toby@anl.gov> wrote:
Matt Terry <matt.terry <at> gmail.com> writes:
I'm assuming that you are expecting the address of CC to remain constant. As written, it will not: fft2 returns a new array, as do ifftshift and ifft2. You can fill an existing array with the answer by creating ffta and CC outside the loop and then filling them with the CC[:, :] = blah() syntax. The modified script is attached.
With these changes, the script still uses a lot of memory (high-water mark of 2.4 GB), but the memory usage does not grow without bound, at least for me on the same platform (mac, epd 7.2).
That is a very nice trick to force reuse of memory, but it makes it even clearer that there is a memory leak in scipy.fftpack. In my previous code I was expecting Python to garbage collect and delete the unreferenced objects, but with your change the arrays are reused, so even that is not needed.
I feel I should point out that they are only "reused" up to a point: sf.ifft2 is here returning an array *that ifft2 is allocating*. CC[:, :] = sf.ifft2(CC) will copy the contents of that newly allocated array into the array currently referenced by CC, but the array allocated by ifft2 will still need to be garbage collected before that memory is freed. The code can't "know" that its output array is going to be the LHS of that expression, because the Python interpreter has no way of doing that kind of introspection. (The way to do this in your own code, if you don't want memory allocated, is to pass in an output array; see the sketch after the session below.) You can force a garbage collection at every iteration by sticking "gc.collect()" in the loop.

I see that fft2 and ifft2 have an "overwrite_x" parameter, which is what you actually want, but it is *quite* broken. Normally these things only work with Fortran-contiguous inputs, but this isn't working at all:
>>> a = numpy.array(numpy.random.randn(2, 2), order='F')
>>> a
array([[ 0.18671055, -1.01763466],
       [-0.40909016, -0.43029087]])
>>> a.flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> b = scipy.fftpack.fft2(a, overwrite_x=True)
>>> b is a
False
>>> b.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> a
array([[ 0.18671055, -1.01763466],
       [-0.40909016, -0.43029087]])
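To illustrate the "pass in an output array" idiom mentioned above in code one writes oneself, here is a generic numpy sketch; scipy.fftpack's fft2/ifft2 take no output argument, so this does not apply to them directly:

import numpy as np

a = np.random.rand(1024, 1024)
b = np.random.rand(1024, 1024)
out = np.empty_like(a)

# The third positional argument of a ufunc is an output array: the
# product is written into "out" instead of a freshly allocated block.
np.multiply(a, b, out)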

I feel I should point out that they are only "reused" up to a point: sf.ifft2 is here returning an array *that ifft2 is allocating*.
True enough, but once the output array has been copied over the input array, the output array is unreferenced and becomes a candidate for cleanup.
You can force a garbage collection at every iteration by sticking "gc.collect()" in the loop.
Good suggestion. That would distinguish a genuine memory leak from a lag in garbage collection. Indeed, when I strip the script down and do that (see below), I still see the memory use grow on every iteration, now by 256 MB. This stranded block has a different address than the one returned by fft2, but that is the only routine that could be creating it.

import numpy as np
import scipy.fftpack as sf
import pdb
import gc
from numpy.version import version as np_version
from scipy.version import version as sp_version

print "numpy %s; scipy %s" % (np_version, sp_version)
ref = np.random.rand(4096, 4096)
for ii in range(1, 20):
    print 'loop=', ii
    ffta = sf.fft2(ref)
    print 'ffta address=', hex(ffta.ctypes.data)[2:]
    gc.collect()
    pdb.set_trace()

I found the Trac site and put in a ticket.
participants (3)
- Brian Toby
- David Warde-Farley
- Matt Terry