[Numpy-discussion] Memory leak/fragmentation when using np.memmap

Ralf Gommers ralf.gommers at googlemail.com
Thu May 19 19:22:44 EDT 2011


On Thu, May 19, 2011 at 1:53 AM, Pauli Virtanen <pav at iki.fi> wrote:

> On Wed, 18 May 2011 16:36:31 -0700, G Jones wrote:
> [clip]
> > As a follow-up, I managed to install tcmalloc as described in the article
> > I mentioned. Running the example I sent now shows a constant memory
> > footprint as expected. I am surprised such a solution was necessary.
> > Certainly others must work with such large datasets using numpy/python?
>
> Well, your example Python code works for me without any other changes,
> and it shows behavior identical to the C code.
>
Yes, the C code does the same thing, and yes, technically it's not a memory
leak. But that doesn't make it less of a problem: you can map a huge data
file, but in practice, once you start looping over parts of it, all your
memory disappears, and it doesn't come back. So I think this issue should be
documented in the memmap docstring. The tcmalloc workaround seems worth
documenting somewhere too.
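
For reference, a minimal sketch of the pattern that triggers this (the
file name, dtype, and chunk size here are placeholders, not the code
from the original report):

    import numpy as np

    # Map a large binary file without reading it up front.
    data = np.memmap('data.bin', dtype=np.float32, mode='r')

    chunk = 1024 * 1024  # elements per slice
    total = 0.0
    for start in range(0, data.shape[0], chunk):
        # Each slice faults pages in from disk; on affected platforms
        # the resident set size grows by roughly the amount read and
        # is not returned to the OS afterwards.
        total += float(data[start:start + chunk].sum())

    del data  # deleting the memmap does not shrink the footprint here
    print(total)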

> Things might depend on the version of the C library and the kernel, so it
> is quite possible that many do not see these issues.

I see the same thing on OS X with gcc 4.2. My memory usage increases by
almost exactly the number of MB read from disk, and cannot be reduced by
deleting objects.
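
A quick way to watch this from within the process is the resource
module (a sketch; note that ru_maxrss is reported in bytes on OS X but
in kilobytes on Linux):

    import resource
    import sys

    def rss_mb():
        # Peak resident set size of the current process, in MB.
        rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        return rss / 1e6 if sys.platform == 'darwin' else rss / 1e3

    print('before: %.1f MB' % rss_mb())
    # ... loop over the memmap as in the sketch above ...
    print('after: %.1f MB' % rss_mb())

On an affected system the "after" figure tracks the number of bytes read
from the mapping, and stays there even after the array is deleted.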

Ralf
