Leaking memory problem
Hi!
I was wondering if anyone could help me find a memory leak in a program using NumPy. My project is quite large and I haven't been able to construct a simple example that reproduces the problem.
I have an iterative algorithm whose memory usage should not grow as the iteration progresses. However, after the first iteration 1 GB of memory is used, and it steadily increases until, at about 100-200 iterations, 8 GB is used and the program exits with MemoryError.
I have a collection of objects which contain large arrays. In each iteration, the objects are updated in turn by recomputing the arrays they contain. The number of arrays and their sizes are constant (they do not change during the iteration). So the memory usage should not increase, and I'm a bit confused: how can the program run out of memory if it can easily compute at least a few iterations?
I've tried to use Pympler, but as far as I understand it doesn't show the memory usage of NumPy arrays?
I also tried gc.set_debug(gc.DEBUG_UNCOLLECTABLE) and then printing gc.garbage at each iteration, but that doesn't show anything.
Does anyone have any ideas on how to debug this kind of memory leak? And how can I find out whether the bug is in my code, in NumPy, or elsewhere?
Thanks for any help!
Jaakko
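A minimal way to watch the growth per iteration, assuming a Unix system, is the standard-library resource module; the loop body below is just a placeholder for one iteration's recomputation, not the original algorithm:

```python
import resource

def peak_rss_mb():
    # ru_maxrss is reported in kilobytes on Linux (bytes on macOS);
    # Linux is assumed here.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

for i in range(3):
    data = [0.0] * 100000  # placeholder for recomputing the large arrays
    print("iteration %d: peak RSS %.1f MB" % (i, peak_rss_mb()))
```

If the peak RSS keeps climbing even though the arrays are simply rebound each iteration, something is still holding references to the old ones.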
I added allocation tracking tools to numpy for exactly this reason. They are not very well documented, but you can see how to use them here:
https://github.com/numpy/numpy/tree/master/tools/allocation_tracking
Ray
On Mon, Feb 25, 2013 at 8:41 AM, Jaakko Luttinen jaakko.luttinen@aalto.fi wrote:
_______________________________________________
NumPyDiscussion mailing list
NumPyDiscussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpydiscussion
On Mon, Feb 25, 2013 at 8:41 AM, Jaakko Luttinen jaakko.luttinen@aalto.fi wrote:
There are some stories where Python's garbage collection kicks in too slowly. Try calling gc.collect() in the loop to see if it helps.
Roughly what I remember: collection is triggered by object counts, so if you have only a few very large arrays, memory grows but garbage collection doesn't start yet.
Josef
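Josef's suggestion amounts to forcing a full collection once per iteration. A minimal sketch of the idea (the state dict and its update rule are made up for illustration, not the original code):

```python
import gc
import numpy as np

def iterate(state, n_iter=5):
    for _ in range(n_iter):
        for key in state:
            # Rebinding drops the old array; it is freed immediately
            # unless a reference cycle keeps it alive.
            state[key] = state[key] + 1.0
        # The collector is triggered by object counts, not bytes, so a
        # few huge cyclic arrays can accumulate before it runs on its
        # own; collect explicitly each iteration.
        gc.collect()
    return state

state = iterate({"a": np.zeros(1000), "b": np.ones(1000)})
```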
Josef's suggestion is the first thing I'd try.
Are you doing any of this in C? It is easy to end up duplicating memory that you need to Py_DECREF. In a C debugger you should be able to monitor the reference count of your Python objects.
By the way, for manual tracking of reference counts you can use sys.getrefcount(). It has come in handy for me every once in a while, but usually the garbage collector is all I've needed, besides patience.
The way I usually run the gc is by doing

    gc.enable()
    gc.set_debug(gc.DEBUG_LEAK)

as pretty much my first lines, and then after everything is said and done I do something along the lines of:

    gc.collect()
    for x in gc.garbage:
        print(type(x), str(x))

You'd have to set up your program to quit before it runs out of memory, of course, but I understand you get to run quite a few iterations before failure.
Raul
On 25/02/2013 9:03 AM, josef.pktd@gmail.com wrote:
Is this with 1.7? There were a few memory leak fixes in 1.7, so if you aren't using it you should try it to be sure. And if you are using it, then there is one known memory leak bug in 1.7 that you might want to check whether you're hitting: https://github.com/numpy/numpy/issues/2969
-n
On 25 Feb 2013 13:41, "Jaakko Luttinen" jaakko.luttinen@aalto.fi wrote:
Thanks for all the answers, they were helpful!
I was using 1.7.0 and now installed from git: https://github.com/numpy/numpy/archive/master.zip
And it looks like the memory leak is gone, so I guess I was hitting that known memory leak bug. Thanks!
Jaakko
On 02/26/2013 09:04 AM, Nathaniel Smith wrote:
participants (5)
- Jaakko Luttinen
- josef.pktd@gmail.com
- Nathaniel Smith
- Raul Cota
- Thouis (Ray) Jones