Looping-related Memory Leak
binjured at gmail.com
Mon Jun 30 19:55:00 CEST 2008
On Jun 26, 5:38 am, Carl Banks <pavlovevide... at gmail.com> wrote:
> On Jun 26, 5:19 am, Tom Davis <binju... at gmail.com> wrote:
> > I am having a problem where a long-running function will cause a
> > memory leak / balloon for reasons I cannot figure out. Essentially, I
> > loop through a directory of pickled files, load them, and run some
> > other functions on them. In every case, each function uses only local
> > variables and I even made sure to use `del` on each variable at the
> > end of the loop. However, as the loop progresses the amount of memory
> > used steadily increases.
> Do you happen to be using a single Unpickler instance? If so, change
> it to use a different instance each time. (If you just use the module-
> level load function you are already using a different instance each
> time.)
> Unpicklers hold a reference to everything they've seen, which prevents
> objects it unpickles from being garbage collected until it is
> collected itself.
> Carl Banks
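For reference, the pattern Carl describes (a fresh unpickler for every file) can be sketched roughly as below. This is a minimal Python 3 sketch, not the code from the original program: `process` and `process_directory` are hypothetical names, and the temporary directory exists only to make the example self-contained. Only a small summary value per file is kept, so each loaded object can be released before the next file is read.

```python
import os
import pickle
import tempfile


def process(obj):
    """Hypothetical stand-in for the real per-file work."""
    return len(obj)


def process_directory(path):
    """Load each pickle with the module-level pickle.load, one file at a time."""
    results = []
    for name in sorted(os.listdir(path)):
        with open(os.path.join(path, name), "rb") as fh:
            obj = pickle.load(fh)  # module-level load: no shared Unpickler
        results.append(process(obj))  # keep only a small summary, not obj
    return results


# Tiny demonstration with two pickled objects in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name, value in (("a.pkl", [1, 2, 3]), ("b.pkl", {"x": 1, "y": 2})):
        with open(os.path.join(tmp, name), "wb") as fh:
            pickle.dump(value, fh)
    demo_results = process_directory(tmp)
```

If memory still grows with a loop shaped like this, the references are probably being held somewhere other than the unpickler.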
Yes, I was using the module-level unpickler. I changed it with little
effect. I guess perhaps this is my misunderstanding of how GC works.
For instance, if I have `a = Obj()` and run `a.some_method()` which
generates a highly-nested local variable that cannot be easily garbage
collected, it was my assumption that either (1) completing the method
call or (2) deleting the object instance itself would automatically
destroy any variables used by said method. This does not appear to be
the case, however. Even when a variable/object's scope is destroyed,
it would seem that variables/objects created within that scope cannot
always be reclaimed, depending on their complexity.
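One way to test the assumption above is to watch a weak reference to the local object. The sketch below assumes CPython, where reference counting frees an object as soon as its last reference disappears, while objects that refer to each other in a cycle wait for the cyclic collector. `Node` and the `make_cycle` flag are hypothetical names used only for illustration, and the collector is switched off temporarily so that the result does not depend on when an automatic collection happens to run.

```python
import gc
import weakref


class Node:
    """Hypothetical nested structure built inside a method."""

    def __init__(self):
        self.children = []
        self.parent = None


class Obj:
    def some_method(self, make_cycle):
        root = Node()
        child = Node()
        root.children.append(child)
        if make_cycle:
            child.parent = root  # back-reference creates a reference cycle
        # Return only a weak reference so the caller does not keep root alive.
        return weakref.ref(root)


gc.disable()  # keep automatic collections from interfering with the demo
try:
    a = Obj()
    acyclic = a.some_method(make_cycle=False)
    cyclic = a.some_method(make_cycle=True)

    acyclic_freed_on_return = acyclic() is None  # reference counting freed it
    cyclic_freed_on_return = cyclic() is None    # the cycle keeps it alive

    gc.collect()  # run the cyclic collector explicitly
    cyclic_freed_after_collect = cyclic() is None
finally:
    gc.enable()
```

Under these assumptions the acyclic structure is gone as soon as the method returns, and the cyclic one is reclaimed only once the collector runs. Neither `del` nor leaving the scope is enough on its own when the objects form a cycle.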
To me, this seems illogical. I can understand that the GC is
reluctant to reclaim objects that have many connections to other
objects and so forth, but once those objects' scopes are gone, why
doesn't it force a reclaim? For instance, I can use timeit to create
an object instance, run a method of it, then `del` the variable used
to store the instance, but each loop thereafter continues to require
more memory and take more time. 1000 runs may take 0.27 usec/pass,
whereas 100000 runs take 2 usec/pass (on average).
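One detail worth checking when timing this way: by default, timeit temporarily turns off garbage collection while it runs the statement, so any reference cycles created in the loop pile up until the timing finishes. The sketch below compares the default behaviour with a run that re-enables the collector in the setup string. `Obj`, `one_pass` and `usec_per_pass` are hypothetical names, and the workload is only a placeholder for the real method, so the numbers it produces say nothing about the original program.

```python
import timeit


class Obj:
    def some_method(self):
        # Placeholder workload: build and discard a nested local structure.
        data = [[i] for i in range(100)]
        return len(data)


def one_pass():
    a = Obj()
    a.some_method()
    del a


def usec_per_pass(number, gc_enabled):
    """Average time per call in microseconds over `number` calls."""
    # timeit disables the collector by default; the setup code re-enables it.
    setup = "import gc; gc.enable()" if gc_enabled else "pass"
    total = timeit.timeit(one_pass, setup=setup, number=number)
    return total / number * 1e6


gc_off_usec = usec_per_pass(1000, gc_enabled=False)
gc_on_usec = usec_per_pass(1000, gc_enabled=True)
```

Comparing the two figures for increasing values of `number` is one way to see whether the growth in time and memory comes from the collector being off during the measurement.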