[Python-bugs-list] [ python-Bugs-494320 ] Memory leaks in 2.2c1?

noreply@sourceforge.net noreply@sourceforge.net
Tue, 18 Dec 2001 08:07:42 -0800


Bugs item #494320, was opened at 2001-12-17 13:02
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=494320&group_id=5470

Category: Python Interpreter Core
>Group: Not a Bug
Status: Closed
>Resolution: Invalid
Priority: 5
Submitted By: Ben Escoto (bescoto)
Assigned to: Nobody/Anonymous (nobody)
Summary: Memory leaks in 2.2c1?

Initial Comment:
Sorry, this won't be a very good bug report, but
perhaps you can tell me how to make it better.  I wrote
a little Python project (rdiff-backup, at
http://www.stanford.edu/~bescoto/rdiff-backup).  On my
computer it takes up about 7MB of memory even for large
datasets, but several users have complained that it
takes up so much memory on their systems (hundreds of
MB) that it is totally unusable.

I suspect the problem is a memory leak in Python, but
the only way I know of isolating the problem is pretty
long, and none of the users affected know Python.  I
can't have them try an earlier version because the
program depends on generators extensively.

So, any advice?  Do you think the problem could be in
Python?  How could I go about trying to replicate this
error?  Of course, if I end up finding it is Python,
I'll try to submit a code snippet short enough to be
helpful to you guys...


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-12-18 08:07

Message:
Logged In: YES 
user_id=31435

Glad you're unstuck, Ben!  You must use what you have 
learned only for good <wink>.

----------------------------------------------------------------------

Comment By: Ben Escoto (bescoto)
Date: 2001-12-18 00:43

Message:
Logged In: YES 
user_id=218965

It turns out there was a miscommunication about the reported
memory leak and, just as you guys said, I was able to
reproduce the problem once I ran it the right way.  The
problem was ALL MY FAULT, and had nothing to do with any
memory leaks 2.2c1 may or may not have.  So, sorry for
wasting your time, but thanks for the advice, which helped
me find the problem in just a few hours.


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-12-17 14:34

Message:
Logged In: YES 
user_id=31435

Ben, there's a Big Hammer you should know about:  in a 
debug build of Python (but not a release build), the sys 
module grows a new function, sys.getobjects().  It returns 
a (Python) list of all objects in existence at the time 
it's called.  When there's a leak, this list gets bigger 
and bigger as time goes on; of course it may *also* grow 
bigger and bigger as time goes on if a program is simply 
forgetting that it's hanging on to stuff (e.g., appending 
to some bookkeeping list but forgetting to clean it up will 
make the getobjects() list grow without bound too).

It can be useful to write a little function that invokes 
getobjects(), crawls over the list to build a dict mapping 
a type to the count of the number of objects of that type 
in the list, and prints the dict.  Then call it 
periodically and stare at the output.  If there's some sort 
of leak (whether Python's fault or not), the object types 
involved stick out like a sore thumb (their counts keep 
growing).
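
A minimal sketch of the helper described above. Two things here are
assumptions rather than part of the original advice: in builds that
provide sys.getobjects() the function is called with a limit argument
(0 meaning "no limit"), and in ordinary release builds the sketch falls
back to gc.get_objects(), which only sees objects tracked by the
garbage collector. The names get_all_objects, count_by_type and
print_growth are hypothetical.

```python
import gc
import sys


def get_all_objects():
    """Return a list of live objects, as complete as this build allows."""
    if hasattr(sys, "getobjects"):
        # Only present in special debug builds; 0 means "no limit".
        return sys.getobjects(0)
    # Assumption: in a release build, settle for gc-tracked objects.
    return gc.get_objects()


def count_by_type():
    """Build a dict mapping each type to the number of live objects of it."""
    counts = {}
    for obj in get_all_objects():
        kind = type(obj)
        counts[kind] = counts.get(kind, 0) + 1
    return counts


def print_growth(before, after, top=10):
    """Print the types whose counts grew the most between two snapshots."""
    deltas = [(kind, after[kind] - before.get(kind, 0)) for kind in after]
    deltas.sort(key=lambda item: item[1], reverse=True)
    for kind, delta in deltas[:top]:
        if delta > 0:
            print("%-30s +%d" % (kind.__name__, delta))


if __name__ == "__main__":
    class Record(object):
        """Stand-in for whatever a real program keeps accumulating."""

    bookkeeping = []                      # the list nobody cleans up
    before = count_by_type()
    bookkeeping.extend(Record() for _ in range(1000))
    after = count_by_type()
    print_growth(before, after)           # Record should top the list
```

Calling count_by_type() at regular points in a long run and comparing
snapshots is usually enough to see which types keep growing.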

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-17 14:12

Message:
Logged In: YES 
user_id=6380

Forget their system configuration, as long as you're *sure*
that they're *in fact* using the same Python.

As Tim says, the answer is in the path through your code,
dependent on their data. Your program looks young and
probably has lots of features you haven't really used
yourself -- that's where you should look next.

I know, it ain't easy. :-(

----------------------------------------------------------------------

Comment By: Ben Escoto (bescoto)
Date: 2001-12-17 14:12

Message:
Logged In: YES 
user_id=218965

Oops, I submitted my comment before seeing Tim's.  As I
understand it, my program does not depend on the data much,
as it basically just copies files.  But it is good to know
that leaks are deterministic.  I will stop bothering you
guys and return iff I get a replicable useful leak example.


----------------------------------------------------------------------

Comment By: Ben Escoto (bescoto)
Date: 2001-12-17 14:09

Message:
Logged In: YES 
user_id=218965

I'd be willing to do all this, but, as I mentioned
initially, I'm not sure how to replicate the problem.  The
systems that leak all seem similar to mine (which doesn't
leak).  For instance, people running SuSE Linux 7.3 and
Debian unstable have complained, but I seem to be fine under
Red Hat 7.2 (7.1 was also OK).

Should I ask them what versions of various libraries they
are using, and then try to link python to those versions on
my system?  Which are the likely culprits?  Or is this the
wrong track altogether?  (Sorry, I don't know enough about
C/manual memory management to understand how/why the same
code would leak on one system and not on another.)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-12-17 14:08

Message:
Logged In: YES 
user_id=31435

I can't recall a case where a Python leak occurred on only 
some systems.  Leaks are deterministic bugs:  each time a 
leaking program is run, it generally leaks exactly the same 
amounts at exactly the same times.

So what's different between your system and your users'?  
Presumably the inputs, i.e. the data getting backed up.  Do 
you have data-dependent paths in your Python code that may 
not get executed on your system with your data, but would 
on others'?  It's also possible that programs you're 
calling suffer data-dependent memory growth.

Until you can reproduce a problem yourself, there's not 
much hope.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-17 14:01

Message:
Logged In: YES 
user_id=6380

(I meant, I'll be the expert, and if it doesn't require me
to load countless megabytes of data I'll even do step 2, but
I need you to perform step 1, and I could use help with step
2.)


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-17 13:58

Message:
Logged In: YES 
user_id=6380

Typically, the way to squash a leak is:

1. get a reproducible test case that grows unbounded when
watched with "top"

2. try to whittle the test case down to something really
simple by removing code until it no longer leaks (and then
going back to the previous version :-)

3. show the test case to an expert who will make an educated
guess at where in the C code to look

Can you do this?

----------------------------------------------------------------------

Comment By: Ben Escoto (bescoto)
Date: 2001-12-17 13:30

Message:
Logged In: YES 
user_id=218965

I had been following some of the memory leak bugs, and had
hoped that an upgrade to 2.2c1 would fix things.  But I
asked the affected users to upgrade, and at least one of
them claims to have the same error with 2.2c1 (others
haven't tried upgrading yet).

I might be able to get more information out of them, but
only if the procedure is relatively painless.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-17 13:22

Message:
Logged In: YES 
user_id=6380

Surely it could be Python.  You're using all the latest
features (nested scopes and generators, class and static
methods, ...). We're not *aware* of current leaks (we
stamped out a bunch a couple of weeks ago) but there
probably are some. It's also possible that you are creating
cycles that the garbage collector doesn't find (they would
have to involve types that don't support GC; fortunately you
don't use __del__ or __slots__).
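
One way to check whether a program creates reference cycles at all is
to ask the collector how many unreachable objects it finds. This is
only a sketch using the standard gc module; the Node class and
make_cycle function are hypothetical stand-ins, and what ends up in
gc.garbage depends on the Python version in use.

```python
import gc


class Node(object):
    """Tiny container used only to build a reference cycle."""

    def __init__(self):
        self.other = None


def make_cycle():
    """Create two objects that refer to each other, then drop them."""
    a, b = Node(), Node()
    a.other, b.other = b, a   # a -> b -> a: refcounts alone never hit zero


gc.collect()                  # start from a clean slate
gc.disable()                  # keep automatic collection out of the way
make_cycle()
found = gc.collect()          # returns the number of unreachable objects
gc.enable()

print("unreachable objects found by the collector:", found)
print("objects the collector could not free:", len(gc.garbage))
```

A non-zero count from gc.collect() means the cycles were found and
freed; entries left in gc.garbage are the ones the collector gave up on.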

Are you sure they aren't using it with previous 2.2 beta
versions? Some of the plugged leaks were pretty severe.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-12-17 13:17

Message:
Logged In: YES 
user_id=38388

Since you are using
* nested scopes
* static methods
and
* generators
I'd suggest first trying to identify which group of features
is causing the problem.

----------------------------------------------------------------------
