[ mailman-Bugs-1077587 ] Memory Leak in Bounce Runner

SourceForge.net noreply at sourceforge.net
Thu Feb 24 02:22:33 CET 2005


Bugs item #1077587, was opened at 2004-12-02 09:02
Message generated for change (Comment added) made by prubin
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=100103&aid=1077587&group_id=103

Category: bounce detection
Group: 2.1 (stable)
Status: Open
Resolution: None
Priority: 5
Submitted By: Paul Rubin (prubin)
Assigned to: Nobody/Anonymous (nobody)
Summary: Memory Leak in Bounce Runner

Initial Comment:
Something is going badly wrong with the BounceRunner: it 
is leaking memory.  After it runs for a short time it has 
consumed hundreds of megabytes of memory.  I kill it 
with -9, it restarts, and it is fine for another couple of 
hours.  Sometimes it does not restart and I have to stop 
and restart the mailman service.

This is running on a Linux box with Red Hat 9, postfix 
2.1.1, mailman 2.1.5, and python 2.2.2.

Below you will see ps output from right before I killed the 
process, and below that, output from a few seconds later.

If you will tell me what you need, I would really like to 
get to the bottom of this.  I am killing the process about 10 
times per day.  If it eats too much memory before I 
catch it, the entire system fails.

[root at tbnonline ~]# ps -U mailman -o pid,%cpu,%mem,stime,time,vsz,args
  PID %CPU %MEM STIME     TIME    VSZ COMMAND
12949  0.0  0.0 08:05 00:00:00   7068 /usr/bin/python /var/mailman/bin/mailmanctl -s -q start
12950  0.1  0.3 08:05 00:00:02  11176 /usr/bin/python /var/mailman/bin/qrunner --runner=ArchRunner:0:1 -s
12951 18.9 70.8 08:05 00:08:40 931312 /usr/bin/python /var/mailman/bin/qrunner --runner=BounceRunner:0:1 -s
12952  0.0  0.0 08:05 00:00:00   7040 /usr/bin/python /var/mailman/bin/qrunner --runner=CommandRunner:0:1 -s
12953  0.0  0.1 08:05 00:00:00   9256 /usr/bin/python /var/mailman/bin/qrunner --runner=IncomingRunner:0:1 -s
12954  0.0  0.1 08:05 00:00:00   7080 /usr/bin/python /var/mailman/bin/qrunner --runner=NewsRunner:0:1 -s
12955  2.5  0.6 08:05 00:01:11  14172 /usr/bin/python /var/mailman/bin/qrunner --runner=OutgoingRunner:0:1 -s
12956  0.8  0.2 08:05 00:00:24  10628 /usr/bin/python /var/mailman/bin/qrunner --runner=VirginRunner:0:1 -s
12957  0.1  0.2 08:05 00:00:04  13272 /usr/bin/python /var/mailman/bin/qrunner --runner=RetryRunner:0:1 -s

[root at tbnonline ~]# ps -U mailman -o pid,%cpu,%mem,stime,time,vsz,args
  PID %CPU %MEM STIME     TIME    VSZ COMMAND
12949  0.0  0.1 08:05 00:00:00   7072 /usr/bin/python /var/mailman/bin/mailmanctl -s -q start
12950  0.0  0.3 08:05 00:00:02  11176 /usr/bin/python /var/mailman/bin/qrunner --runner=ArchRunner:0:1 -s
12952  0.0  0.0 08:05 00:00:00   7040 /usr/bin/python /var/mailman/bin/qrunner --runner=CommandRunner:0:1 -s
12953  0.0  0.2 08:05 00:00:00   9256 /usr/bin/python /var/mailman/bin/qrunner --runner=IncomingRunner:0:1 -s
12954  0.0  0.1 08:05 00:00:00   7080 /usr/bin/python /var/mailman/bin/qrunner --runner=NewsRunner:0:1 -s
12955  3.0  0.9 08:05 00:01:43  13584 /usr/bin/python /var/mailman/bin/qrunner --runner=OutgoingRunner:0:1 -s
12956  1.2  0.6 08:05 00:00:41  10848 /usr/bin/python /var/mailman/bin/qrunner --runner=VirginRunner:0:1 -s
12957  0.1  0.6 08:05 00:00:06  13284 /usr/bin/python /var/mailman/bin/qrunner --runner=RetryRunner:0:1 -s
14900 29.8  1.1 08:51 00:02:47  13764 /usr/bin/python /var/mailman/bin/qrunner --runner=BounceRunner:0:1 -s
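
Rather than catching the runaway runner by eye, the check could be scripted. The following is only a rough sketch, not part of Mailman: it parses `ps -U mailman -o pid,vsz,args` output and reports any qrunner whose VSZ is above a limit. The function names and the 500,000 KB threshold are made up for illustration, and the sketch assumes a current Python 3 rather than the python 2.2.2 on the affected box.

```python
import subprocess

# Hypothetical threshold in kilobytes; tune it to the RAM of the host.
VSZ_LIMIT_KB = 500_000


def oversized_runners(ps_output, limit_kb=VSZ_LIMIT_KB):
    """Return (pid, vsz_kb, runner) for qrunner processes whose VSZ exceeds limit_kb."""
    hits = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)  # pid, vsz, remainder of the command line
        if len(fields) < 3:
            continue
        pid, vsz, args = fields
        if not (pid.isdigit() and vsz.isdigit()):
            continue
        if "--runner=" not in args:
            continue  # mailmanctl itself has no --runner option
        # "--runner=BounceRunner:0:1 -s" -> "BounceRunner"
        runner = args.split("--runner=", 1)[1].split(":", 1)[0]
        if int(vsz) > limit_kb:
            hits.append((int(pid), int(vsz), runner))
    return hits


def snapshot(user="mailman"):
    """Run ps for one user with the pid, vsz and args columns and return its text output."""
    result = subprocess.run(
        ["ps", "-U", user, "-o", "pid,vsz,args"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

Calling `oversized_runners(snapshot())` from cron every minute or so and logging the result would at least record when the growth starts, which could then be matched against the bounce logs.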


----------------------------------------------------------------------

>Comment By: Paul Rubin (prubin)
Date: 2005-02-23 20:22

Message:
Logged In: YES 
user_id=91557

One additional piece: the bounce processor sits at a small 
amount of memory until the file with the matching pid hits 
1.5GB or so, and then its memory use starts climbing fast.

If I kill the bounce processor, the file is abandoned.  If I stop 
the mailman service, the bounce processor keeps running 
and eating memory.  If allowed to run unchecked, it will just 
eat all the memory in the system.

I cannot kill the bounce process without another starting, 
even after stopping the mailman service. 

If I freshly reboot the server and let mailman run for a few 
minutes, the file grows.  When I stop the service, the file 
shrinks back to 0 bytes but does not get deleted.

I hope this helps.




----------------------------------------------------------------------

Comment By: Paul Rubin (prubin)
Date: 2005-02-23 19:35

Message:
Logged In: YES 
user_id=91557

Today we ran out of disk space.  I have had to kill the bounce 
processor eight or nine times today...

I found my diskspace problem:

-rw-rw-rw-    1 mailman  mailman      1.4G Feb 23 08:48 bounce-events-07208.pck
-rw-rw-rw-    1 mailman  mailman      931M Feb 23 09:50 bounce-events-08307.pck
-rw-rw-rw-    1 mailman  mailman      1.1G Feb 23 10:10 bounce-events-09037.pck
-rw-rw-rw-    1 mailman  mailman      1.4G Feb 23 10:29 bounce-events-10251.pck
-rw-rw-rw-    1 mailman  mailman      1.6G Feb 23 13:02 bounce-events-14874.pck
-rw-rw-rw-    1 mailman  mailman      1.4G Feb 23 14:17 bounce-events-17525.pck
-rw-rw-rw-    1 mailman  mailman      1.6G Feb 23 14:40 bounce-events-18973.pck
-rw-rw-rw-    1 mailman  mailman      1.6G Feb 23 15:02 bounce-events-19879.pck
-rw-rw-rw-    1 mailman  mailman      1.5G Feb 23 15:23 bounce-events-20584.pck

And about 100 more files from other days.

I have saved these files and deleted the rest.  Can these 
files tell you anything about what is going wrong?
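
One way to look inside such a file, offered as a sketch only: the .pck extension suggests Python pickle data, and this example assumes (it is not confirmed anywhere in this report) that each file is a series of pickles written one after another. The helper names are hypothetical. It is written for Python 3; files produced by python 2.2.2 that contain Mailman message objects would most likely need the Mailman modules importable, and possibly the original interpreter, to load. Only unpickle files you trust.

```python
import pickle


def iter_pickled_records(fileobj):
    """Yield successive objects from a binary file holding back-to-back pickles."""
    while True:
        try:
            yield pickle.load(fileobj)
        except EOFError:
            # Raised when no further pickle starts at the current offset.
            return


def summarize(path, keep=10):
    """Count the records in one file and keep the first few for inspection."""
    count = 0
    first = []
    with open(path, "rb") as fp:
        for record in iter_pickled_records(fp):
            count += 1
            if len(first) < keep:
                first.append(record)
    return count, first
```

Records are read one at a time, so a 1.5GB file does not have to fit in memory. The record count, and whether the same address or list dominates the first few records, would be the interesting things to report back.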

 

----------------------------------------------------------------------

Comment By: Tokio Kikuchi (tkikuchi)
Date: 2004-12-16 20:29

Message:
Logged In: YES 
user_id=67709

I suggest stopping automatic processing of bounces by setting
the bounce_processing variable to 'No' on the admin/bounce page.
It looks like your server is very busy, and the unsubscribing
process triggered by the bounce score may be interfering. 
You may also have to unsubscribe the problematic members
manually.



----------------------------------------------------------------------

Comment By: Paul Rubin (prubin)
Date: 2004-12-15 12:29

Message:
Logged In: YES 
user_id=91557

I do not have a specific number of bounces per day.  We 
send around 500,000 messages per day, with peak days around 
1,000,000.  We know that we have some bad addresses, but 
at any given time there should not be more than 5,000 
bounce notices per day (full mailboxes are common).

As far as I can tell, only certain bounce notices cause the 
leak.  We can go hours or days with memory being almost 
flat, then suddenly 200M vanishes in 2 or 3 minutes.

Is there any way I could 'hack' the code to somehow grab 
the information about which bounce notice is causing the 
problem?  Or to capture all bounce notices in some 
out-of-the-way space that I could tar up and send to you for 
testing?

If nothing else, I could edit the aliases to copy all of the 
messages to a file and zip it up for you after a few days.

Does any of this make any sense?  
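
To make the alias idea concrete, here is a minimal sketch of a capture helper that appends each raw message to an mbox file using the standard library's mailbox module. The function name and the mbox path are hypothetical, nothing here is Mailman code, and it assumes Python 3 rather than the python 2.2.2 on this box.

```python
import mailbox


def capture_message(raw_bytes, mbox_path):
    """Append one raw message to an mbox file and return the new message count."""
    box = mailbox.mbox(mbox_path)  # the file is created if it does not exist
    box.lock()  # guard against concurrent deliveries writing at the same time
    try:
        box.add(raw_bytes)
        box.flush()
        return len(box)
    finally:
        box.unlock()
        box.close()
```

A small wrapper script could read the message from standard input (`sys.stdin.buffer.read()`) and pass it to `capture_message`, and the bounce alias could then deliver to that script in addition to Mailman. The resulting mbox could be compressed and replayed against a test installation to find the notice that triggers the growth.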

----------------------------------------------------------------------

Comment By: Tokio Kikuchi (tkikuchi)
Date: 2004-12-14 19:19

Message:
Logged In: YES 
user_id=67709

My FreeBSD 4.7/Solaris 8 installations have no such problems.
How many bounces do you get on this system?



----------------------------------------------------------------------


More information about the Mailman-coders mailing list