
An update - I've upgraded to the latest stable Python (2.5.2) and it's made no difference to the process growth.

Config:
  Solaris 10 x86
  Python 2.5.2
  Mailman 2.1.9 (8 incoming queue runners - the leak rate increases with this number)
  SpamAssassin 3.2.5
At this point I am looking for ways to isolate the suspected memory leak - one approach I am looking at is dtrace: http://blogs.sun.com/sanjeevb/date/200506
Any other tips appreciated!
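One dtrace experiment I have in mind (assuming the growth shows up as ordinary heap allocations) is to attach the pid provider to one of the growing qrunners and aggregate the malloc request sizes, roughly:

    dtrace -n 'pid$target::malloc:entry { @sizes = quantize(arg0); }' -p <pid of a growing qrunner>

Comparing that size distribution against a freshly restarted runner should at least tell me whether the growth is malloc-driven or coming from somewhere else (mmap, caches that never get released, etc.).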
Initial (immediately after a /etc/init.d/mailman restart):

last pid: 10330;  load averages: 0.45, 0.19, 0.15    09:13:33
93 processes: 92 sleeping, 1 on cpu
CPU states: 98.6% idle, 0.4% user, 1.0% kernel, 0.0% iowait, 0.0% swap
Memory: 1640M real, 1160M free, 444M swap in use, 2779M swap free
  PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
10314 mailman    1  59    0 9612K 7132K sleep    0:00  0.35% python
10303 mailman    1  59    0 9604K 7080K sleep    0:00  0.15% python
10305 mailman    1  59    0 9596K 7056K sleep    0:00  0.14% python
10304 mailman    1  59    0 9572K 7036K sleep    0:00  0.14% python
10311 mailman    1  59    0 9572K 7016K sleep    0:00  0.13% python
10310 mailman    1  59    0 9572K 7016K sleep    0:00  0.13% python
10306 mailman    1  59    0 9556K 7020K sleep    0:00  0.14% python
10302 mailman    1  59    0 9548K 6940K sleep    0:00  0.13% python
10319 mailman    1  59    0 9516K 6884K sleep    0:00  0.15% python
10312 mailman    1  59    0 9508K 6860K sleep    0:00  0.12% python
10321 mailman    1  59    0 9500K 6852K sleep    0:00  0.14% python
10309 mailman    1  59    0 9500K 6852K sleep    0:00  0.13% python
10307 mailman    1  59    0 9500K 6852K sleep    0:00  0.13% python
10308 mailman    1  59    0 9500K 6852K sleep    0:00  0.12% python
10313 mailman    1  59    0 9500K 6852K sleep    0:00  0.12% python
After 8 hours:

last pid: 9878;  load averages: 0.14, 0.12, 0.13    09:12:18
97 processes: 96 sleeping, 1 on cpu
CPU states: 97.2% idle, 1.2% user, 1.6% kernel, 0.0% iowait, 0.0% swap
Memory: 1640M real, 179M free, 2121M swap in use, 1100M swap free
  PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
10123 mailman    1  59    0  314M  311M sleep    1:57  0.02% python
10131 mailman    1  59    0  310M  307M sleep    1:35  0.01% python
10124 mailman    1  59    0  309M   78M sleep    0:45  0.10% python
10134 mailman    1  59    0  307M   81M sleep    1:27  0.01% python
10125 mailman    1  59    0  307M   79M sleep    0:42  0.01% python
10133 mailman    1  59    0   44M   41M sleep    0:14  0.01% python
10122 mailman    1  59    0   34M   30M sleep    0:43  0.39% python
10127 mailman    1  59    0   31M   27M sleep    0:40  0.26% python
10130 mailman    1  59    0   30M   26M sleep    0:15  0.03% python
10129 mailman    1  59    0   28M   24M sleep    0:19  0.10% python
10126 mailman    1  59    0   28M   25M sleep    1:07  0.59% python
10132 mailman    1  59    0   27M   24M sleep    1:00  0.46% python
10128 mailman    1  59    0   27M   24M sleep    0:16  0.01% python
10151 mailman    1  59    0 9516K 3852K sleep    0:05  0.01% python
10150 mailman    1  59    0 9500K 3764K sleep    0:00  0.00% python
On 6/23/08 8:55 PM, "Fletcher Cocquyt" <fcocquyt@stanford.edu> wrote:
Mike, many thanks for your (as always) very helpful response - I added the one-liner to mm_cfg.py to increase the number of incoming runners to 16. Now I am observing (via memory trend graphs) an acceleration of what looks like a memory leak - maybe from Python, currently at 2.4.
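For reference, the one-liner in mm_cfg.py was along the lines of Mark's QRUNNERS suggestion below, just with 16 slices:

    QRUNNERS[QRUNNERS.index(('IncomingRunner', 1))] = ('IncomingRunner', 16)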
I am compiling the latest 2.5.2 to see if that helps - for now the workaround is to restart mailman occasionally.
(and yes the spamassassin checks are the source of the 4-10 second delay - now those happen in parallel x16 - so no spikes in the backlog...)
Thanks again
On 6/20/08 9:01 AM, "Mark Sapiro" <mark@msapiro.net> wrote:
Fletcher Cocquyt wrote:
Hi, I am observing periods of qfiles/in backlogs in the 400-600 message range that take 1-2 hours to clear with standard Mailman 2.1.9 + SpamAssassin (the vette log shows these messages processing at an average of ~10 seconds each).
Is SpamAssassin invoked from Mailman or from the MTA before Mailman? If this is plain Mailman, 10 seconds is a hugely long time to process a single post through IncomingRunner.
If you have some SpamAssassin interface like
<http://sourceforge.net/tracker/index.php?func=detail&aid=640518&group_id=103&atid=300103>
that calls spamd from a Mailman handler, you might consider moving SpamAssassin ahead of Mailman and using something like
<http://sourceforge.net/tracker/index.php?func=detail&aid=840426&group_id=103&atid=300103>
or just header_filter_rules instead.
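As a rough sketch (the exact rule format is from memory - check Mailman/Handlers/SpamDetect.py for the authoritative tuple layout, and the X-Spam-Flag header depends on how your SpamAssassin tags messages), a header_filter_rules entry that discards anything the MTA-side SpamAssassin has already flagged could be added per list with bin/withlist:

    # save as add_spam_rule.py, then run:
    #   bin/withlist -l -r add_spam_rule <listname>
    from Mailman import mm_cfg

    def add_spam_rule(mlist):
        # discard posts the MTA-level SpamAssassin already tagged
        mlist.header_filter_rules.append(
            (r'^X-Spam-Flag:\s*YES', mm_cfg.DISCARD, False))
        mlist.Save()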
Is there an easy way to parallelize what looks like a single serialized Mailman queue? I see some posts re: multi-slice but nothing definitive
See the section of Defaults.py headed with
#####
# Qrunner defaults
#####
In order to run multiple, parallel IncomingRunner processes, you can either copy the entire QRUNNERS definition from Defaults.py to mm_cfg.py and change
('IncomingRunner', 1), # posts from the outside world
to
('IncomingRunner', 4), # posts from the outside world
which says run 4 IncomingRunner processes, or you can just add something like
QRUNNERS[QRUNNERS.index(('IncomingRunner',1))] = ('IncomingRunner',4)
to mm_cfg.py. You can use any power of two for the number.
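For reference, the stock definition (this copy is from a 2.1.9-era Defaults.py, so double check it against your own) with IncomingRunner bumped to 4 would read:

    QRUNNERS = [
        ('ArchRunner',     1), # messages for the archiver
        ('BounceRunner',   1), # for processing the qfile/bounces directory
        ('CommandRunner',  1), # commands and bounces from the outside world
        ('IncomingRunner', 4), # posts from the outside world
        ('NewsRunner',     1), # outgoing messages to the nntpd
        ('OutgoingRunner', 1), # outgoing messages to the smtpd
        ('VirginRunner',   1), # internally crafted (virgin birth) messages
        ('RetryRunner',    1), # retry temporarily failed deliveries
        ]

The power of two requirement is there because, as I recall, each runner slice claims a fixed fraction of the hash space used to name the queue entries.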
I would also like the option of working this into an overall load-balancing scheme where I have multiple SMTP nodes behind an F5 load balancer and the nodes share an NFS backend...
The following search will return some information.
<http://www.google.com/search?q=site%3Amail.python.org++inurl%3Amailman++%22load+balancing%22>
--
Fletcher Cocquyt
Senior Systems Administrator
Information Resources and Technology (IRT)
Stanford University School of Medicine
Email: fcocquyt@stanford.edu  Phone: (650) 724-7485