[Mailman-Users] Python process size grows 30x in 8 hours (memory leak?)
Fletcher Cocquyt
fcocquyt at stanford.edu
Tue Jul 1 18:19:10 CEST 2008
An update: I've upgraded to the latest stable Python (2.5.2) and it has made
no difference to the process growth:
Config:
Solaris 10 x86
Python 2.5.2
Mailman 2.1.9 (8 IncomingRunner queue runners; the leak rate increases with the number of runners)
SpamAssassin 3.2.5
At this point I am trying to isolate the suspected memory leak, and I am
looking at using DTrace: http://blogs.sun.com/sanjeevb/date/200506
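Something along these lines (untested) is what I have in mind: attach the DTrace
pid provider to one of the growing python runner processes and aggregate malloc
request sizes by user stack, which should show where the growth is coming from:

  dtrace -n 'pid$target::malloc:entry { @bytes[ustack()] = sum(arg0); }' -p <runner pid>

Ctrl-C prints the aggregation; <runner pid> is one of the python PIDs shown in the
top output below.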
Any other tips appreciated!
Initial (immediately after a /etc/init.d/mailman restart):
last pid: 10330; load averages: 0.45, 0.19, 0.15    09:13:33
93 processes: 92 sleeping, 1 on cpu
CPU states: 98.6% idle, 0.4% user, 1.0% kernel, 0.0% iowait, 0.0% swap
Memory: 1640M real, 1160M free, 444M swap in use, 2779M swap free
PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
10314 mailman 1 59 0 9612K 7132K sleep 0:00 0.35% python
10303 mailman 1 59 0 9604K 7080K sleep 0:00 0.15% python
10305 mailman 1 59 0 9596K 7056K sleep 0:00 0.14% python
10304 mailman 1 59 0 9572K 7036K sleep 0:00 0.14% python
10311 mailman 1 59 0 9572K 7016K sleep 0:00 0.13% python
10310 mailman 1 59 0 9572K 7016K sleep 0:00 0.13% python
10306 mailman 1 59 0 9556K 7020K sleep 0:00 0.14% python
10302 mailman 1 59 0 9548K 6940K sleep 0:00 0.13% python
10319 mailman 1 59 0 9516K 6884K sleep 0:00 0.15% python
10312 mailman 1 59 0 9508K 6860K sleep 0:00 0.12% python
10321 mailman 1 59 0 9500K 6852K sleep 0:00 0.14% python
10309 mailman 1 59 0 9500K 6852K sleep 0:00 0.13% python
10307 mailman 1 59 0 9500K 6852K sleep 0:00 0.13% python
10308 mailman 1 59 0 9500K 6852K sleep 0:00 0.12% python
10313 mailman 1 59 0 9500K 6852K sleep 0:00 0.12% python
After 8 hours:
last pid: 9878; load averages: 0.14, 0.12, 0.13    09:12:18
97 processes: 96 sleeping, 1 on cpu
CPU states: 97.2% idle, 1.2% user, 1.6% kernel, 0.0% iowait, 0.0% swap
Memory: 1640M real, 179M free, 2121M swap in use, 1100M swap free
PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
10123 mailman 1 59 0 314M 311M sleep 1:57 0.02% python
10131 mailman 1 59 0 310M 307M sleep 1:35 0.01% python
10124 mailman 1 59 0 309M 78M sleep 0:45 0.10% python
10134 mailman 1 59 0 307M 81M sleep 1:27 0.01% python
10125 mailman 1 59 0 307M 79M sleep 0:42 0.01% python
10133 mailman 1 59 0 44M 41M sleep 0:14 0.01% python
10122 mailman 1 59 0 34M 30M sleep 0:43 0.39% python
10127 mailman 1 59 0 31M 27M sleep 0:40 0.26% python
10130 mailman 1 59 0 30M 26M sleep 0:15 0.03% python
10129 mailman 1 59 0 28M 24M sleep 0:19 0.10% python
10126 mailman 1 59 0 28M 25M sleep 1:07 0.59% python
10132 mailman 1 59 0 27M 24M sleep 1:00 0.46% python
10128 mailman 1 59 0 27M 24M sleep 0:16 0.01% python
10151 mailman 1 59 0 9516K 3852K sleep 0:05 0.01% python
10150 mailman 1 59 0 9500K 3764K sleep 0:00 0.00% python
On 6/23/08 8:55 PM, "Fletcher Cocquyt" <fcocquyt at stanford.edu> wrote:
> Mark, many thanks for your (as always) very helpful response - I added the
> one-liner to mm_cfg.py to increase the IncomingRunner processes to 16.
> Now I am observing (via memory trend graphs) an acceleration of what looks
> like a memory leak, possibly from Python (currently at 2.4).
>
> I am compiling the latest 2.5.2 to see if that helps - for now the workaround
> is to restart mailman occasionally.
>
> (and yes, the SpamAssassin checks are the source of the 4-10 second delay; now
> those happen in parallel across 16 runners, so there are no spikes in the backlog...)
>
> Thanks again
>
>
> On 6/20/08 9:01 AM, "Mark Sapiro" <mark at msapiro.net> wrote:
>
>> Fletcher Cocquyt wrote:
>>
>>> Hi, I am observing periods of qfiles/in backlogs in the 400-600 message
>>> count range that take 1-2 hours to clear with the standard Mailman 2.1.9 +
>>> SpamAssassin (the vette log shows these messages process in an average of
>>> ~10 seconds each)
>>
>>
>> Is SpamAssassin invoked from Mailman or from the MTA before Mailman? If
>> this is plain Mailman, 10 seconds is an extremely long time to process a
>> single post through IncomingRunner.
>>
>> If you have some SpamAssassin interface like
>> <http://sourceforge.net/tracker/index.php?func=detail&aid=640518&group_id=103&atid=300103>
>> that calls spamd from a Mailman handler, you might consider moving
>> SpamAssassin ahead of Mailman and using something like
>> <http://sourceforge.net/tracker/index.php?func=detail&aid=840426&group_id=103&atid=300103>
>> or just header_filter_rules instead.
>>
>>
>>> Is there an easy way to parallelize what looks like a single serialized
>>> Mailman queue?
>>> I see some posts re: multi-slice but nothing definitive
>>
>>
>> See the section of Defaults.py headed with
>>
>> #####
>> # Qrunner defaults
>> #####
>>
>> In order to run multiple, parallel IncomingRunner processes, you can
>> either copy the entire QRUNNERS definition from Defaults.py to
>> mm_cfg.py
>> and change
>>
>> ('IncomingRunner', 1), # posts from the outside world
>>
>> to
>>
>> ('IncomingRunner', 4), # posts from the outside world
>>
>>
>> which says run 4 IncomingRunner processes, or you can just add
>> something like
>>
>> QRUNNERS[QRUNNERS.index(('IncomingRunner',1))] = ('IncomingRunner',4)
>>
>> to mm_cfg.py. You can use any power of two for the number.
>>
>>
>>> I would also like the option of working this into an overall load-balancing
>>> scheme where I have multiple SMTP nodes behind an F5 load balancer and the
>>> nodes share an NFS backend...
>>
>>
>> The following search will return some information.
>>
>>
>> <http://www.google.com/search?q=site%3Amail.python.org++inurl%3Amailman++%22load+balancing%22>
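For the archives: the mm_cfg.py change, written out in the copy-the-whole-definition
form Mark describes above (assuming the stock 2.1.9 runner list from Defaults.py) and
scaled to the 8 IncomingRunner processes I am currently running, would look roughly
like this:

QRUNNERS = [
    ('ArchRunner',     1), # messages for the archiver
    ('BounceRunner',   1), # for processing the qfiles/bounces directory
    ('CommandRunner',  1), # commands and bounces from the outside world
    ('IncomingRunner', 8), # posts from the outside world
    ('NewsRunner',     1), # outgoing messages to the nntpd
    ('OutgoingRunner', 1), # outgoing messages to the smtpd
    ('VirginRunner',   1), # internally crafted (virgin birth) messages
    ('RetryRunner',    1), # retry temporarily failed deliveries
    ]

The one-liner form Mark shows is equivalent; either way the IncomingRunner count
has to be a power of two.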
--
Fletcher Cocquyt
Senior Systems Administrator
Information Resources and Technology (IRT)
Stanford University School of Medicine
Email: fcocquyt at stanford.edu
Phone: (650) 724-7485