[Mailman-Users] Python process size grows 30x in 8 hours (memory leak?)

Fletcher Cocquyt fcocquyt at stanford.edu
Wed Jul 2 01:20:42 CEST 2008

On 7/1/08 3:37 PM, "Mark Sapiro" <mark at msapiro.net> wrote:

> Fletcher Cocquyt wrote:
> 
>> Not finding a "leak" ref - save an irrelevant (for this runner issue) admindb
> 
> 
> Nothing has been done in Mailman to fix any memory leaks. As far as I
> know, nothing has been done to create any either.

OK, thanks for confirming that - I will not prioritize a Mailman
2.1.9 -> 2.1.11 upgrade.
 
> If there is a leak, it is most likely in the underlying Python and not
> a Mailman issue per se.

Agreed - hence my first priority was to upgrade from Python 2.4.x to 2.5.2
(the latest on python.org) - but the upgrade did not help this.
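
To check whether the growth really is uncollected Python objects (rather
than a C-level leak or heap fragmentation), a throwaway diff of gc's view
of the heap can be run inside a runner between message batches. Rough
sketch in Python 2 syntax for this era; object_census is just an
illustrative name, not Mailman code:

import gc

def object_census():
    # Tally live objects by type so two snapshots can be diffed;
    # a type whose count climbs without bound points at the leak.
    census = {}
    for obj in gc.get_objects():
        name = type(obj).__name__
        census[name] = census.get(name, 0) + 1
    return census

before = object_census()
# ... let the runner chew through a batch of messages ...
after = object_census()
growth = [(after[n] - before.get(n, 0), n) for n in after]
growth.sort()
for delta, name in growth[-10:]:
    print delta, name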
 
> I am curious. You say this problem was exacerbated when you went from
> one IncomingRunner to eight (sliced) IncomingRunners. The
> IncomingRunner instances themselves should be processing fewer
> messages each, and I would expect them to leak less. The other runners
> are doing the same as before so I would expect them to be the same
> unless by solving your 'in' queue backlog, you're just handling a
> whole lot more messages.
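(For context: sliced runners partition the queue by hash, so each
IncomingRunner:i:8 should see roughly 1/8 of the traffic. A minimal
sketch of the idea - this paraphrases Mailman 2.1's Switchboard from
memory rather than quoting it, and owned_by_slice is a made-up name:)

import hashlib   # Python 2.5; the older sha module works the same way

SHAMAX = long('f' * 40, 16)   # top of the 160-bit SHA-1 space

def owned_by_slice(filebase, nslice, numslices):
    # Mailman embeds a SHA digest in each queue entry's filename and
    # a slice only picks up entries whose digest lands in its band;
    # hashing the name here just illustrates the partitioning.
    digest = long(hashlib.sha1(filebase).hexdigest(), 16)
    lower = (SHAMAX + 1) * nslice / numslices
    upper = (SHAMAX + 1) * (nslice + 1) / numslices
    return lower <= digest < upper

If the hash is doing its job, each slice handles about an eighth of the
load, which makes a single slice ballooning to ~300M all the stranger.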
 
> Also, in an 8 hour period, I would expect that RetryRunner and
> CommandRunner and, unless you are doing a lot of mail -> news
> gatewaying, NewsRunner to have done virtually nothing.
> 
> In this snapshot
> 
>    PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
>  10123 mailman    1  59    0  314M  311M sleep    1:57  0.02% python
>  10131 mailman    1  59    0  310M  307M sleep    1:35  0.01% python
>  10124 mailman    1  59    0  309M   78M sleep    0:45  0.10% python
>  10134 mailman    1  59    0  307M   81M sleep    1:27  0.01% python
>  10125 mailman    1  59    0  307M   79M sleep    0:42  0.01% python
>  10133 mailman    1  59    0   44M   41M sleep    0:14  0.01% python
>  10122 mailman    1  59    0   34M   30M sleep    0:43  0.39% python
>  10127 mailman    1  59    0   31M   27M sleep    0:40  0.26% python
>  10130 mailman    1  59    0   30M   26M sleep    0:15  0.03% python
>  10129 mailman    1  59    0   28M   24M sleep    0:19  0.10% python
>  10126 mailman    1  59    0   28M   25M sleep    1:07  0.59% python
>  10132 mailman    1  59    0   27M   24M sleep    1:00  0.46% python
>  10128 mailman    1  59    0   27M   24M sleep    0:16  0.01% python
>  10151 mailman    1  59    0 9516K 3852K sleep    0:05  0.01% python
>  10150 mailman    1  59    0 9500K 3764K sleep    0:00  0.00% python
> 
> Which processes correspond to which runners? And why are the two
> processes that have apparently done the least the ones that have grown
> the most?
> 
> In fact, why are none of these 15 PIDs the same as the ones from 8
> hours earlier, or was that snapshot actually from after the above were
> restarted?
Yes, I snapshotted the current leaked state, then restarted and captured
the new PIDs to show the size difference.

Here is the current leaked state since the 13:27 cron restart, only 3
hours ago:
last pid: 20867;  load averages:  0.53,  0.47,  0.24    16:04:15
91 processes:  90 sleeping, 1 on cpu
CPU states: 99.1% idle,  0.3% user,  0.6% kernel,  0.0% iowait,  0.0% swap
Memory: 1640M real, 77M free, 1509M swap in use, 1699M swap free

   PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 24167 mailman    1  59    0  311M  309M sleep    0:28  0.02% python
 24158 mailman    1  59    0  308M  305M sleep    0:30  0.01% python
 24169 mailman    1  59    0  303M  301M sleep    0:28  0.01% python
 24165 mailman    1  59    0   29M   27M sleep    0:09  0.03% python
 24161 mailman    1  59    0   29M   27M sleep    0:12  0.07% python
 24164 mailman    1  59    0   28M   26M sleep    0:07  0.01% python
 24172 mailman    1  59    0   26M   24M sleep    0:04  0.01% python
 24160 mailman    1  59    0   26M   24M sleep    0:08  0.01% python
 24162 mailman    1  59    0   26M   23M sleep    0:10  0.01% python
 24166 mailman    1  59    0   26M   23M sleep    0:04  0.01% python
 24171 mailman    1  59    0   25M   23M sleep    0:04  0.02% python
 24163 mailman    1  59    0   24M   22M sleep    0:04  0.01% python
 24168 mailman    1  59    0   19M   17M sleep    0:03  0.02% python
 24170 mailman    1  59    0 9516K 6884K sleep    0:01  0.01% python
 24159 mailman    1  59    0 9500K 6852K sleep    0:00  0.00% python

And the mapping of PIDs to runners:
god at irt-smtp-02:mailman-2.1.11 4:16pm 66 # /usr/ucb/ps auxw | egrep mailman | awk '{print $2 " " $11}'
24167 --runner=IncomingRunner:5:8
24165 --runner=BounceRunner:0:1
24158 --runner=IncomingRunner:7:8
24162 --runner=VirginRunner:0:1
24163 --runner=IncomingRunner:1:8
24166 --runner=IncomingRunner:0:8
24168 --runner=IncomingRunner:4:8
24169 --runner=IncomingRunner:2:8
24171 --runner=IncomingRunner:6:8
24172 --runner=IncomingRunner:3:8
24160 --runner=CommandRunner:0:1
24161 --runner=OutgoingRunner:0:1
24164 --runner=ArchRunner:0:1
24170 /bin/python
24159 /bin/python
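
For completeness, the PID -> runner -> size join can be automated rather
than eyeballed across two listings. A throwaway Python sketch, assuming
the /usr/ucb/ps column order used above (USER PID %CPU %MEM SZ RSS ...):

import os, re

# Print PID, resident size and runner name on one line each, so top's
# big consumers can be read off by runner directly.
for line in os.popen('/usr/ucb/ps auxw').readlines():
    m = re.search(r'--runner=(\S+)', line)
    if m:
        fields = line.split()
        pid, rss = fields[1], fields[5]   # PID is col 2, RSS col 6
        print '%-6s %8sK %s' % (pid, rss, m.group(1))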

Thanks for the analysis,
Fletcher
