
Pmap shows it's the heap:
god@irt-smtp-02:in 8:08pm 64 # pmap 24167
24167:  /bin/python /opt/mailman-2.1.9/bin/qrunner --runner=IncomingRunner:5:8
08038000      64K rwx--    [ stack ]
08050000     940K r-x--  /usr/local/stow/Python-2.5.2/bin/python
0814A000     172K rwx--  /usr/local/stow/Python-2.5.2/bin/python
08175000  312388K rwx--    [ heap ]
CF210000      64K rwx--    [ anon ]
<-- many small libs -->
total     318300K
Whether it's a leak or not, we need to understand why the heap is growing and put a limit on its growth to avoid exhausting memory and swapping into oblivion...
None of the lists seem too big:

god@irt-smtp-02:lists 8:24pm 73 # du -sk */*pck | sort -nr | head | awk '{print $1}'
1392
1240
1152
1096
912
720
464
168
136
112
Researching python heap allocation...
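In case it helps the digging, here is a rough stdlib-only sketch (nothing Mailman-specific; the function name and log path are made up) that could be called periodically from inside a runner loop to log live-object counts by type. Diffing successive snapshots should show whether some object type keeps accumulating:

    import gc

    def dump_object_counts(logfile='/tmp/qrunner-objects.log', top=20):
        # Count live objects by type name and append the top offenders to a log.
        counts = {}
        for obj in gc.get_objects():
            name = type(obj).__name__
            counts[name] = counts.get(name, 0) + 1
        ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
        fp = open(logfile, 'a')
        try:
            for name, n in ranked[:top]:
                fp.write('%s %d\n' % (name, n))
            fp.write('---\n')
        finally:
            fp.close()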
thanks
On 7/1/08 6:14 PM, "Mark Sapiro" <mark@msapiro.net> wrote:
Fletcher Cocquyt wrote:
Here is the current leaked state since the cron 13:27 restart only 3 hours ago:

last pid: 20867;  load averages: 0.53, 0.47, 0.24    16:04:15
91 processes: 90 sleeping, 1 on cpu
CPU states: 99.1% idle, 0.3% user, 0.6% kernel, 0.0% iowait, 0.0% swap
Memory: 1640M real, 77M free, 1509M swap in use, 1699M swap free
   PID USERNAME LWP PRI NICE   SIZE    RES STATE   TIME    CPU COMMAND
 24167 mailman    1  59    0   311M   309M sleep   0:28  0.02% python
 24158 mailman    1  59    0   308M   305M sleep   0:30  0.01% python
 24169 mailman    1  59    0   303M   301M sleep   0:28  0.01% python
 24165 mailman    1  59    0    29M    27M sleep   0:09  0.03% python
 24161 mailman    1  59    0    29M    27M sleep   0:12  0.07% python
 24164 mailman    1  59    0    28M    26M sleep   0:07  0.01% python
 24172 mailman    1  59    0    26M    24M sleep   0:04  0.01% python
 24160 mailman    1  59    0    26M    24M sleep   0:08  0.01% python
 24162 mailman    1  59    0    26M    23M sleep   0:10  0.01% python
 24166 mailman    1  59    0    26M    23M sleep   0:04  0.01% python
 24171 mailman    1  59    0    25M    23M sleep   0:04  0.02% python
 24163 mailman    1  59    0    24M    22M sleep   0:04  0.01% python
 24168 mailman    1  59    0    19M    17M sleep   0:03  0.02% python
 24170 mailman    1  59    0  9516K  6884K sleep   0:01  0.01% python
 24159 mailman    1  59    0  9500K  6852K sleep   0:00  0.00% python
And the mapping to the runners:

god@irt-smtp-02:mailman-2.1.11 4:16pm 66 # /usr/ucb/ps auxw | egrep mailman | awk '{print $2 " " $11}'
24167 --runner=IncomingRunner:5:8
24165 --runner=BounceRunner:0:1
24158 --runner=IncomingRunner:7:8
24162 --runner=VirginRunner:0:1
24163 --runner=IncomingRunner:1:8
24166 --runner=IncomingRunner:0:8
24168 --runner=IncomingRunner:4:8
24169 --runner=IncomingRunner:2:8
24171 --runner=IncomingRunner:6:8
24172 --runner=IncomingRunner:3:8
24160 --runner=CommandRunner:0:1
24161 --runner=OutgoingRunner:0:1
24164 --runner=ArchRunner:0:1
24170 /bin/python
24159 /bin/python
What are these last 2? Presumably they are the missing NewsRunner and RetryRunner, but what is the extra stuff in the ps output causing $11 to be the python command and not the runner option? And again, why are these two, which presumably have done nothing, seemingly the biggest?
Here are some additional thoughts.
Are you sure there is an actual leak? Do you know that if you just let them run, they don't reach some stable size and remain there, as opposed to growing so large that they eventually throw a MemoryError exception and get restarted by mailmanctl?
If you allowed them to do that once, the MemoryError traceback might provide a clue.
Caveat! I know very little about Python's memory management. Some of what follows may be wrong.
Here's what I think: Python allocates more memory (from the OS) as needed to import additional modules and create new objects. Imports don't go away, but objects that are destroyed or become unreachable (e.g., a file object that is closed, or a message object whose only reference gets assigned to something else) become candidates for garbage collection, and ultimately the memory allocated to them is collected and reused (assuming no leaks). I *think*, however, that no memory is ever actually freed back to the OS. Thus, Python processes that run for a long time can grow, but don't shrink.
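(A quick way to test that hypothesis with the same interpreter build, purely as an illustrative sketch: allocate roughly 100MB of string objects, drop them, and compare the process RSS before and after. Reading RSS via ps is an assumption here; adjust the invocation if your ps reports it differently.)

    import os
    import gc

    def rss_kb():
        # Resident set size of this process in KB, as reported by ps.
        return int(os.popen('ps -o rss= -p %d' % os.getpid()).read())

    print 'baseline:                 %d KB' % rss_kb()
    blob = ['x' * 1024 for i in xrange(100 * 1024)]   # roughly 100MB of 1KB strings
    print 'after allocation:         %d KB' % rss_kb()
    del blob
    gc.collect()
    print 'after del + gc.collect(): %d KB' % rss_kb()

If the last number stays near the peak, freed memory is being reused inside the process but not handed back to the OS, which would match the pmap picture above.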
Now, IncomingRunner in particular can get very large if large messages are arriving, even if those messages are ultimately not processed very far. IncomingRunner reads the entire message into memory and then parses it into a message object, which is even bigger than the message string. So, if someone happens to send a 100MB attachment to a list, IncomingRunner is going to need over 200MB before it ever looks at the message itself. This memory will later become available for other use within that IncomingRunner instance, but I don't think it is ever freed back to the OS.
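(The same ps-based RSS check can illustrate the doubling: build a synthetic message with a large body and parse it, so that the raw string and the parsed Message object are both alive at once. The addresses and the 100MB size below are made up.)

    import os
    import email

    def rss_kb():
        return int(os.popen('ps -o rss= -p %d' % os.getpid()).read())

    raw = ('From: a@example.com\nTo: list@example.com\nSubject: big\n\n'
           + 'A' * (100 * 1024 * 1024))    # stand-in for a 100MB attachment
    print 'raw string held:         %d KB' % rss_kb()
    msg = email.message_from_string(raw)   # string and Message both in memory now
    print 'string + parsed Message: %d KB' % rss_kb()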
Also, I see very little memory change between the 3-hour-old snapshot above and the 8-hour-old one from your prior post. If this were really a memory leak, I'd expect the 8-hour-old ones to be perhaps twice as big as the 3-hour-old ones.
Also, do you have any really big lists with big config.pck files? If so, Runners will grow as they instantiate that (those) big list(s).
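(And a crude way to see what one big list costs once loaded, roughly what a runner pays the first time it opens that list; the path below is an example, not a real list.)

    import os
    import cPickle

    def rss_kb():
        return int(os.popen('ps -o rss= -p %d' % os.getpid()).read())

    before = rss_kb()
    fp = open('/opt/mailman-2.1.9/lists/somelist/config.pck', 'rb')
    try:
        state = cPickle.load(fp)    # the list's saved configuration dictionary
    finally:
        fp.close()
    print 'config.pck costs roughly %d KB in memory' % (rss_kb() - before)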
--
Fletcher Cocquyt
Senior Systems Administrator
Information Resources and Technology (IRT)
Stanford University School of Medicine

Email: fcocquyt@stanford.edu
Phone: (650) 724-7485