METU E-List Admin wrote:
We are using Mailman 2.1.13 from Debian repositories (which is the latest version for squeeze). We are facing late mail delivery problems and after some research, we think we managed to overcome these problems by changing some configuration options.
I don't know what you did or what the issue was, but delivery delays in Mailman are usually caused by OutgoingRunner being backlogged and fixed by tuning the MTA to speed up acceptance of list mail.
Although Mailman seems to be running fine, we have observed an issue.
The issue is related to IncomingRunner. This qrunner process takes up approximately 1 GB of RAM and sometimes causes high load. We couldn't determine what causes the high RAM and CPU usage.
The question is: is this expected behavior for a Mailman process? If not, how can we determine the root cause of this problem?
I'd need more information to speak to high CPU usage, but there are a couple of factors that can cause runners to grow large.
The basic issue is with Python's memory management. If a Python process grows large because there is some object that requires a lot of memory, when that object is no longer referenced, the memory is freed within the Python process and is available for reuse, but unused memory is not freed from the process and returned to the OS. Thus, a Python process can grow to accommodate large objects, but even when that memory is freed within the process, the process doesn't shrink.
Thus, if IncomingRunner processes a very large message, it will grow large and remain so until restarted. This can happen with some spam, even though the spam message is ultimately discarded.
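This behavior is easy to observe directly. Here is a minimal sketch, assuming a Linux host (it reads VmRSS from /proc, which is Linux-specific and not part of Mailman itself): allocate a large object, drop the last reference, and compare the process's resident set size.

```python
import gc

def rss_kb():
    """Resident set size in kB (Linux-only; reads /proc/self/status)."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])

before = rss_kb()
# Simulate processing one very large message: build a big in-memory object.
big = ["x" * 100 for _ in range(500_000)]
grown = rss_kb()
del big           # the object is no longer referenced ...
gc.collect()      # ... and its memory is freed inside the interpreter,
after = rss_kb()  # but the process typically does not shrink back fully

print(before, grown, after)
```

On most systems you will see the process grow by tens of megabytes while the large object is alive; how much is returned to the OS afterward depends on the Python version and allocation pattern, which is exactly why a runner that once handled a huge message can stay large until restarted.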
Another issue exists in Mailman prior to 2.1.15 and can affect all runners.
In those versions, each runner kept an in-memory cache of list objects, and for various technical reasons cached list objects were never freed from the cache. Thus, runners would gradually grow until they held a copy of every list object. If you have lots of lists, or even a few lists with lots of members, runners could grow quite large for this reason.
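The pattern is an unbounded cache. The following is a schematic illustration of that pattern only, not Mailman's actual code (the class and names here are invented for the example): every list ever touched stays in the cache for the life of the process.

```python
class ListCache:
    """Illustration of an unbounded cache: entries are added but never evicted."""

    def __init__(self):
        self._cache = {}

    def get(self, name, loader):
        # Load on first access; after that, the object lives here forever,
        # so the process holds a copy of every list it has ever handled.
        if name not in self._cache:
            self._cache[name] = loader(name)
        return self._cache[name]

cache = ListCache()
for name in ["list-a", "list-b", "list-a"]:
    cache.get(name, lambda n: {"name": n, "members": []})

print(len(cache._cache))  # one entry per distinct list ever accessed
```

A long-running runner touching hundreds of lists under this scheme accumulates hundreds of list objects, which is why memory use tracks the number (and size) of lists served.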
Note that in both these scenarios, even though the process is large, if memory is scarce on the server, much of this memory will live only on backing store and will not have a significant impact on server RAM usage. See the FAQ at http://wiki.list.org/x/94A9.
As far as CPU is concerned, normally, OutgoingRunner uses the most (at least in a VERPed environment) and ArchRunner uses some and the others, not much.
Here's ps output showing CPU for a smallish installation that's been running for 11 days:
UID       PID   PPID  C STIME TTY      TIME     CMD
mailman 27691      1  0 Nov02 ?        00:00:00 /usr/bin/python2.6 /usr/local/mailman/bin/mailmanctl -s -q start
mailman 27692  27691  0 Nov02 ?        00:01:14 /usr/bin/python2.6 /usr/local/mailman/bin/qrunner --runner=ArchRunner:0:1 -s
mailman 27693  27691  0 Nov02 ?        00:00:03 /usr/bin/python2.6 /usr/local/mailman/bin/qrunner --runner=BounceRunner:0:1 -s
mailman 27694  27691  0 Nov02 ?        00:00:03 /usr/bin/python2.6 /usr/local/mailman/bin/qrunner --runner=CommandRunner:0:1 -s
mailman 27695  27691  0 Nov02 ?        00:00:07 /usr/bin/python2.6 /usr/local/mailman/bin/qrunner --runner=IncomingRunner:0:1 -s
mailman 27696  27691  0 Nov02 ?        00:00:02 /usr/bin/python2.6 /usr/local/mailman/bin/qrunner --runner=NewsRunner:0:1 -s
mailman 27697  27691  0 Nov02 ?        00:02:37 /usr/bin/python2.6 /usr/local/mailman/bin/qrunner --runner=OutgoingRunner:0:1 -s
mailman 27698  27691  0 Nov02 ?        00:00:05 /usr/bin/python2.6 /usr/local/mailman/bin/qrunner --runner=VirginRunner:0:1 -s
mailman 27699  27691  0 Nov02 ?        00:00:00 /usr/bin/python2.6 /usr/local/mailman/bin/qrunner --runner=RetryRunner:0:1 -s