Manipulate mailman in / out queue (from mailman-users)

[Redirecting part of this discussion from mailman-users to mailman-developers. Mark does a great job of explaining the gory details of the queue runners.]
On Oct 16, 2012, at 09:23 PM, Mark Sapiro wrote:
I'd like to begin to explore ways to make this more manageable automatically, so that when problems occur (e.g. an upstream SMTPd not responding or responding slowly), Mailman can recognize and respond to the problem more efficiently.
The first question to ask is whether this mostly affects the outgoing queue, or whether other queues *in practice* can suffer the same overloading. I suspect that the incoming queue could fill quickly if Mailman is slow to handle an onslaught of new messages. My sense, though, is that in general it's the outgoing queue that gives people the most problems.
The switchboard in Mailman 3 is largely the same as in Mailman 2 (modulo updating the code to more modern Pythons).
Before describing my own thoughts, I'd like to open it up and get a sense of your experiences, and strategies you think might help manage big queues.
Cheers, -Barry

Hello Barry,
I have seen both the in and out queues get overloaded in our environment. We even have a Nagios monitoring plugin that monitors Mailman's in and out queue sizes and sends alerts when they reach a threshold.
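[For illustration, here is a minimal sketch of the kind of check such a plugin might perform; it simply counts queued entries and maps the result onto Nagios exit codes. The qfiles path, the queue names, and the thresholds are assumptions about a typical Mailman 2 layout, not the plugin described above.]

```python
#!/usr/bin/env python
"""Nagios-style check of Mailman queue sizes (a minimal sketch).

The qfiles path and thresholds below are assumptions; adjust them for
your installation.  Each queued entry in Mailman 2 is a single .pck
file, so counting files approximates the queue length.
"""
import glob
import os
import sys

QFILES = '/var/lib/mailman/qfiles'   # assumed Mailman 2 queue directory
WARN, CRIT = 200, 1000               # example thresholds


def queue_size(name):
    """Return the number of queued entries in the named queue."""
    return len(glob.glob(os.path.join(QFILES, name, '*.pck')))


def main():
    sizes = dict((name, queue_size(name)) for name in ('in', 'out'))
    report = ', '.join('%s=%d' % (n, s) for n, s in sorted(sizes.items()))
    worst = max(sizes.values())
    if worst >= CRIT:
        print('CRITICAL - %s' % report)
        return 2                      # Nagios CRITICAL
    if worst >= WARN:
        print('WARNING - %s' % report)
        return 1                      # Nagios WARNING
    print('OK - %s' % report)
    return 0                          # Nagios OK


if __name__ == '__main__':
    sys.exit(main())
```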
If we get a mailbomb - for example, a list gets hijacked and spammed, or a list receives automated runaway alerts (sent by programs that have no alert interval) - a few thousand messages can be pumped into the 'in' queue in a very short time.
In situations like this, I find the message pattern, stop the Mailman service, and use a shell script to scan the 'in' queue and remove those messages. Sometimes we also need to block the sending machine on the incoming MTA to stop the messages from reaching Mailman.
It would be helpful to have a utility that reports the status of the in queue - who the top senders are, for example - and a tool to remove their messages.
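[As a rough sketch of what such a tool might look like, the script below tallies the top senders in the 'in' queue and can move entries whose sender matches a pattern into a holding directory rather than deleting them outright. It assumes Mailman 2's qfiles layout, where each entry is a .pck file whose first pickled object is the message; the paths, the install prefix added to sys.path, and the 'quarantine' directory are all assumptions.]

```python
#!/usr/bin/env python
"""Inspect Mailman's 'in' queue: tally top senders, optionally quarantine.

A minimal sketch, assuming Mailman 2's qfiles layout where each queued
entry is a .pck file holding a pickled message followed by a pickled
metadata dict.  The paths below, and the idea of a 'quarantine' holding
directory instead of outright deletion, are assumptions; unpickling full
message objects may also require Mailman's own modules on sys.path.
Run this only while the Mailman queue runners are stopped.
"""
import email
import glob
import os
import pickle
import shutil
import sys
from collections import Counter

sys.path.insert(0, '/usr/lib/mailman')              # assumed install prefix
IN_QUEUE = '/var/lib/mailman/qfiles/in'             # assumed queue directory
QUARANTINE = '/var/lib/mailman/qfiles/quarantine'   # hypothetical holding area


def sender_of(path):
    """Return the From: address of one queued entry, or None."""
    with open(path, 'rb') as fp:
        msg = pickle.load(fp)        # first object: message (or raw text)
    if isinstance(msg, bytes):
        msg = msg.decode('utf-8', 'replace')
    if isinstance(msg, str):
        msg = email.message_from_string(msg)
    try:
        return msg.get('from')
    except AttributeError:
        return None


def main(pattern=None):
    counts = Counter()
    for path in glob.glob(os.path.join(IN_QUEUE, '*.pck')):
        sender = sender_of(path)
        if not sender:
            continue
        counts[sender] += 1
        # Quarantine (rather than delete) entries whose sender matches.
        if pattern and pattern in sender:
            if not os.path.isdir(QUARANTINE):
                os.makedirs(QUARANTINE)
            shutil.move(path, QUARANTINE)
    # Report the ten most frequent senders still in the queue.
    for sender, n in counts.most_common(10):
        print('%6d  %s' % (n, sender))


if __name__ == '__main__':
    main(sys.argv[1] if len(sys.argv) > 1 else None)
```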
The outgoing queue piles up when the outgoing MTA has trouble accepting messages. In those cases there is nothing to remove from the queue, so manually managing the out queue's size becomes necessary. I think tools or automation that help quickly and reliably address the problem before it turns into a service outage would be a very important improvement!
Xueshan
On Wed, Oct 17, 2012 at 7:37 AM, Barry Warsaw <barry@list.org> wrote:
--
Xueshan Feng
Infrastructure Delivery Group, IT Services
Stanford University

participants (2)
- Barry Warsaw
- Xueshan Feng