[ mailman-Bugs-1168999 ] OutgoingRunner gets in expensive recursive loop

Bugs item #1168999, was opened at 2005-03-23 18:30
Message generated for change (Comment added) made by kjd
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=100103&aid=1168999...

Category: mail delivery
Group: 2.1 beta
Status: Open
Resolution: None
Priority: 5
Submitted By: Kim Davies (kjd)
Assigned to: Nobody/Anonymous (nobody)
Summary: OutgoingRunner gets in expensive recursive loop
Initial Comment: I have had a problem spring up over the past few weeks where the OutgoingRunner gets into a loop which effectively brings down the machine by spiking the CPU to 99%. Running "strace" on the process, I see it constantly deleting and recreating the same queue file in qfiles/out/ over and over, many times per second.
The initial problem occurred with a 2.1.2 install, but installing 2.1.6b4 shows the same behaviour.
Unfortunately the problem is somewhat ephemeral when trying to diagnose it - if I manage to kill the OutgoingRunner between a read and a write, the queue file gets lost and the problem disappears for a while.
I don't know if it is useful, but attached is the strace output of a complete read/write cycle. I haven't had the opportunity to debug it further (by stepping through the Python) as the system is not currently in this state. I am not sure how long it will be until it is triggered again, but it has happened about 4 times in the past two weeks. It had never occurred before this in over 3 years.
I consider this issue fairly problematic - the machine becomes unusable when it reaches this state due to CPU exhaustion.
Any tips to help isolate the problem are welcome. I have modified mailmanctl to run all queue runners with a verbose flag, so next time there may be useful information logged.
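One way to catch the loop without strace would be to poll the queue directory and count how often a file name reappears after being absent. This is only a minimal sketch, not part of Mailman: the function names, the threshold and the polling interval are all hypothetical, and it assumes the requeued file keeps the same name. Polling can also miss cycles that complete between two samples.

```python
import os
import time
from collections import Counter


def count_reappearances(snapshots):
    """Count, per file name, how many times it shows up after being absent.

    `snapshots` is any iterable of directory listings (iterables of names).
    A file already present in the first snapshot counts as one appearance.
    """
    appearances = Counter()
    previous = set()
    for listing in snapshots:
        current = set(listing)
        for name in current - previous:
            appearances[name] += 1
        previous = current
    return appearances


def poll_directory(directory, samples=200, interval=0.01):
    """Yield `samples` listings of `directory`, `interval` seconds apart."""
    for _ in range(samples):
        yield os.listdir(directory)
        time.sleep(interval)


def report_churn(directory, threshold=3, **poll_kwargs):
    """Return the names seen to reappear at least `threshold` times."""
    counts = count_reappearances(poll_directory(directory, **poll_kwargs))
    return {name: n for name, n in counts.items() if n >= threshold}
```

Calling report_churn() on the qfiles/out/ directory of the installation while the CPU is spiking would list the files that keep coming back, which could then be copied aside before the runner is killed.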
----------------------------------------------------------------------
Comment By: Kim Davies (kjd)
Date: 2005-04-26 15:30
Message: Logged In: YES user_id=168657
I have managed to capture a number of qfiles that are causing this phenomenon (which has been recurring more often over the past few weeks). They have the following properties:
- They are all "Post by non-member to a members-only list" responses to spam that has gone to a moderated list.
- They are addressed to non-existent domains.
Here is a sample dump of one of the .db files from the queue that is looping:
$ /usr/local/mailman/bin/dumpdb 1114500259.9807329+a1518ee474d8edb0e83615df60270885165d5f83.db
{   'deliver_after': 1114503867.0899999,
    'deliver_until': 1114932267.0899999,
    'lang': 'en',
    'last_recip_count': 1,
    'listname': 'ga',
    'nodecorate': 1,
    'original_sender': 'ga-bounces@lists.centr.org',
    'personalize': 1,
    'pipeline': [],
    'received_time': 1114500259.9807329,
    'recips': ['sidfks@sklfislxkd.com'],
    'reduced_list_headers': 1,
    'verp': 1,
    'version': 3}
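The three timestamps in the dump above are easier to reason about once decoded. The following is a minimal sketch that converts them to UTC and computes the gaps between them. It assumes the values are Unix epoch seconds and that the field names mean what they suggest (earliest and latest delivery attempt); the helper names are hypothetical and not part of Mailman.

```python
from datetime import datetime, timezone

# Values copied from the dumpdb output above.
metadata = {
    "received_time": 1114500259.9807329,
    "deliver_after": 1114503867.0899999,
    "deliver_until": 1114932267.0899999,
}


def as_utc(timestamp):
    """Interpret a value as Unix epoch seconds and return an aware datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def retry_gaps(meta):
    """Return (seconds from receipt to deliver_after,
    seconds from deliver_after to deliver_until)."""
    wait = meta["deliver_after"] - meta["received_time"]
    window = meta["deliver_until"] - meta["deliver_after"]
    return wait, window


if __name__ == "__main__":
    for key, value in metadata.items():
        print(f"{key}: {as_utc(value).isoformat()}")
    wait, window = retry_gaps(metadata)
    print(f"receipt to deliver_after: {wait / 60:.1f} minutes")
    print(f"deliver_after to deliver_until: {window / 3600:.1f} hours")
```

For this entry deliver_after falls about one hour after received_time, and deliver_until is 119 hours after deliver_after. Comparing deliver_after with the wall-clock time at which the file is being rewritten might show whether the entry is being requeued before it is due.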
----------------------------------------------------------------------