
On 10/31/2016 10:15 AM, Dennis Putnam wrote:
Hi Mark,
Thanks for the reply.
You are correct, there is nothing in the error log and this is all that is in the qrunner log at that time:
Oct 20 06:43:20 2016 (3826) Master qrunner detected subprocess exit (pid: 3842, sig: 6, sts: None, class: VirginRunner, slice: 1/1) [restarting]
Oct 20 06:43:21 2016 (30064) VirginRunner qrunner started.
This can ultimately become more serious. If VirginRunner continues to die at times, after the 10th time, it won't automatically restart and you will need to manually restart Mailman.
For that particular day mailman was idle so there was no mailman log. There was no list activity. That would seem to support your suggestion that this is not a mailman problem but rather mailman is a victim of some OS or external issue. Odd, however, that it is always mailman that is affected and yes, the dumps seem to be similar, although I did not try to do a diff on any of them (that probably wouldn't mean much anyway).
In that case, all VirginRunner was doing was waking up, checking its queue, finding it empty and going back to sleep, which it and most of the other runners except RetryRunner do every QRUNNER_SLEEP_TIME (1 second default).
Since all the runners are basically doing the same loop, this may not always affect VirginRunner.
The bottom line is if you can't figure out why this is happening and stop it (and I don't think I can help much here), you'll have to monitor to make sure no runner has been left dead because it hit the restart limit, and restart Mailman if necessary, or maybe restart it periodically whether needed or not.
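As a rough illustration of that kind of monitoring, here is a minimal Python sketch that counts the "subprocess exit" notices per runner class in a qrunner log. It assumes the log lines look like the excerpt quoted above. The function names, the regular expression and the default threshold of 10 (taken from the limit mentioned above) are my own choices for illustration and not anything Mailman provides. It also counts over the whole file it is given, so it will over-count if the log spans more than one Mailman start.

```python
import re
import sys
from collections import Counter

# Matches the restart notice in the format shown in the log excerpt above.
# This pattern is an assumption based on that single example line.
EXIT_RE = re.compile(
    r"Master qrunner detected subprocess exit\s*"
    r"\(pid: (?P<pid>\d+), sig: (?P<sig>\w+), sts: (?P<sts>\w+), "
    r"class: (?P<cls>\w+), slice: (?P<slice>\d+/\d+)\)"
)


def count_runner_exits(log_text):
    """Return a Counter mapping runner class name to number of logged exits."""
    return Counter(m.group("cls") for m in EXIT_RE.finditer(log_text))


def runners_at_limit(log_text, limit=10):
    """Return a sorted list of runner classes with at least `limit` logged exits."""
    counts = count_runner_exits(log_text)
    return sorted(cls for cls, n in counts.items() if n >= limit)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_exits.py /path/to/qrunner-log
    with open(sys.argv[1]) as f:
        text = f.read()
    for cls, n in sorted(count_runner_exits(text).items()):
        print(f"{cls}: {n} exit(s) logged")
    for cls in runners_at_limit(text):
        print(f"WARNING: {cls} may no longer be restarted automatically")
```

Run from cron with the path to your qrunner log as the only argument, something like this could mail you a warning before a runner stays dead.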
-- 
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan