Mailman 2.1.23: Issues with mailmanctl restart
Hello,
I have recently seen issues with mailmanctl restart: from time to time there seems to be some kind of race condition between shutdown and startup. In the qrunner log it looks something like this: the runners are stopped but not started again.
Jun 18 02:20:21 2017 (30372) OutgoingRunner qrunner caught SIGTERM. Stopping.
Jun 18 02:20:21 2017 (30368) BounceRunner qrunner caught SIGTERM. Stopping.
Jun 18 02:20:21 2017 (30368) BounceRunner qrunner exiting.
Jun 18 02:20:21 2017 (30372) OutgoingRunner qrunner exiting.
Jun 18 02:20:21 2017 (30370) IncomingRunner qrunner caught SIGTERM. Stopping.
Jun 18 02:20:22 2017 (30370) IncomingRunner qrunner exiting.
Jun 18 02:20:21 2017 (30373) VirginRunner qrunner caught SIGTERM. Stopping.
Jun 18 02:20:22 2017 (30367) ArchRunner qrunner caught SIGTERM. Stopping.
Jun 18 02:20:22 2017 (30367) ArchRunner qrunner exiting.
Jun 18 02:20:22 2017 (30369) CommandRunner qrunner caught SIGTERM. Stopping.
Jun 18 02:20:22 2017 (30373) VirginRunner qrunner exiting.
Jun 18 02:20:22 2017 (30369) CommandRunner qrunner exiting.
Jun 18 02:20:22 2017 (30371) NewsRunner qrunner caught SIGTERM. Stopping.
Jun 18 02:20:24 2017 (30371) NewsRunner qrunner exiting.
Jun 18 02:20:22 2017 (30374) RetryRunner qrunner caught SIGTERM. Stopping.
Jun 18 02:20:25 2017 (30374) RetryRunner qrunner exiting.
(The restart is triggered by some logrotate magic, but it also happens from time to time when I run it directly.)
I cannot see any segfault or exception so far, and I wonder whether I am the only one seeing this kind of issue.
Regards, Frank
On 06/20/2017 07:12 AM, Frank Lanitz wrote:
> [...] there seems to be some kind of race condition between shutdown and startup.
> [...]
> Jun 18 02:20:21 2017 (30372) OutgoingRunner qrunner caught SIGTERM. Stopping.
> [...]
> Jun 18 02:20:25 2017 (30374) RetryRunner qrunner exiting.
The above looks like normal entries for a "stop" except for a missing "Master watcher caught SIGTERM. Exiting.". Are you saying that "restart" acts like "stop"?
> (The restart is triggered by some logrotate magic, but it also happens from time to time when I run it directly.)
The logrotate script should do "reopen", not "restart".
> I cannot see any segfault or exception so far, and I wonder whether I am the only one seeing this kind of issue.
I have not seen any reports of "restart" not restarting.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
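To illustrate the "reopen" suggestion in the reply above, a logrotate stanza along the following lines could be used. This is only a sketch: the log directory and the path to mailmanctl are assumptions that depend on the distribution and install prefix, and the remaining directives are ordinary logrotate options, not anything Mailman itself requires.

```
# Hypothetical paths; adjust to the local Mailman 2.1 installation.
/var/log/mailman/*.log {
    weekly
    missingok
    sharedscripts
    postrotate
        # Ask the running qrunners to reopen their log files
        # instead of stopping and starting them again.
        /usr/lib/mailman/bin/mailmanctl reopen >/dev/null 2>&1 || true
    endscript
}
```

With "reopen" the runners keep running across the rotation, so a failed startup after shutdown cannot occur at that point.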
On 06/20/2017 07:12 AM, Frank Lanitz wrote:
> [...] there seems to be some kind of race condition between shutdown and startup.
> [...]
> Jun 18 02:20:21 2017 (30372) OutgoingRunner qrunner caught SIGTERM. Stopping.
> [...]
Also, in addition to my prior reply, note that "restart" should be sending SIGINT to the runners, not SIGTERM.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
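The SIGINT/SIGTERM distinction mentioned above can be checked against a qrunner log with a small script. The following is a minimal sketch, not part of Mailman: the function names and the regular expression are made up for illustration, and the signal-to-command table assumes the behaviour described in this thread ("restart" sends SIGINT, "stop" sends SIGTERM) plus SIGHUP for "reopen".

```python
import re

# Matches entries such as:
#   Jun 18 02:20:21 2017 (30372) OutgoingRunner qrunner caught SIGTERM.
CAUGHT = re.compile(
    r"\((?P<pid>\d+)\) (?P<runner>\w+) qrunner caught (?P<sig>SIG[A-Z0-9]+)\."
)

# Assumed mapping from the signal a runner caught to the mailmanctl
# command that would normally cause it.
SIGNAL_TO_COMMAND = {
    "SIGTERM": "stop",
    "SIGINT": "restart",
    "SIGHUP": "reopen",
}


def signals_caught(log_lines):
    """Return {runner name: last signal it logged as caught}."""
    result = {}
    for line in log_lines:
        match = CAUGHT.search(line)
        if match:
            result[match.group("runner")] = match.group("sig")
    return result


def likely_commands(log_lines):
    """Return the set of mailmanctl commands consistent with the log."""
    return {
        SIGNAL_TO_COMMAND.get(sig, "unknown")
        for sig in signals_caught(log_lines).values()
    }
```

Run on the log excerpt from the original message, likely_commands() would return only "stop", which matches the observation that the entries look like a "stop" and not a "restart".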
Greetings.
Today I successfully upgraded the Mailman instance running on my server from Mailman 2.1.23 to Mailman 2.1.24, without any issues so far.
The only action needed was to run the command bin/check_perms -f twice in order to fix 226 ownership errors: "root" was replaced by "mailman".
Best regards.
Tom - SP2L
participants (3): Frank Lanitz, Mark Sapiro, SP2L