Unable to start mailman due to time travel
I have mailman-2.1.26 installed on my Gentoo 17.0 application server, which I just rebooted to update from kernel 4.15.15 to 4.16.13. After rebooting, mailman would not start, logging the following errors:
Jul 04 14:43:58 2018 mailmanctl(15743): Traceback (most recent call last): Jul 04 14:43:58 2018 mailmanctl(15743): File "bin/mailmanctl", line 556, in <module> Jul 04 14:43:58 2018 mailmanctl(15743): main() Jul 04 14:43:58 2018 mailmanctl(15743): File "bin/mailmanctl", line 391, in main Jul 04 14:43:58 2018 mailmanctl(15743): lock = acquire_lock(force) Jul 04 14:43:58 2018 mailmanctl(15743): File "bin/mailmanctl", line 215, in acquire_lock Jul 04 14:43:58 2018 mailmanctl(15743): lock = acquire_lock_1(force) Jul 04 14:43:58 2018 mailmanctl(15743): File "bin/mailmanctl", line 203, in acquire_lock_1 Jul 04 14:43:58 2018 mailmanctl(15743): if not force or qrunner_state(): Jul 04 14:43:58 2018 mailmanctl(15743): File "bin/mailmanctl", line 187, in qrunner_state Jul 04 14:43:58 2018 mailmanctl(15743): os.kill(pid, 0) Jul 04 14:43:58 2018 mailmanctl(15743): OSError : [Errno 1] Operation not permitted
This looks like it's failing to acquire a lock file. Investigating, I discovered this:
minbar:root:/var/lib/mailman/logs:18 # ls -l /var/lib/mailman/locks total 8 -rw-rw-r-- 2 mailman mailman 49 Jul 5 2018 master-qrunner -rw-rw-r-- 2 mailman mailman 49 Jul 5 2018 master-qrunner.minbar.4623
The server itself is pretty clear that today is July 4:
minbar:root:/var/lib/mailman/logs:23 # date Wed Jul 4 14:50:09 EDT 2018
After I deleted the errant lock files, mailman started up without hesitation ... and AGAIN created lock files timestamped in the future:
minbar:root:/var/lib/mailman/logs:22 # ls -l /var/lib/mailman/locks total 8 -rw-rw-r-- 2 mailman mailman 50 Jul 5 2018 master-qrunner -rw-rw-r-- 2 mailman mailman 50 Jul 5 2018 master-qrunner.minbar.15895
Does anyone have any idea what's going on here?
-- Phil Stracchino Babylon Communications phils@caerllewys.net phil@co.ordinate.org Landline: +1.603.293.8485 Mobile: +1.603.998.6958
On 7/4/18 11:52 AM, Phil Stracchino wrote:
I have mailman-2.1.26 installed on my Gentoo 17.0 application server, which I just rebooted to update from kernel 4.15.15 to 4.16.13. After rebooting, mailman would not start, logging the following errors:
Jul 04 14:43:58 2018 mailmanctl(15743): Traceback (most recent call last): Jul 04 14:43:58 2018 mailmanctl(15743): File "bin/mailmanctl", line 556, in <module> Jul 04 14:43:58 2018 mailmanctl(15743): main() Jul 04 14:43:58 2018 mailmanctl(15743): File "bin/mailmanctl", line 391, in main Jul 04 14:43:58 2018 mailmanctl(15743): lock = acquire_lock(force) Jul 04 14:43:58 2018 mailmanctl(15743): File "bin/mailmanctl", line 215, in acquire_lock Jul 04 14:43:58 2018 mailmanctl(15743): lock = acquire_lock_1(force) Jul 04 14:43:58 2018 mailmanctl(15743): File "bin/mailmanctl", line 203, in acquire_lock_1 Jul 04 14:43:58 2018 mailmanctl(15743): if not force or qrunner_state(): Jul 04 14:43:58 2018 mailmanctl(15743): File "bin/mailmanctl", line 187, in qrunner_state Jul 04 14:43:58 2018 mailmanctl(15743): os.kill(pid, 0) Jul 04 14:43:58 2018 mailmanctl(15743): OSError : [Errno 1] Operation not permitted
This looks like it's failing to acquire a lock file. Investigating, I discovered this:
There was probably a stale lock due to Mailman not being cleanly terminated upon reboot. The Operation not permitted is an attempt to determine the status of the PID that created the lock by sending it a signal 0. The error may be a Gentoo thing or 'mailmanctl start' may need to run as root if it wasn't.
minbar:root:/var/lib/mailman/logs:18 # ls -l /var/lib/mailman/locks total 8 -rw-rw-r-- 2 mailman mailman 49 Jul 5 2018 master-qrunner -rw-rw-r-- 2 mailman mailman 49 Jul 5 2018 master-qrunner.minbar.4623
The server itself is pretty clear that today is July 4:
minbar:root:/var/lib/mailman/logs:23 # date Wed Jul 4 14:50:09 EDT 2018
After I deleted the errant lock files, mailman started up without hesitation ... and AGAIN created lock files timestamped in the future:
minbar:root:/var/lib/mailman/logs:22 # ls -l /var/lib/mailman/locks total 8 -rw-rw-r-- 2 mailman mailman 50 Jul 5 2018 master-qrunner -rw-rw-r-- 2 mailman mailman 50 Jul 5 2018 master-qrunner.minbar.15895
This is all normal and expected. The time stamp on locks is set to that future time when the lock expires.
-- Mark Sapiro mark@msapiro.net The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On 07/04/18 15:10, Mark Sapiro wrote:
On 7/4/18 11:52 AM, Phil Stracchino wrote:
This looks like it's failing to acquire a lock file. Investigating, I discovered this:
There was probably a stale lock due to Mailman not being cleanly terminated upon reboot. The Operation not permitted is an attempt to determine the status of the PID that created the lock by sending it a signal 0. The error may be a Gentoo thing or 'mailmanctl start' may need to run as root if it wasn't.
Yup, actually I'm pretty sure I know how the stale lock file happened ... I had to interrupt boot a couple of times because this is a Dell R610 server, and it has an annoying habit of disbelieving in my KVM switch during boot. It can be a bit of a headache getting it to acknowledge that it actually has a keyboard connected in time to select the kernel I want i grub.
After I deleted the errant lock files, mailman started up without hesitation ... and AGAIN created lock files timestamped in the future:
minbar:root:/var/lib/mailman/logs:22 # ls -l /var/lib/mailman/locks total 8 -rw-rw-r-- 2 mailman mailman 50 Jul 5 2018 master-qrunner -rw-rw-r-- 2 mailman mailman 50 Jul 5 2018 master-qrunner.minbar.15895
This is all normal and expected. The time stamp on locks is set to that future time when the lock expires.
AH! I did not know that. Mystery resolved. Thank you. :)
-- Phil Stracchino Babylon Communications phils@caerllewys.net phil@co.ordinate.org Landline: +1.603.293.8485 Mobile: +1.603.998.6958
participants (2)
-
Mark Sapiro
-
Phil Stracchino