can't use mailmanctl -s start
I don't know how to properly set up my system so that mailman restarts after a crash. I've just experienced one a few minutes ago, and here's the situation:
# /etc/init.d/mailman start
Starting list server: mailman.
The master qrunner lock could not be acquired. It appears as though there is a stale master qrunner lock. Try re-running mailmanctl with the -s flag.

# /home/mailman/bin/mailmanctl -s start
Starting Mailman's master qrunner.
Traceback (most recent call last):
  File "/home/mailman/bin/mailmanctl", line 492, in ?
    main()
  File "/home/mailman/bin/mailmanctl", line 364, in main
    lock._transfer_to(pid)
  File "/home/mailman/Mailman/LockFile.py", line 357, in _transfer_to
    os.link(self.__lockfile, self.__tmpfname)
OSError: [Errno 2] No such file or directory

# ps aux | grep mailman
mailman 924 0.0 0.3 4440 2968 pts/1 S 11:22 0:00 /usr/local/bin/python2.1 /home/mailman/bin/mailmanctl -s start
So the process has failed but is still running?
By the way, I've set things up so that:
in mm_cfg.py: PIDFILE = '/var/run/mailman/mailman.pid'
/var/run/mailman/ is emptied at startup (I've checked, it's empty now, and permissions seem to be correct: drwxr-xr-x 2 mailman mailman 4096 jan 9 11:15 /var/run/mailman/)
I've got a /etc/init.d/mailman script that does little more than call /home/mailman/bin/mailmanctl start
(Using the most recent CVS version, I prefer to post this to the -dev list)
-- Fil
"F" == Fil <fil@rezo.net> writes:
F> I don't know how to properly set up my system so that mailman
F> restarts after a crash. I've just experienced one a few minutes
F> ago, and here's the situation
This is a tricky bit of code, where it's trying to transfer ownership of a lock file from the parent to a child process. I thought the code was race condition free, but it's very possible I've overlooked something. I'll stare at the code and try to reproduce it.
| # /home/mailman/bin/mailmanctl -s start
| Starting Mailman's master qrunner.
| Traceback (most recent call last):
|   File "/home/mailman/bin/mailmanctl", line 492, in ?
|     main()
|   File "/home/mailman/bin/mailmanctl", line 364, in main
|     lock._transfer_to(pid)
|   File "/home/mailman/Mailman/LockFile.py", line 357, in _transfer_to
|     os.link(self.__lockfile, self.__tmpfname)
F> OSError: [Errno 2] No such file or directory
Note that it's not choking on the pid file, it's choking on the lock file that the parent is supposed to own a lock on.
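To illustrate the failure mode in the traceback: os.link() raises OSError with errno 2 (ENOENT) when its source path does not exist. The following is a minimal sketch in modern Python 3, not Mailman's LockFile.py; the file names and the try_link helper are hypothetical and exist only for this example.

```python
import errno
import os
import tempfile


def try_link(src, dst):
    """Hard-link src to dst. Return None on success, or the errno on failure."""
    try:
        os.link(src, dst)
    except OSError as exc:
        return exc.errno
    return None


with tempfile.TemporaryDirectory() as lockdir:
    # Hypothetical names; the real lock file layout is not shown in this thread.
    lockfile = os.path.join(lockdir, "master-qrunner")
    tmpfname = os.path.join(lockdir, "master-qrunner.tmp")

    # The lock file does not exist yet, so linking from it fails with ENOENT.
    missing_result = try_link(lockfile, tmpfname)

    # Once the lock file exists, the same call succeeds.
    with open(lockfile, "w") as fp:
        fp.write("1234")
    present_result = try_link(lockfile, tmpfname)

print("missing lock file:", errno.errorcode.get(missing_result))
print("existing lock file:", present_result)
```

This matches the point above: the error says the link source (the lock file) was gone at the moment of the call, which is independent of where PIDFILE points.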
F> So the process has failed but is still running?
Probably the child is still running, but the parent threw the exception. The child ought to be blocked in the _take_possession() call, and there should be no locks/master-qrunner lockfile.
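One way to check whether a process such as the leftover child is really still alive is to send it signal 0, which performs the existence and permission checks without delivering a signal. This is a minimal POSIX-only sketch in Python 3; pid_is_running is a hypothetical helper, not part of Mailman.

```python
import os
import subprocess
import sys


def pid_is_running(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is delivered
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # process exists but belongs to another user
    return True


# Our own process certainly exists.
own_alive = pid_is_running(os.getpid())

# A child that has exited and been waited for should no longer exist.
child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()
reaped_alive = pid_is_running(child.pid)

print("self:", own_alive, "reaped child:", reaped_alive)
```

For the situation above, the pid to test would be the one shown by ps (924 in the original report). Note that pids can be reused, so a True result only says that some process currently has that pid.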
F> I've got a /etc/init.d/mailman script that just does not much
F> more than call /home/mailman/bin/mailmanctl start
Are you using misc/mailman as your init.d script? (I don't think it enters into the picture here).
F> (Using the most recent CVS version, I prefer to post this to
F> the -dev list)
The right thing to do!
I'll investigate some more. -Barry
[Fil describes problems with the mailmanctl -s flag...]
This is fixed now in CVS I believe. There was a subtle (is there any other kind? :) race condition and a reference counting issue in the handoff of the lock from the parent to the child. Pesky bugger to track down, but I think I've got it now.
Thanks, -Barry
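The actual fix lives in CVS and is not shown in this thread. As a general illustration only of why such handoffs are delicate, here is one common way to change the recorded owner of a lock file without a window in which the file is missing: write the new owner to a temporary file in the same directory, then rename it over the lock file, which is atomic on POSIX. This is a Python 3 sketch under those assumptions; transfer_lock, read_owner, the file name, and the pid 4321 are all hypothetical and are not Mailman's API.

```python
import os
import tempfile


def read_owner(lockfile):
    """Return the pid recorded in the lock file."""
    with open(lockfile) as fp:
        return int(fp.read().strip())


def transfer_lock(lockfile, new_pid):
    """Atomically rewrite the owner recorded in lockfile to new_pid."""
    dirname = os.path.dirname(lockfile) or "."
    # The temp file must be on the same filesystem for the rename to be atomic.
    fd, tmpname = tempfile.mkstemp(dir=dirname, prefix=".transfer-")
    try:
        with os.fdopen(fd, "w") as fp:
            fp.write(str(new_pid))
        os.replace(tmpname, lockfile)  # readers see the old or the new owner
    except BaseException:
        if os.path.exists(tmpname):
            os.unlink(tmpname)
        raise


with tempfile.TemporaryDirectory() as lockdir:
    lockfile = os.path.join(lockdir, "master-qrunner")  # hypothetical name
    with open(lockfile, "w") as fp:
        fp.write(str(os.getpid()))      # the "parent" owns the lock

    owner_before = read_owner(lockfile)
    transfer_lock(lockfile, 4321)       # 4321 stands in for the child's pid
    owner_after = read_owner(lockfile)
    leftovers = sorted(os.listdir(lockdir))

print(owner_before, "->", owner_after, leftovers)
```

The design point is that the lock file path never disappears during the handoff, so a concurrent os.link() or open() on it cannot hit ENOENT partway through.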
Participants (2): barry@zope.com, Fil