[Bug 1082308] [NEW] The qrunner-master lock file causes issues when running clustered
David Westlund
davidw at axis.com
Fri Nov 23 11:28:09 CET 2012
Public bug reported:
Hi
It is possible to run mailman in a failover or load balancing cluster, see:
http://wiki.list.org/pages/viewpage.action?pageId=4030621
When running a cluster, it is crucial to use:
* a shared directory for archive data
* a shared directory for locks
* separate queue (qfiles) and log directories for each host
This can be implemented by setting the directories in mm_cfg.py, for example like this (where <shared dir> is the shared directory and <host> is the node's host name):
VAR_PREFIX = '<shared dir>'
LIST_DATA_DIR = os.path.join(VAR_PREFIX, 'lists')
LOCK_DIR = os.path.join(VAR_PREFIX, 'locks')
DATA_DIR = os.path.join(VAR_PREFIX, 'data')
SPAM_DIR = os.path.join(VAR_PREFIX, 'spam')
LOG_DIR = os.path.join(VAR_PREFIX, 'logs-<host>')
PUBLIC_ARCHIVE_FILE_DIR = os.path.join(VAR_PREFIX, 'archives', 'public')
PRIVATE_ARCHIVE_FILE_DIR = os.path.join(VAR_PREFIX, 'archives', 'private')
# For qfiles and logs, <dir>-<host> is used to avoid conflicts
QUEUE_DIR = os.path.join(VAR_PREFIX, 'qfiles-<host>')
INQUEUE_DIR = os.path.join(QUEUE_DIR, 'in')
OUTQUEUE_DIR = os.path.join(QUEUE_DIR, 'out')
CMDQUEUE_DIR = os.path.join(QUEUE_DIR, 'commands')
BOUNCEQUEUE_DIR = os.path.join(QUEUE_DIR, 'bounces')
NEWSQUEUE_DIR = os.path.join(QUEUE_DIR, 'news')
ARCHQUEUE_DIR = os.path.join(QUEUE_DIR, 'archive')
SHUNTQUEUE_DIR = os.path.join(QUEUE_DIR, 'shunt')
VIRGINQUEUE_DIR = os.path.join(QUEUE_DIR, 'virgin')
BADQUEUE_DIR = os.path.join(QUEUE_DIR, 'bad')
RETRYQUEUE_DIR = os.path.join(QUEUE_DIR, 'retry')
MAILDIR_DIR = os.path.join(QUEUE_DIR, 'maildir')
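Rather than editing <host> by hand on every node, the per-host paths could be derived when the config is loaded. This is only a minimal sketch: the helper name per_host_dirs is hypothetical (it is not part of Mailman), and it assumes that the short host name returned by socket.gethostname() is unique within the cluster.

```python
import os
import socket


def per_host_dirs(var_prefix, host=None):
    """Return per-host log and queue directories under var_prefix.

    Hypothetical helper for illustration; not part of Mailman.
    """
    if host is None:
        # Assumption: the short host name is unique within the cluster.
        host = socket.gethostname().split('.')[0]
    queue_dir = os.path.join(var_prefix, 'qfiles-' + host)
    return {
        'LOG_DIR': os.path.join(var_prefix, 'logs-' + host),
        'QUEUE_DIR': queue_dir,
        'INQUEUE_DIR': os.path.join(queue_dir, 'in'),
        'OUTQUEUE_DIR': os.path.join(queue_dir, 'out'),
    }
```

The remaining queue directories (commands, bounces, and so on) would be built from QUEUE_DIR in the same way.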
Unfortunately, the master-qrunner lock causes problems with this setup.
mailmanctl -s starts even if a master-qrunner lock file exists (provided
that no mailmanctl is running on the host), which makes it possible to
get the service up and running on more than one host. Once a day,
however, mailmanctl checks the lock. If it does not hold it, it shuts
down. If you are running a cluster, at least one of the nodes will not
hold the lock, and the service will be shut down on that node.
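The failure can be pictured with a toy model. This is not the mailmanctl code, only a sketch of the behaviour described above: the SharedLock class and daily_check function are hypothetical, and the real lock is a file in LOCK_DIR rather than an in-memory object.

```python
class SharedLock(object):
    """Toy stand-in for the single lock file in a shared LOCK_DIR."""

    def __init__(self):
        self.owner = None

    def acquire(self, host):
        # Only the first host to ask becomes the owner.
        if self.owner is None:
            self.owner = host
            return True
        return False


def daily_check(lock, host):
    """Return True if the host keeps running, False if it shuts down."""
    return lock.owner == host


lock = SharedLock()
lock.acquire('node1')
lock.acquire('node2')  # fails, but node2 was started anyway (as with -s)

# After the daily check only the lock owner is still running.
survivors = [h for h in ('node1', 'node2') if daily_check(lock, h)]
```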
To solve this, I propose that the LOCKFILE name in mailmanctl become configurable, so instead of having:
LOCKFILE = os.path.join(mm_cfg.LOCK_DIR, 'master-qrunner')
Have:
LOCKFILE = os.path.join(mm_cfg.LOCK_DIR, mm_cfg.QRUNNER_LOCK_FILE)
Then add QRUNNER_LOCK_FILE = 'master-qrunner' in Defaults.py.
This would make it easy to have individual qrunner master lock files for
each node in a cluster.
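A sketch of how the proposal might look in practice follows. QRUNNER_LOCK_FILE is the setting proposed in this report and does not exist in Mailman today; the two helper functions and the 'master-qrunner-<host>' naming scheme are assumptions made for illustration.

```python
import os
import socket

# Proposed default, which would live in Defaults.py (does not exist yet).
DEFAULT_QRUNNER_LOCK_FILE = 'master-qrunner'


def master_lock_path(lock_dir, lock_file=DEFAULT_QRUNNER_LOCK_FILE):
    """Build the lock path as mailmanctl would under the proposal."""
    return os.path.join(lock_dir, lock_file)


def per_host_lock_file(host=None):
    """Return a lock file name unique to this node (hypothetical scheme)."""
    if host is None:
        # Assumption: the short host name is unique within the cluster.
        host = socket.gethostname().split('.')[0]
    return DEFAULT_QRUNNER_LOCK_FILE + '-' + host
```

Each node's mm_cfg.py could then set QRUNNER_LOCK_FILE = per_host_lock_file(), so that every node holds and checks its own master lock while LOCK_DIR stays shared for the list locks.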
** Affects: mailman
Importance: Undecided
Status: New
** Tags: cluster
--
You received this bug notification because you are a member of Mailman
Coders, which is subscribed to GNU Mailman.
https://bugs.launchpad.net/bugs/1082308