[Bug 1082308] [NEW] The qrunner-master lock file causes issues when running clustered
Public bug reported:

Hi

It is possible to run Mailman in a failover or load balancing cluster, see:
http://wiki.list.org/pages/viewpage.action?pageId=4030621

When running a cluster, it is crucial to use:
 * a shared directory for archive data
 * a shared directory for locks
 * separate directories for each qrunner

This can be implemented by setting the directories in mm_cfg.py, for example like this (where <host> is a host name):

    VAR_PREFIX = '<shared dir>'
    LIST_DATA_DIR = os.path.join(VAR_PREFIX, 'lists')
    LOCK_DIR = os.path.join(VAR_PREFIX, 'locks')
    DATA_DIR = os.path.join(VAR_PREFIX, 'data')
    SPAM_DIR = os.path.join(VAR_PREFIX, 'spam')
    LOG_DIR = os.path.join(VAR_PREFIX, 'logs-<host>')
    PUBLIC_ARCHIVE_FILE_DIR = os.path.join(VAR_PREFIX, 'archives', 'public')
    PRIVATE_ARCHIVE_FILE_DIR = os.path.join(VAR_PREFIX, 'archives', 'private')

    # For qfiles and logs, <dir>-<host> is used to avoid conflicts
    QUEUE_DIR = os.path.join(VAR_PREFIX, 'qfiles-<host>')
    INQUEUE_DIR = os.path.join(QUEUE_DIR, 'in')
    OUTQUEUE_DIR = os.path.join(QUEUE_DIR, 'out')
    CMDQUEUE_DIR = os.path.join(QUEUE_DIR, 'commands')
    BOUNCEQUEUE_DIR = os.path.join(QUEUE_DIR, 'bounces')
    NEWSQUEUE_DIR = os.path.join(QUEUE_DIR, 'news')
    ARCHQUEUE_DIR = os.path.join(QUEUE_DIR, 'archive')
    SHUNTQUEUE_DIR = os.path.join(QUEUE_DIR, 'shunt')
    VIRGINQUEUE_DIR = os.path.join(QUEUE_DIR, 'virgin')
    BADQUEUE_DIR = os.path.join(QUEUE_DIR, 'bad')
    RETRYQUEUE_DIR = os.path.join(QUEUE_DIR, 'retry')
    MAILDIR_DIR = os.path.join(QUEUE_DIR, 'maildir')

Unfortunately, the master-qrunner lock causes problems with this setup. mailmanctl -s starts even if there is a master-qrunner file (provided that there is no running mailmanctl on the host), making it possible to get the service up and running on more than one host. Once a day, however, mailmanctl checks the lock. If it does not hold the lock, it shuts down. If you are running a cluster, at least one of the nodes will not hold the lock, and the service will be shut down on that node.
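The <host> placeholders in the mm_cfg.py example above do not have to be filled in by hand on every node. A minimal sketch, assuming the short host name is an acceptable suffix: the helper name cluster_dirs and the '/srv/mailman' prefix are hypothetical and are not part of Mailman, and only a few of the directories are shown.

```python
import os
import socket


def cluster_dirs(var_prefix, host=None):
    """Build shared and per-host directory settings (illustrative only).

    cluster_dirs is a hypothetical helper, not a Mailman function.
    Shared directories omit the host name; logs and qfiles include it.
    """
    if host is None:
        # Assumption: the short host name is unique within the cluster.
        host = socket.gethostname().split('.')[0]
    queue_dir = os.path.join(var_prefix, 'qfiles-' + host)
    return {
        'LOCK_DIR': os.path.join(var_prefix, 'locks'),        # shared
        'LOG_DIR': os.path.join(var_prefix, 'logs-' + host),  # per host
        'QUEUE_DIR': queue_dir,                               # per host
        'INQUEUE_DIR': os.path.join(queue_dir, 'in'),
        'OUTQUEUE_DIR': os.path.join(queue_dir, 'out'),
    }


if __name__ == '__main__':
    # '/srv/mailman' stands in for the '<shared dir>' placeholder.
    for name, path in sorted(cluster_dirs('/srv/mailman').items()):
        print('%s = %r' % (name, path))
```

In a real mm_cfg.py the same idea would be written as plain module-level assignments; the function form is used here only so the result can be inspected.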
To solve this, I propose that the LOCKFILE name in mailmanctl become configurable, so instead of having:

    LOCKFILE = os.path.join(mm_cfg.LOCK_DIR, 'master-qrunner')

Have:

    LOCKFILE = os.path.join(mm_cfg.LOCK_DIR, mm_cfg.QRUNNER_LOCK_FILE)

Then add QRUNNER_LOCK_FILE = 'master-qrunner' in Defaults.py. This would make it easy to have individual qrunner master lock files for each node in a cluster.

** Affects: mailman
     Importance: Undecided
         Status: New

** Tags: cluster

-- 
You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman.
https://bugs.launchpad.net/bugs/1082308

Title:
  The qrunner-master lock file causes issues when running clustered

To manage notifications about this bug go to:
https://bugs.launchpad.net/mailman/+bug/1082308/+subscriptions
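The proposal in the original report, a lock file name read from the configuration with 'master-qrunner' as the default, could look roughly like the sketch below. It is not the change that was committed for this bug: master_lock_path and NodeConfig are hypothetical names, and NodeConfig merely stands in for the mm_cfg module.

```python
import os
import socket

# Default name, as proposed for Defaults.py in the report.
DEFAULT_QRUNNER_LOCK_FILE = 'master-qrunner'


def master_lock_path(lock_dir, cfg=None):
    """Return the master qrunner lock path (hypothetical helper).

    Falls back to the default name when cfg does not define
    QRUNNER_LOCK_FILE, so single-host installs are unaffected.
    """
    name = getattr(cfg, 'QRUNNER_LOCK_FILE', DEFAULT_QRUNNER_LOCK_FILE)
    return os.path.join(lock_dir, name)


class NodeConfig(object):
    """Stand-in for mm_cfg on one cluster node (assumption)."""
    # A per-host name keeps nodes from competing for one lock file.
    QRUNNER_LOCK_FILE = 'master-qrunner-' + socket.gethostname()


if __name__ == '__main__':
    print(master_lock_path('/shared/locks'))              # default name
    print(master_lock_path('/shared/locks', NodeConfig))  # per-host name
```

With a distinct name per node, each mailmanctl would check only its own lock file even though LOCK_DIR is shared, so the daily check would no longer shut down the nodes that lost the race for a common file.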
** Branch linked: lp:mailman/2.2

** Branch linked: lp:mailman/2.1
** Changed in: mailman
   Importance: Undecided => Low

** Changed in: mailman
       Status: New => Fix Committed

** Changed in: mailman
    Milestone: None => 2.1.16

** Changed in: mailman
     Assignee: (unassigned) => Mark Sapiro (msapiro)
** Changed in: mailman
       Status: Fix Committed => Fix Released

** Changed in: mailman
    Milestone: 2.1.16 => 2.1.16rc1
participants (3)
- David Westlund
- Launchpad Bug Tracker
- Mark Sapiro