Mails to list missing in clustered environment.
Hi,
I've managed to set up Mailman now (thanks Igor and Mark for all the help) clustered and with the vhost patch. But I'm having an odd problem. I'm not seeing a pattern to this, but some messages go through Postfix and are delivered to mailman and then nothing. Nothing in the Mailman logs and the message just disappears.
Those that do go through are delivered to all members of the list and are archived. They are also logged as per normal.
I was wondering whether the shared qfiles might have something to do with it? I've got the section below in my mm_cfg.py:
QRUNNERS = [
    ('ArchRunner',     5,0,5), # messages for the archiver
    ('BounceRunner',   5,0,5), # for processing the qfile/bounces directory
    ('CommandRunner',  5,0,5), # commands and bounces from the outside world
    ('IncomingRunner', 5,0,5), # posts from the outside world
    ('NewsRunner',     5,0,5), # outgoing messages to the nntpd
    ('OutgoingRunner', 5,0,5), # outgoing messages to the smtpd
    ('VirginRunner',   5,0,5), # internally crafted (virgin birth) messages
    ('RetryRunner',    5,0,5), # retry temporarily failed deliveries
    ]
The other 2 servers I've installed so far (the 2 mail gateways) are 5,3,5 and 5,4,5. The other 2 web servers don't have mailman installed yet. Is that config correct for 5 servers?
-- Don't just do something...sit there!
Guy wrote:
I've managed to set up Mailman now (thanks Igor and Mark for all the help) clustered and with the vhost patch. But I'm having an odd problem. I'm not seeing a pattern to this, but some messages go through Postfix and are delivered to mailman and then nothing. Nothing in the Mailman logs and the message just disappears.
What does the Postfix maillog entry say?
Those that do go through are delivered to all members of the list and are archived. They are also logged as per normal.
I was wondering whether the shared qfiles might have something to do with it? I've got the section below in my mm_cfg.py:
QRUNNERS = [
    ('ArchRunner',     5,0,5), # messages for the archiver
    ('BounceRunner',   5,0,5), # for processing the qfile/bounces directory
    ('CommandRunner',  5,0,5), # commands and bounces from the outside world
    ('IncomingRunner', 5,0,5), # posts from the outside world
    ('NewsRunner',     5,0,5), # outgoing messages to the nntpd
    ('OutgoingRunner', 5,0,5), # outgoing messages to the smtpd
    ('VirginRunner',   5,0,5), # internally crafted (virgin birth) messages
    ('RetryRunner',    5,0,5), # retry temporarily failed deliveries
    ]
The other 2 servers I've installed so far (the 2 mail gateways) are 5,3,5 and 5,4,5. The other 2 web servers don't have mailman installed yet. Is that config correct for 5 servers?
No.
I suspect the messages are in qfiles/in/ and not being processed by any IncomingRunner
Assuming the method from <http://mail.python.org/pipermail/mailman-users/2008-March/060753.html>, the entries in the QRUNNERS list are tuples of (name, count, this_machine_number, number_of_machines)
name is the qrunner name; you're good there.
count is the total number of slices for that queue. This should be a power-of-2 multiple of number_of_machines.
this_machine_number is the number of this machine: a number from 0 to number_of_machines - 1.
number_of_machines is the number of machines on which mailmanctl and qrunners are running.
I'm not clear on what number_of_machines is in your case, but it certainly isn't more than 3. If mailmanctl and qrunners are only running on this one machine, its entries should be (name, 1,0,1). If the mail gateways are also running qrunners, then this machine should have entries (name, 3,0,3) and the others (name, 3,1,3) and (name, 3,2,3).
With (name, 5,0,5) and the others (name, 5,3,5) and (name, 5,4,5), the queues are partitioned into 5 slices and only slices 0, 3 and 4 are being processed leaving 40% of the messages waiting in the in/ queue.
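The arithmetic behind that 40% figure can be checked with a small Python sketch. It assumes, as described above, that each machine processes only the slice matching its this_machine_number; the function names are made up for this illustration and are not part of Mailman or the clustering patch.

```python
def unprocessed_slices(total_slices, machine_numbers):
    """Return the slice numbers that no machine is configured to process.

    Assumption (from the explanation above): a machine with
    this_machine_number == k handles slice k and nothing else.
    """
    return sorted(set(range(total_slices)) - set(machine_numbers))


def unprocessed_fraction(total_slices, machine_numbers):
    """Fraction of the queue's slices that nobody processes."""
    return len(unprocessed_slices(total_slices, machine_numbers)) / total_slices


# The original setup: 5 slices, machines numbered 0, 3 and 4.
print(unprocessed_slices(5, [0, 3, 4]))
print(unprocessed_fraction(5, [0, 3, 4]))

# The suggested setup: 3 slices, machines numbered 0, 1 and 2.
print(unprocessed_slices(3, [0, 1, 2]))
```

With the original numbers, slices 1 and 2 have no runner at all, which is two fifths of the queue; with the suggested numbers every slice is covered.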
-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
On 5 February 2010 18:27, Mark Sapiro <mark@msapiro.net> wrote:
With (name, 5,0,5) and the others (name, 5,3,5) and (name, 5,4,5), the queues are partitioned into 5 slices and only slices 0, 3 and 4 are being processed leaving 40% of the messages waiting in the in/ queue.
Changed to the suggested 3,0,3-3,2,3 and suddenly all the "missing" email came through. <smack forehead>
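For anyone finding this in the archives, the corrected entries can be sketched in Python as below. This is only an illustration of the (name, count, this_machine_number, number_of_machines) layout discussed in this thread, which applies to the patched setup and not to a stock mm_cfg.py; the helper name qrunners_for is hypothetical, and count is simply set equal to number_of_machines as in the suggested 3,0,3 to 3,2,3 values.

```python
# Runner names as they appear in the QRUNNERS list earlier in the thread.
RUNNER_NAMES = [
    'ArchRunner', 'BounceRunner', 'CommandRunner', 'IncomingRunner',
    'NewsRunner', 'OutgoingRunner', 'VirginRunner', 'RetryRunner',
]


def qrunners_for(this_machine_number, number_of_machines):
    """Build the 4-tuple QRUNNERS entries for one machine in the cluster.

    Hypothetical helper: it uses one slice per machine, so count equals
    number_of_machines, matching the (name, 3,0,3) style entries above.
    """
    if not 0 <= this_machine_number < number_of_machines:
        raise ValueError("this_machine_number must be in 0..number_of_machines-1")
    count = number_of_machines
    return [(name, count, this_machine_number, number_of_machines)
            for name in RUNNER_NAMES]


# One list per machine: this server plus the two mail gateways.
for machine in range(3):
    print(machine, qrunners_for(machine, 3))
```

Building the entries from the machine number makes it harder to end up with gaps such as 0, 3 and 4, since a number outside the valid range is rejected.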
I've got just 2 more problems left (I hope).
- Invitation/subscription confirmation emails go out with a link in this format: http://lists.domain2.net/confirm/test/526cfbe0f5e7836315ee5f17444aaae8855003...
lists.domain2.net is a virtual host and listed in mm_cfg.py.
Is this easily patched post or pre compile or is it a bit more complex than that?
- If I set the list to have private archives it works perfectly, but if I select public archives I get a forbidden error and this in my apache logs: Symbolic link not allowed or link target not accessible: /var/lib/mailman/archives/public/lists.domain2.net/test
===lists.domain1.net apache vhost file===
<VirtualHost *:80>
ServerName lists.domain1.net
DocumentRoot /var/www/localhost
ErrorLog /var/log/apache2/lists-error.log
CustomLog /var/log/apache2/lists-access.log combined
<Directory /var/lib/mailman/archives/>
Options Indexes +FollowSymLinks
AllowOverride None
</Directory>
Alias /pipermail/ /var/lib/mailman/archives/public/
Alias /images/mailman/ /usr/lib/mailman/icons/
ScriptAlias /admin /usr/lib/mailman/cgi-bin/admin
ScriptAlias /admindb /usr/lib/mailman/cgi-bin/admindb
ScriptAlias /confirm /usr/lib/mailman/cgi-bin/confirm
ScriptAlias /create /usr/lib/mailman/cgi-bin/create
ScriptAlias /edithtml /usr/lib/mailman/cgi-bin/edithtml
ScriptAlias /listinfo /usr/lib/mailman/cgi-bin/listinfo
ScriptAlias /options /usr/lib/mailman/cgi-bin/options
ScriptAlias /private /usr/lib/mailman/cgi-bin/private
ScriptAlias /rmlist /usr/lib/mailman/cgi-bin/rmlist
ScriptAlias /roster /usr/lib/mailman/cgi-bin/roster
ScriptAlias /subscribe /usr/lib/mailman/cgi-bin/subscribe
ScriptAlias /mailman/ /usr/lib/mailman/cgi-bin/
ServerAlias lists.domain2.net
ServerAlias lists.domain3.net
</VirtualHost>
FollowSymLinks is enabled and permissions on the link and actual folder are:
lrwxrwxrwx 1 www-data mailman 55 2010-02-05 18:10 test -> /var/lib/mailman/archives/private/lists.cantab.net/test/
drwxrwsr-x 5 root mailman 3896 2010-02-05 17:34 /var/lib/mailman/archives/private/lists.cantab.net/test/
drwxrwsr-x 2 root mailman 3896 2010-02-05 13:07 /var/lib/mailman/archives/private/lists.cantab.net/test.mbox/
Any idea whether this is a permissions problem or should I be poking more deeply into my apache config? Root allows override on everything so it didn't seem like that was the problem to me. As always I'm quite possibly wrong there...
Thanks, I really appreciate all the help! Guy
-- Don't just do something...sit there!
Guy wrote:
I've got just 2 more problems left (I hope).
- Invitation/subscription confirmation emails go out with a link in this format: http://lists.domain2.net/confirm/test/526cfbe0f5e7836315ee5f17444aaae8855003...
lists.domain2.net is a virtual host and listed in mm_cfg.py.
Is this easily patched post or pre compile or is it a bit more complex than that?
The initial portion of that URL is the list's web_page_url attribute. This is established at list creation time by interpolating the list's URL host into DEFAULT_URL_PATTERN. It can be changed with fix_url (see <http://wiki.list.org/x/mIA9>).
I'm not sure what you want when you ask if it's easily patched. What do you want?
- If I set the list to have private archives it works perfectly, but if I select public archives I get a forbidden error and this in my apache logs: Symbolic link not allowed or link target not accessible: /var/lib/mailman/archives/public/lists.domain2.net/test
===lists.domain1.net apache vhost file===
<VirtualHost *:80>
ServerName lists.domain1.net
DocumentRoot /var/www/localhost
ErrorLog /var/log/apache2/lists-error.log
CustomLog /var/log/apache2/lists-access.log combined
<Directory /var/lib/mailman/archives/>
Options Indexes +FollowSymLinks
AllowOverride None
</Directory>
Alias /pipermail/ /var/lib/mailman/archives/public/
Alias /images/mailman/ /usr/lib/mailman/icons/
ScriptAlias /admin /usr/lib/mailman/cgi-bin/admin
ScriptAlias /admindb /usr/lib/mailman/cgi-bin/admindb
ScriptAlias /confirm /usr/lib/mailman/cgi-bin/confirm
ScriptAlias /create /usr/lib/mailman/cgi-bin/create
ScriptAlias /edithtml /usr/lib/mailman/cgi-bin/edithtml
ScriptAlias /listinfo /usr/lib/mailman/cgi-bin/listinfo
ScriptAlias /options /usr/lib/mailman/cgi-bin/options
ScriptAlias /private /usr/lib/mailman/cgi-bin/private
ScriptAlias /rmlist /usr/lib/mailman/cgi-bin/rmlist
ScriptAlias /roster /usr/lib/mailman/cgi-bin/roster
ScriptAlias /subscribe /usr/lib/mailman/cgi-bin/subscribe
ScriptAlias /mailman/ /usr/lib/mailman/cgi-bin/
You may or may not need all those ScriptAlias directives depending on what else may be on this host. Consider
ScriptAlias / /usr/lib/mailman/cgi-bin/
ServerAlias lists.domain2.net
ServerAlias lists.domain3.net
</VirtualHost>
FollowSymLinks is enabled and permissions on the link and actual folder are:
lrwxrwxrwx 1 www-data mailman 55 2010-02-05 18:10 test -> /var/lib/mailman/archives/private/lists.cantab.net/test/
drwxrwsr-x 5 root mailman 3896 2010-02-05 17:34 /var/lib/mailman/archives/private/lists.cantab.net/test/
drwxrwsr-x 2 root mailman 3896 2010-02-05 13:07 /var/lib/mailman/archives/private/lists.cantab.net/test.mbox/
Every directory in the /var/lib/mailman/archives/private/ path must be searchable by the web server. Where people normally go wrong is setting g-x on /var/lib/mailman/archives/private itself without making it owned by the web server. See the warning box at <http://www.list.org/mailman-install/node9.html>.
-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
Hi Mark,
On 5 February 2010 20:01, Mark Sapiro <mark@msapiro.net> wrote:
- Invitation/subscription confirmation emails go out with a link in this format: http://lists.domain2.net/confirm/test/526cfbe0f5e7836315ee5f17444aaae8855003...
lists.domain2.net is a virtual host and listed in mm_cfg.py.
Is this easily patched post or pre compile or is it a bit more complex than that?
The initial portion of that URL is the list's web_page_url attribute. This is established at list creation time by interpolating the list's URL host into DEFAULT_URL_PATTERN. It can be changed with fix_url (see <http://wiki.list.org/x/mIA9>).
I'm not sure what you want when you ask if it's easily patched. What do you want?
Finally figured that one out. There was nothing wrong with the link itself. We have multiple web servers and I was getting a 404 with the link because it was going to a web server that didn't have mailman on it yet. Another dumb on my part. Sorry!
Every directory in the /var/lib/mailman/archives/private/ path must be searchable by the web server. Where people normally go wrong is setting g-x on /var/lib/mailman/archives/private itself without making it owned by the web server. See the warning box at <http://www.list.org/mailman-install/node9.html>.
That was it! Must have been changed at some point during all the fiddling I've been doing!
Thanks very much Mark!
-- Don't just do something...sit there!
Guy wrote:
On 5 February 2010 20:01, Mark Sapiro <mark@msapiro.net> wrote:
Every directory in the /var/lib/mailman/archives/private/ path must be searchable by the web server. Where people normally go wrong is setting g-x on /var/lib/mailman/archives/private itself without making it owned by the web server. See the warning box at <http://www.list.org/mailman-install/node9.html>.
That was it! Must have been changed at some point during all the fiddling I've been doing!
I'm sure Guy figured out what I meant, and it should be clear from the page linked above, but to avoid potential confusion in the archives, I meant to say "o-x" above, not "g-x".
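To make the corrected advice (o-x, not g-x) concrete, here is a minimal Python sketch that walks up from a directory and reports every directory on the way that a web server process could probably not search. It only looks at the "other" execute bit and, optionally, ownership by a given uid; it ignores group membership and ACLs, so treat it as a rough diagnostic and not a definitive check. The function name and the web_uid parameter are made up for this illustration and are not part of Mailman.

```python
import os
import stat


def unsearchable_dirs(path, web_uid=None):
    """List directories from / down to `path` that lack search permission.

    A directory counts as searchable if it has o+x, or if it is owned by
    web_uid (when given) and has u+x. Group permissions and ACLs are
    deliberately ignored in this sketch.
    """
    blocked = []
    current = os.path.abspath(path)
    while True:
        st = os.stat(current)
        if stat.S_ISDIR(st.st_mode):
            other_ok = bool(st.st_mode & stat.S_IXOTH)
            owner_ok = (web_uid is not None
                        and st.st_uid == web_uid
                        and bool(st.st_mode & stat.S_IXUSR))
            if not (other_ok or owner_ok):
                blocked.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return blocked[::-1]  # report top-most directory first


if __name__ == "__main__":
    import sys
    # Usage: pass one or more directories, e.g. the private archive path.
    for target in sys.argv[1:]:
        for directory in unsearchable_dirs(target):
            print("not searchable by others:", directory)
```

Run against the private archive directory of a list, any line it prints points at a directory worth comparing with the warning box in the installation manual linked above.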
-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan