question about strange Mailman glitch
Hello,
A while back I encountered a strange problem with Mailman. We eventually got it resolved, but I don't know why the fix worked. Although Mailman is working fine now, I'm still wondering what the real nature of the problem was, and I'm hoping that some of the Mailman experts on this list might be able to provide some insight.
We had already been running Mailman on our Sun for several months without any difficulties. Then we had an unexpected power outage in our building, and the Sun that was running the Mailman server lost power. When it came back up, Mailman would no longer send out mail. All attempts to send mail to mailing lists through Mailman resulted in the sendmail error message "name or service not known" showing up in the smtp error log file. However, I was able to send messages directly through sendmail, so I know that sendmail was working properly. I talked to a Mailman expert who suggested looking at my mm_cfg.py file. The only statement that I had in there was a LOCALHOST statement, with the IP address of the local host, i.e. the Mailman server. The expert recommended deleting this statement from mm_cfg.py, so I did, and then Mailman started working again! But what's very puzzling is that Mailman had been working for us for a long time with this LOCALHOST statement in there, and we didn't have any problems with it. So why did this LOCALHOST statement cause no problem before the power outage but DID cause a problem after it? Anyone have any thoughts on this?
Thanks very much, Eric
Eric Evans wrote:
> Then, we had an unexpected power outage in our building. The Sun that was running the Mailman server lost power unexpectedly. When it came back up, Mailman would no longer send out mail. All attempts to send mail to mailing lists through Mailman resulted in the sendmail error message "name or service not known" showing up in the smtp error log file. However, I was able to send messages directly through sendmail, so I know that sendmail was working properly. I talked to a Mailman expert who suggested looking at my mm_cfg.py file. The only statement that I had in there was a LOCALHOST statement, with the IP address of the local host, i.e. the Mailman server.
Are you sure you don't mean
SMTPHOST = 'nnn.nnn.nnn.nnn'
rather than
LOCALHOST = 'nnn.nnn.nnn.nnn'
The latter would do nothing because mm_cfg.LOCALHOST is not referenced in Mailman.
> My correspondent the Mailman expert recommended deleting this statement from the mm_cfg.py, so I did, and then Mailman started working again! But what's very puzzling is that Mailman was working for us for a long time with this LOCALHOST statement in there, and we didn't have any problems with it. So why is it that this LOCALHOST statement was not causing a problem for us before the power outage but DID cause a problem after the power outage? Anyone have any thoughts on this?
There are several possibilities. First, maybe someone added
SMTPHOST = 'nnn.nnn.nnn.nnn'
to mm_cfg.py but never restarted Mailman (bin/mailmanctl restart). Thus, the outgoing queue runner continued to use the previous setting (perhaps the default SMTPHOST = 'localhost') until something else (the power outage and restart) caused it to reload mm_cfg.
Possibly some change was made elsewhere in the system that wasn't effective until the power outage and restart caused that changed information to finally be read.
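To make the first possibility concrete, here is a toy Python sketch of a long-running process that reads its settings once at startup. It is not Mailman code: `config_file` and `Runner` are hypothetical stand-ins for mm_cfg.py and a queue runner, and the only point is that an edited setting stays invisible until the process starts again.

```python
# Hypothetical stand-in for the contents of mm_cfg.py on disk.
config_file = {"SMTPHOST": "localhost"}


class Runner:
    """Toy long-running process that snapshots its settings when it starts."""

    def __init__(self):
        self.start()

    def start(self):
        # Settings are copied once here and never re-read while running.
        self.smtphost = config_file["SMTPHOST"]


runner = Runner()

# Someone edits the config but does not restart the process.
config_file["SMTPHOST"] = "<128.253.175.139>"
stale = runner.smtphost      # still the value read at startup

# A restart (deliberate, or forced by a power outage) picks up the edit.
runner.start()
fresh = runner.smtphost

print("before restart:", stale)
print("after restart: ", fresh)
```

In this sketch the bad value only takes effect at the restart, which would match a setting that sat unnoticed in the file for months.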
If you want to actually do some small tests to find out why
SMTPHOST = 'nnn.nnn.nnn.nnn'
doesn't work, see <http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq06.014.htp> for some debugging techniques.
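As a rough illustration of that kind of small test (this is my own sketch, not the procedure from the FAQ), the standard smtplib module can be used to check whether a given host value can be resolved and connected to. The helper name `probe_smtp`, the port, and the timeout are assumptions chosen for the example.

```python
import smtplib
import socket


def probe_smtp(host, port=25, timeout=10):
    """Try to open an SMTP connection to host and report what happened."""
    try:
        # smtplib.SMTP connects immediately when given a host.
        with smtplib.SMTP(host, port, timeout=timeout) as server:
            server.noop()
    except socket.gaierror as exc:
        # Raised when the host string cannot be resolved to an address.
        return "name lookup failed: %s" % exc
    except OSError as exc:
        # Covers refused connections, timeouts, and SMTP-level errors.
        return "connection failed: %s" % exc
    return "connected"
```

Calling `probe_smtp()` with the exact string from mm_cfg.py, and then with 'localhost', should show whether the failure is in name resolution or in the connection itself.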
--
Mark Sapiro <msapiro@value.net>
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
Out of curiosity, I looked and found the original thread. The last 3 messages in the thread are the most relevant. The first of these 3 is at <http://mail.python.org/pipermail/mailman-users/2005-November/047813.html>
Actually, the part of that old thread that is most relevant here is the following quote from the last post in the thread:
<quote> You said you had
SMTPHOST = '<128.253.175.139>'
and I remarked
Which is probably using sendmail and not trying to connect to an SMTP server at <128.253.175.139> (which BTW is if anything, a name and not an IP address).
I.e '<128.253.175.139>' is not the same as '128.253.175.139'. The latter is an IP address and the former is a name. Maybe something in your DNS or other configuration changed so the name '<128.253.175.139>' could no longer be resolved. </quote>
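The difference between the two strings can be checked without touching DNS. The sketch below uses Python's standard ipaddress module (Python 3, so newer than the Python that Mailman 2 runs on); `is_ip_literal` is a hypothetical helper name for this example.

```python
import ipaddress


def is_ip_literal(host):
    """Return True if host parses as an IPv4 or IPv6 address literal."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        # Anything that is not a literal address has to be resolved as a name.
        return False


for candidate in ("128.253.175.139", "<128.253.175.139>", "localhost"):
    print(repr(candidate), is_ip_literal(candidate))
```

Only the bare dotted quad parses as an address. The bracketed form is treated like any other host name, so it depends on the resolver just as 'localhost' does.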
So I think either
SMTPHOST = '<128.253.175.139>'
was added to mm_cfg.py and Mailman was never restarted until after the power outage (this would not be unusual - the Mailman that runs my production lists has been running for about a year without restart - a sad comment from a Mailman developer, but that particular server is out of my control).
Or possibly some change was made elsewhere that caused the name '<128.253.175.139>' to stop working, but that change wasn't effective until the system restarted after the power outage.
The moral here is to always run 'bin/mailmanctl restart' after any mm_cfg.py change, or you might be in for a surprise down the road.
--
Mark Sapiro <msapiro@value.net>
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan