[Mailman-Users] Re: Troubleshooting: no mail goes out to lists - common things to check
peter schoch
pschoch at sussex.cc.nj.us
Sat Nov 9 23:31:22 CET 2002
Jon, this was a great posting! I think it would probably be good to add to the FAQ page on the mailman site. I'd add that you might need to tweak sendmail's settings to allow some domains to forward to get this to work.
All of your 'checks' were OK for me, and yet the lists were still not working. Then, on Friday, the lists just started sending.
One of the subscribers had forgotten to remove their auto-reply "Out of Office" message. It replied multiple times to the subscription notices, and then the lists started and the auto-reply caused a huge number of posts! But now the lists have kept working.
I have no errors, etc. in the logs. I have no idea why it just started. I didn't change or modify anything... I'd feel better if I knew why, but...
Thanks for all of the help.
Peter Schoch
Here are some common things to check when no mail is going out from your
lists.
======
I'm going to assume Sendmail as the MTA (it's still the most commonly
found - though Postfix is gaining ground):
0) Check_perms. In all cases you should start by checking the
permissions on the files that were set up:
~mailman/bin/check_perms
1) Cron. Make sure that the cron daemon is running:
ps -aux |grep cron |grep -v grep
This will print out the process information about the cron
daemon. If it prints nothing, then cron is NOT running.
2) Aliases. To create a mailman list you ran "newlist" and it
printed out four lines that you needed to copy to the
/etc/aliases file (or wherever your MTA goes to find its
aliases). Check that the aliases are in /etc/aliases:
grep wrapper /etc/aliases
Even if the aliases are there, you may still need to reset
the aliases hash table so that it includes this new alias
information:
newaliases
Here is a typical alias listing for a group called "sys":
## system mailing list
sys: "|/home/mailman/mail/wrapper post sys"
sys-admin: "|/home/mailman/mail/wrapper mailowner sys"
sys-request: "|/home/mailman/mail/wrapper mailcmd sys"
sys-owner: sys-admin
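The alias check in step 2 can be scripted. This is only a minimal sketch, not something shipped with Mailman: check_list_aliases is a made-up helper name, and it assumes the three quoted wrapper aliases shown in the "sys" example (post, mailowner, mailcmd). Your version of newlist may print a different set.

```shell
# check_list_aliases FILE LIST
# Hypothetical helper: reports which of the wrapper aliases from the
# example above are missing for LIST in the aliases FILE.
# Returns 0 if all three are present, 1 otherwise.
check_list_aliases() {
    aliases_file=$1
    list=$2
    missing=0
    for cmd in post mailowner mailcmd; do
        # Ignore commented-out lines, then look for: wrapper <cmd> <list>"
        # (assumes the alias is double-quoted as in the example).
        if ! grep -v '^[[:space:]]*#' "$aliases_file" |
             grep -q "wrapper $cmd $list\""; then
            echo "missing alias: wrapper $cmd $list"
            missing=1
        fi
    done
    return $missing
}
```

For example: check_list_aliases /etc/aliases sys && echo "aliases look complete". Remember that newaliases still has to be run after any edit to the file.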
3) Smrsh. Check to see if your MTA uses smrsh. Red Hat as well
as a few other distributions automatically set up Sendmail to
use smrsh. Smrsh stops Sendmail from running a script or
other program that is included in an alias unless that program
is listed in the smrsh directory. Mailman uses a program called
"wrapper" to run all of its aliases (see the alias examples above):
grep "smrsh" sendmail.cf
If this comes up blank then Sendmail does not use smrsh;
if not, then your server is probably running smrsh and you
need to make sure that smrsh is set up to allow Mailman's
wrapper program to run. Locate the smrsh directory and do
an ls -l of that directory. On Red Hat:
ls -l /etc/smrsh
and the output should be similar to:
wrapper -> /home/mailman/mail/wrapper
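A link can be present and still be useless if it points at a wrapper that has moved. As a rough sketch (check_smrsh_wrapper is a hypothetical helper name, and the directory is passed in because it differs between distributions), the ls check can be tightened like this:

```shell
# check_smrsh_wrapper SMRSH_DIR
# Hypothetical helper: succeeds only if SMRSH_DIR contains an entry named
# "wrapper" that resolves to an executable regular file.
check_smrsh_wrapper() {
    entry=$1/wrapper
    # -L catches a dangling symlink, which -e alone reports as absent.
    if [ ! -e "$entry" ] && [ ! -L "$entry" ]; then
        echo "no wrapper entry in $1"
        return 1
    fi
    # -f and -x follow symlinks, so a broken link or a non-executable
    # target fails here.
    if [ ! -f "$entry" ] || [ ! -x "$entry" ]; then
        echo "$entry does not resolve to an executable file"
        return 1
    fi
    echo "ok: $entry"
}
```

On Red Hat that would be: check_smrsh_wrapper /etc/smrsh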
4) Interface. Some distributions, in a noble "attempt" to limit
the number of open relays on the internet, configure Sendmail
by default so that it listens on a limited number of interfaces.
The default interface that Mailman lists use is localhost
(127.0.0.1) - this is configurable in Mailman's mm_cfg.py
file. To check Sendmail's configuration file:
grep "Port" sendmail.cf
This will list out the DaemonPortOptions setting and indicate the
interfaces it listens on (0.0.0.0 would mean all interfaces).
You can also check out which interfaces your MTA is listening
on by using:
netstat -na |grep ":25 "
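If you would rather have a yes/no answer than read the netstat output by eye, something like the following might do. It is a sketch under assumptions: the function names are made up, and the awk filter assumes the Linux netstat column layout (local address in column 4, state in column 6). BSD-style netstat prints addresses such as *.25 and would need a different pattern.

```shell
# smtp_listeners
# Reads `netstat -na` output on stdin and prints the local addresses that
# are in LISTEN state on port 25 (Linux column layout assumed).
smtp_listeners() {
    awk '$1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ /:25$/ { print $4 }'
}

# mailman_can_reach_mta
# Succeeds if something listens on port 25 on 127.0.0.1 or on all
# IPv4 interfaces (0.0.0.0), i.e. where Mailman connects by default.
mailman_can_reach_mta() {
    smtp_listeners | grep -q -e '^127\.0\.0\.1:25$' -e '^0\.0\.0\.0:25$'
}
```

Usage: netstat -na | mailman_can_reach_mta || echo "nothing on localhost:25"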
5) Qrunner. If you are running Mailman 2.0.x then qrunner is
run every minute via a cron job (that is why cron *must* be
running for Mailman to work). Try running the program by
hand. The exact syntax can be found in Mailman's cron jobs:
su mailman
crontab -l
Here is an example of running qrunner by hand:
su mailman
/usr/bin/python -S /home/mailman/cron/qrunner
If this generates any errors then send those to the list
for diagnosis - or look at the last few lines of errors and
search the list for key words from the error messages.
6) Locks. An errant lock file can stop a list from processing as
Mailman waits for the lock to be removed. Since your list is
not sending, we shall assume that no lock files should exist
and that it is safe to delete any we find.
ls -l ~mailman/locks
The output will be something like:
qrunner.lock.moya.trilug.org.22845
This indicates that process # 22845 created the lock. To look
at this process and see what it is (if it still exists):
ps aux |grep 22845 |grep -v grep
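With more than a couple of lock files, checking each PID by hand gets tedious. Here is a minimal sketch that does the same lookup for every file in the lock directory. The function names are hypothetical, and it assumes the lock name ends in the creating PID, as in the example above. It only reports; deleting is left to you.

```shell
# lock_pid LOCKFILE
# Print the last dot-separated field of the file name (the PID in the
# naming shown above).
lock_pid() {
    name=${1##*/}        # strip any directory part
    echo "${name##*.}"   # keep what follows the final dot
}

# report_locks LOCK_DIR
# For each file in LOCK_DIR, say whether the process that created it is
# still alive. kill -0 sends no signal; it only tests whether the PID can
# be signalled, so run this as root or as the mailman user, otherwise a
# live process owned by someone else is reported as stale.
report_locks() {
    for lock in "$1"/*; do
        [ -e "$lock" ] || continue   # directory is empty
        pid=$(lock_pid "$lock")
        case $pid in
            ''|*[!0-9]*)
                echo "skip  $lock (no numeric PID suffix)"
                continue ;;
        esac
        if kill -0 "$pid" 2>/dev/null; then
            echo "live  $lock (pid $pid)"
        else
            echo "stale $lock (pid $pid)"
        fi
    done
}
```

Usage: report_locks ~mailman/locks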
7) Logs. If you don't have any of the common problems above,
then you should look for errors in your log files.
First look for errors in your MTA log files. On Red Hat that
would be in /var/log/maillog.
Look in the log starting at the time you sent a test message.
You should see your initial message come in and be passed
on to your Mailman list. Afterwards you may see warnings
or errors caused by Mailman trying to send out mail to the
members of the list.
Next look in Mailman's logs. The files are in ~mailman/logs/.
There are several logs to look in for problems:
error
smtp-failure
smtp
vette
config
post
Note: if you look in the qrunner log you will see several
warnings about "Could not acquire qrunner lock". These are
actually normal and are NOT a problem.
Every line in the log files is dated so you should be able to
isolate the place in the log files to start looking, based on
when your problem started.
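To get a quick look at the tail end of all of those logs in one go, a small loop is enough. This is just a sketch: recent_log_lines is a made-up name, and the default of 20 lines is arbitrary.

```shell
# recent_log_lines LOG_DIR [N]
# Print the last N lines (default 20) of each Mailman log named in the
# list above, skipping any that are missing or empty.
recent_log_lines() {
    log_dir=$1
    count=${2:-20}
    for name in error smtp-failure smtp vette config post; do
        file=$log_dir/$name
        [ -s "$file" ] || continue   # -s: exists and is not empty
        echo "==== $name ===="
        tail -n "$count" "$file"
    done
}
```

Usage: recent_log_lines ~mailman/logs 50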
8) Qfiles. You may have a malformed email (or one that is simply
too big) clogging up the flow of mail to your lists. Mail
that is queued up by Mailman is stored in the directory:
~mailman/qfiles
Move any files out of the directory and into a temporary
directory, then send a new test message to your list. If that
works, then you can move some of the old queued up files back
and let those process. If it stops working again then you
have a bad message in that batch - delete them or copy them to
a different temporary directory.
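The "move everything aside" part of step 8 could be done along these lines. Treat it as a sketch: hold_queue is a hypothetical helper, the holding directory is whatever you choose, and it only touches regular files directly inside the queue directory. Run it as the mailman user so ownership stays the same when you move files back.

```shell
# hold_queue QFILES_DIR HOLD_DIR
# Move every regular file out of QFILES_DIR into HOLD_DIR (created if
# needed) and print how many were moved. Subdirectories are left alone.
hold_queue() {
    qdir=$1
    hold=$2
    moved=0
    mkdir -p "$hold" || return 1
    for f in "$qdir"/*; do
        [ -f "$f" ] || continue      # skip directories and empty globs
        mv "$f" "$hold"/ || return 1
        moved=$((moved + 1))
    done
    echo "$moved"
}
```

Usage: hold_queue ~mailman/qfiles /tmp/qfiles.hold
The files can then be moved back a few at a time with mv to find the batch that stalls the queue.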
======
Please feel free to critique and expand on this. I'm hoping that it
proves useful as a starting point for folks having mail-flow problems.
-- Jon Carnes