OutgoingRunner hangs and messages not delivered to mail lists

Greetings,
I have a problem with all of the mailing lists on our server.
Every day the qfiles/out directory accumulates hundreds of *.pck files that are never posted to the mailing lists.
I can stop the master qrunner manually with mailmanctl stop and start it again with mailmanctl start. But every time I've done that a lock was left behind, so I removed the locks by hand and then issued mailmanctl start. That worked.
However, looking at the processes on the system, I noticed that there are several OutgoingRunners, and the only way to get the messages posted to the mailing lists is to kill those processes.
Archiving works fine, that is, the messages are archived but not posted. I know that the archiving process is separate from posting.
It's quite unusual, because this started only 3 weeks ago, after several years of using Mailman.
I manage about a hundred mailing lists with thousands of users.
I am using:
Red Hat Enterprise Linux AS release 4
Python 2.3.4
mailman 2.1.9
sendmail-8.13.1-2
I've searched through the Mailman FAQs and have not found any clue.
To limit the problem over weekends, I wrote a shell script and added it to cron (running every 10 minutes):
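#!/bin/sh
# Stop the master qrunner.
/home/mailman/bin/mailmanctl stop
# Show the Python processes that are still running (ends up in the cron mail).
ps auxwf | grep '[p]ython'
# Force-kill every remaining Python process; note this is not limited to Mailman's qrunners.
ps ax | grep '[p]ython' | awk '{print $1}' | xargs -r kill -9
# -s clears a stale master lock before starting.
/home/mailman/bin/mailmanctl -s start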
I would appreciate any help on this issue,
Joao Sa Marta
===============================|===================================
= Joao Sa Marta | Email: samarta@ci.uc.pt =
= Centro de Informatica | Tel: 239 853178 (directo) =
= da Universidade de | Tel: 239 853170 (Geral) =
= Coimbra | Fax: 239 853189 =
= Apartado 3080 | http://www.uc.pt/pessoal/samarta =
= 3001-401 Coimbra | =
===============================|===================================

João Sá Marta wrote:
> Every day the qfiles/out directory gets hundreds of *.pck files not posted to the mailing lists.
> Manually, one can stop the master qrunner through mailmanctl stop and start it again using mailmanctl start. But every time I've done that there's a lock and I manually removed the locks and issued the mailmanctl start. That worked.
> But, analysing the processes on the system I have noticed that are several OutgoingRunner's, and the only way to get the messages posted to the mailing lists is killing those processes.
That's because you are force-starting Mailman when it hasn't fully stopped. Lock files exist for a reason.
See FAQ <http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq04.068.htp>.
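As a rough sketch of the alternative to removing locks and force-starting, a restart can wait until the old qrunners have actually exited. MAILMANCTL, the 'bin/qrunner' process pattern, the 60 second limit, and the function names are all assumptions for illustration, not something taken from the FAQ.

```shell
#!/bin/sh
# Sketch only: restart Mailman after the old qrunners have really exited.
# MAILMANCTL and the 'bin/qrunner' pattern are assumptions; adjust for your install.
MAILMANCTL=${MAILMANCTL:-/home/mailman/bin/mailmanctl}

# Succeeds while at least one qrunner process is still alive.
qrunners_alive() {
    pgrep -f 'bin/qrunner' >/dev/null 2>&1
}

# wait_until_gone SECONDS CHECK...
# Polls CHECK once a second. Returns 0 as soon as CHECK fails (nothing left),
# or 1 if CHECK still succeeds after SECONDS seconds.
wait_until_gone() {
    remaining=$1
    shift
    while "$@"; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Stop, wait for a clean shutdown, and only then start again.
safe_restart() {
    "$MAILMANCTL" stop
    if wait_until_gone 60 qrunners_alive; then
        "$MAILMANCTL" start
    else
        echo "qrunners still running after 60s; not starting" >&2
        return 1
    fi
}
```

Calling safe_restart from cron in place of the kill -9 approach leaves the locks alone; if it reports that qrunners are still running, that is the moment to find out what they are stuck on rather than to start a second set.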
> It's quite unusual because it's the first time (3 weeks ago) this is happening after several years of using mailman.
It appears there may be some queued message in the out/ queue causing a problem. There may also be a problem between Mailman and Sendmail. What is in Mailman's 'smtp' and 'smtp-failure' logs and also Sendmail's logs?
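One way to pull the relevant log tails together might look like the sketch below. LOG_DIR is an assumption (a source install under /home/mailman, matching the mailmanctl path in the original post), and recent_smtp_logs is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: show the tail of Mailman's smtp and smtp-failure logs.
# LOG_DIR is an assumption; point it at your Mailman logs directory.
LOG_DIR=${LOG_DIR:-/home/mailman/logs}

# recent_smtp_logs DIR [LINES]
# Prints the last LINES lines (default 20) of DIR/smtp and DIR/smtp-failure.
recent_smtp_logs() {
    dir=$1
    lines=${2:-20}
    for name in smtp smtp-failure; do
        echo "== $name =="
        if [ -r "$dir/$name" ]; then
            tail -n "$lines" "$dir/$name"
        else
            echo "(no readable $dir/$name)"
        fi
    done
}
```

Usage would be recent_smtp_logs "$LOG_DIR" 50. Sendmail's own log is separate; on Red Hat systems it is normally /var/log/maillog.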
Is this a Debian Mailman and Python package? If so, see a couple of threads in the mailman-users archive. One thread begins at <http://mail.python.org/pipermail/mailman-users/2006-October/053942.html> (the link above) and is a single thread in the archive. The other thread is broken in the archive because of many off-list replies, but in subject sequence, it is those messages with subjects
[Mailman-Users] can't get mailman and exim4 talking on my debian 3.1
[Mailman-Users] slow processing
[Mailman-Users] Still can't send out from Mailman
If there is no problem evident in the smtp or smtp-failure logs, look for an older message in the out/ queue and try moving it aside.
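Moving the oldest entry aside could be sketched as follows. QUEUE_DIR and HOLD_DIR are assumed paths, the function names are hypothetical, and "oldest" here simply means oldest modification time.

```shell
#!/bin/sh
# Sketch only: move the oldest entry out of Mailman's out/ queue.
# QUEUE_DIR and HOLD_DIR are assumptions; HOLD_DIR is a made-up holding area.
QUEUE_DIR=${QUEUE_DIR:-/home/mailman/qfiles/out}
HOLD_DIR=${HOLD_DIR:-/home/mailman/out-aside}

# oldest_pck DIR
# Prints the path of the *.pck file in DIR with the oldest modification time,
# or nothing if there are no *.pck files.
oldest_pck() {
    ls -1tr "$1"/*.pck 2>/dev/null | head -n 1
}

# move_oldest_aside QUEUE HOLD
# Moves the oldest *.pck file from QUEUE into HOLD and reports which one it was.
move_oldest_aside() {
    queue=$1
    hold=$2
    victim=$(oldest_pck "$queue")
    if [ -z "$victim" ]; then
        echo "no .pck files in $queue" >&2
        return 1
    fi
    mkdir -p "$hold" && mv "$victim" "$hold"/ && echo "moved $victim"
}
```

Running move_oldest_aside "$QUEUE_DIR" "$HOLD_DIR" once and then watching whether the queue starts to drain shows whether that one message was the blocker; the file stays in the holding directory, so it can be inspected or put back later.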
--
Mark Sapiro <msapiro@value.net>
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
participants (2)
- João Sá Marta
- Mark Sapiro