[Mailman-Users] Re: "could not acquire qrunner lock", etc

Mike Crowe mac at babel.fysh.org
Wed Apr 25 12:24:29 CEST 2001


On Tue, Apr 24, 2001 at 08:29:35AM -0700, Steve Pirk wrote:
> I would do some tests on basic mail... Telnet to the box on port 25
> and see how long it is before you get the greeting. Do the same on the
> box to an outside machine. Both should be *very* fast. If they are
> not, look into reverse resolution of the IP address of the mailman
> box.

> qrunner can only deliver mail as fast as the MTA, so look there also.

Thanks for the advice. I tried the following from the machine itself (note
that the line wrapping is not real; the command was entered on one line):

babel:/tmp> time ((echo 'helo me' ; echo 'mail from: mac at fysh.org' ; echo
'rcpt to: mac at empeg.com' ; echo 'data' ; echo 'From: mac at fysh.org' ; echo
'To: mac at empeg.com' ; echo 'Subject: wibble' ; echo ; echo "wibble" ; echo
"." ; echo "quit") | telnet localhost 25)
Trying 127.0.0.1...
Connected to localhost.fysh.org.
Escape character is '^]'.
220 babel.fysh.org ESMTP Exim 3.12 #1 Wed, 25 Apr 2001 11:10:51 +0100
Connection closed by foreign host.
( ( echo 'helo me'; echo 'mail from: mac at fysh.org'; echo ; echo 'data';
echo   0.01s user 0.02s system 8% cpu 0.352 total


I tried the same thing from another machine on a different network:

mac at morrison:~$ time ((echo 'helo me' ; echo 'mail from: mac at empeg.com' ;
echo 'rcpt to: mac at babel.fysh.org' ; echo 'data' ; echo 'From: 
mac at empeg.com' ; echo 'To: mac at babel.fysh.org' ; echo 'Subject: wibble' ;
echo ; echo "wibble" ; echo "." ; echo "quit") | telnet babel.fysh.org 25) 
Trying 193.119.19.190...
Connected to babel.fysh.org.
Escape character is '^]'.
Connection closed by foreign host.

real    0m0.060s
user    0m0.030s
sys     0m0.030s

It looks pretty good. 
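
The timings above cover the whole piped session rather than the wait for
the banner alone. A minimal sketch of timing just the greeting, assuming
Python 3 is available on the machine doing the test (the function name
time_to_greeting is made up for illustration):

```python
import socket
import time


def time_to_greeting(host, port=25, timeout=30.0):
    """Return (seconds, banner): time from starting the connect until the
    server's first line of output has arrived, and that output as text."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        data = b""
        # In SMTP the server speaks first, so just read until a full line is in.
        while not data.endswith(b"\n"):
            chunk = sock.recv(1024)
            if not chunk:
                break  # server closed the connection before finishing a line
            data += chunk
    elapsed = time.monotonic() - start
    return elapsed, data.decode("ascii", errors="replace").strip()


# Example (not run here): seconds, banner = time_to_greeting("localhost")
```

A slow result here would point towards the reverse-resolution problem
mentioned in the quoted advice, since the measurement includes the connect
and anything the server does before it sends its banner.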

In any case, if the MTA were the problem I wouldn't have expected the
qrunner process to be using lots of CPU. Surely it would just be blocked,
consuming nothing?

This morning I discovered that the qrunner process had used over 300
minutes of CPU time and no mailing list traffic was being sent. Clearly it
had got stuck somewhere its 15-minute timeout didn't apply. I killed
the process and the web interface started working again. I don't think mail
has started going out yet because the load on the machine is still quite
low, so I'm off on a lockfile hunt :-) There was nothing revealing in the
qrunner log.
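
A rough sketch of what such a lockfile hunt could look like in Python 3.
It rests on an assumption that should be checked against the installed
Mailman: that lock files carry the owning process ID as a trailing
".<pid>" in the file name. The lock directory path is not given here, so
it is passed in as a parameter, and both function names are hypothetical.

```python
import os


def pid_is_running(pid):
    """Best-effort check (POSIX) whether a process with this PID exists."""
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence/permission check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def orphaned_lock_files(lock_dir, is_running=pid_is_running):
    """Return names in lock_dir whose trailing '.<pid>' matches no running process.

    ASSUMPTION: lock file names end in '.<pid>'. Names without a numeric
    suffix are skipped, and the host part of the name is not examined.
    """
    orphans = []
    for name in sorted(os.listdir(lock_dir)):
        suffix = name.rsplit(".", 1)[-1]
        if not suffix.isdecimal():
            continue
        if not is_running(int(suffix)):
            orphans.append(name)
    return orphans
```

Anything this reports is only a candidate for closer inspection. A PID can
be reused by an unrelated process, and a lock taken on another host cannot
be checked this way.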

As you can see I'm running exim. I've read the README.EXIM file but I don't
think it applies since I'm not trying to host lists on multiple domains.
My exim.conf does not contain recipients_max so the default value of zero
is being used.
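
For double-checking that an option really is absent, rather than just
commented out or overlooked, a small sketch in Python 3. It assumes plain
"name = value" lines and ignores sections and line continuations, so it is
no substitute for Exim's own parser; the function name find_main_option is
made up for illustration.

```python
import re


def find_main_option(config_text, option):
    """Return the value from the first 'option = value' line, or None if unset.

    ASSUMPTION: simple 'name = value' syntax, one setting per line.
    Lines starting with '#' are treated as comments and skipped.
    """
    pattern = re.compile(r"^\s*" + re.escape(option) + r"\s*=\s*(.*?)\s*$")
    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


# Example (not run here):
#   with open("/path/to/exim.conf") as f:   # path is a placeholder
#       print(find_main_option(f.read(), "recipients_max"))
```

A result of None means the option was not found under these assumptions,
in which case the built-in default applies.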

TBH I'd like to just go back to Mailman 1 at this point but I'm worried
that the database files will be incompatible.

-- 
Mike Crowe <mac at fysh.org>



