Graham TerMarsch mailman at howlingfrog.com
Fri Apr 27 22:09:33 CEST 2001

I'm running Mailman-2.0.1 with Python 1.5.2 on a RedHat 6.2 machine, along
w/Apache-1.3.14 and Sendmail-8.11.0, and am having some serious grief with
stale lockfiles on one of our lists.  The list contains ~60k addresses and
has constant traffic to the WWW administration pages.  It's not high volume
for sending msgs, though (only one or two a day), as it's a
broadcast/announce list.

I'm finding, though, that the WWW processes are regularly creating stale 
locks that sit around and hold everything up.  For fun, I tried using "ab" 
(ApacheBench) to fire up five concurrent "subscribe" requests to our box, 
and saw that it regularly ended up creating stale locks and blocking out 
the rest of the system.  Worse yet, I'm not seeing any useful information 
in the "logs/errors" file, nor am I getting anything useful in 
"logs/locks" (other than seeing that some process got a lock, did its 
thing, and then shut itself down _WITHOUT_ releasing the lock).
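
For anyone wanting to watch for the same symptom, here is a minimal sketch
of a script that lists lock files that have been sitting around longer than
some threshold.  The function name, the directory argument and the threshold
are all hypothetical, and it assumes that a file's modification time is a
usable signal of its age, which may not hold for Mailman's own lock
implementation.

```python
import os
import time


def find_stale_locks(lock_dir, max_age_seconds, now=None):
    """Return (path, age_in_seconds) for files older than max_age_seconds.

    Hypothetical helper: "age" here is simply now minus the file's mtime.
    """
    if now is None:
        now = time.time()
    stale = []
    for name in sorted(os.listdir(lock_dir)):
        path = os.path.join(lock_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and anything that is not a file
        age = now - os.path.getmtime(path)
        if age > max_age_seconds:
            stale.append((path, age))
    return stale
```

Run from cron against the locks directory, something like this would at
least show when a lock went stale, even if it can't say why.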

The notes in "Defaults.py" suggest that the locking timeouts are probably
one of the most important things that can be "tuned".  I'm not, however,
having any luck in finding information that outlines how these values
should be tuned for BIG lists.
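
To make clear which two knobs I mean, here is a toy file lock with a
"lifetime" (how old a lock may get before others treat it as stale and break
it) and a "timeout" (how long a caller waits before giving up).  This is only
a sketch to illustrate the trade-off, not Mailman's locking code; the class
name, parameters and the mtime-based staleness check are assumptions made up
for the example.

```python
import os
import time


class ToyLock:
    """Illustrative lock file with a lifetime; not race-free when breaking."""

    def __init__(self, path, lifetime):
        self.path = path
        self.lifetime = lifetime  # seconds before a lock counts as stale

    def _try_create(self):
        # O_EXCL makes creation fail if the lock file already exists.
        try:
            fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        os.close(fd)
        return True

    def _is_stale(self):
        try:
            age = time.time() - os.path.getmtime(self.path)
        except FileNotFoundError:
            return False  # holder released it in the meantime
        return age > self.lifetime

    def acquire(self, timeout, poll=0.05):
        """Try to get the lock, waiting up to `timeout` seconds."""
        deadline = time.monotonic() + timeout
        while True:
            if self._try_create():
                return True
            if self._is_stale():
                # Break the stale lock and retry at once.
                try:
                    os.unlink(self.path)
                except FileNotFoundError:
                    pass
                continue
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)

    def release(self):
        os.unlink(self.path)
```

With a long lifetime and a short timeout, one process dying without
releasing its lock blocks everyone else until the lifetime runs out, which
looks a lot like what I'm seeing.  With a short lifetime, a slow but healthy
process can have its lock broken from under it.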

I'm presuming that what I'm running into is more of a config/tuning issue 
than a serious bug, as I'm sure I can't be the only person running lists 
this large.

Any and all information, tips, pointers, or suggestions are welcome.

Graham TerMarsch

