[Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)

Scott Lambert lambert at lambertfam.org
Sat Nov 1 00:21:30 CET 2003

On Fri, Oct 31, 2003 at 03:52:34PM -0500, Scott Lambert wrote:
> Once I kill off the mailman queue runners and clean up the several lock
> files for this mailing list, it runs just fine and manages to empty the
> archive queue.

Well, the above statement is not entirely accurate.  It was working
quickly immediately after restart but went downhill.  I logged out and
took care of other things after seeing it move a good number of messages
in a short amount of time.  Five hours later, it still had 377 messages
in the archive queue and was taking several minutes per message.  I
trussed it again and saw more of the incredibly long series of breaks,
but watched it longer than I did this morning.  After a lot of breaks it
goes to a lot of writes, then does some file stuff quickly and repeats
for the next message.

I restarted the queue runners again and it processed forty or so
messages quickly, then began the downward spiral again.  By the time it
had reduced the queue to 177 entries, it was back to 3 minutes per
message and climbing.  Restarting knocked the queue down pretty quickly
for a while, then it started taking longer again.  I was watching more
closely this time.  After a couple more restart cycles, the queue was
cleaned out quickly and all is well.
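
For anyone wanting to put numbers on this kind of slowdown, here is a
minimal sketch that turns periodic queue-length readings into seconds
per message.  It is not part of Mailman; the function names are
hypothetical, and it assumes one file per queued message in the archive
queue directory (qfiles/archive under the Mailman prefix is an
assumption about the install).

```python
import os


def count_queue_entries(queue_dir):
    """Count regular files in queue_dir.

    Assumes one file per queued message, which may not hold for every
    Mailman version or configuration.
    """
    return sum(1 for entry in os.scandir(queue_dir) if entry.is_file())


def seconds_per_message(samples):
    """Estimate per-message processing time between consecutive samples.

    samples is a list of (timestamp_in_seconds, queue_length) tuples,
    oldest first.  Returns one value per interval: elapsed seconds
    divided by the number of messages drained, or None when the queue
    did not shrink (for example because new mail arrived).
    """
    rates = []
    for (t0, n0), (t1, n1) in zip(samples, samples[1:]):
        drained = n0 - n1
        rates.append((t1 - t0) / drained if drained > 0 else None)
    return rates
```

Sampling count_queue_entries() every few minutes and feeding the
readings to seconds_per_message() would show whether the per-message
time really climbs between restarts.  New arrivals make the estimate
pessimistic, so it is only a rough indicator.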

I haven't looked at the code yet, and probably won't (ENOTIME), but it
almost sounds to me like it's not pruning its list of handled messages
and has to walk all of them each time.  I would have expected queue
handling to get faster as the queue got smaller due to fewer files
in the directory that it needs to search through.  Maybe it's just a
function of the Python data structure being used.
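
To illustrate the kind of behaviour guessed at above, here is a small
sketch.  It is purely hypothetical and is not taken from the Mailman
source: it only shows how a never-pruned Python list of handled message
ids makes each new message more expensive to check, while a set does
not.  All names are made up for the illustration.

```python
import time


def drain_with_list(n):
    """Handle n messages, remembering handled ids in an unpruned list."""
    handled = []
    for msg_id in range(n):
        # "in" on a list walks every stored id, so each check costs
        # more as the list grows.
        if msg_id not in handled:
            handled.append(msg_id)
    return len(handled)


def drain_with_set(n):
    """Same bookkeeping, but membership checks use a hash-based set."""
    handled = set()
    for msg_id in range(n):
        # "in" on a set is a hash lookup, roughly constant time.
        if msg_id not in handled:
            handled.add(msg_id)
    return len(handled)


def timed(func, n):
    """Return the wall-clock seconds func(n) takes."""
    start = time.perf_counter()
    func(n)
    return time.perf_counter() - start


if __name__ == "__main__":
    for n in (1000, 2000, 4000):
        print(n, timed(drain_with_list, n), timed(drain_with_set, n))
```

In the list version the total work grows roughly with the square of the
number of messages handled, and a restart empties the list, which would
match the "fast after restart, then slower and slower" pattern.  Whether
the archiver actually does anything like this is an open question.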

The fact that it is fast right after a restart makes me doubt that the
size of the archive is the issue.

The server we are using is a dual PIII450 machine.  I would guess this
would not show as such a big problem on a more modern system, but other
than the archiver, this box is more than enough for the load on it.

The dual processor aspect of this box is what allows us to miss the
archiver running off the deep end until someone complains that the
archive search feature is broken.  The mail passes through the system
just fine using the other processor. 

 38M    2003-October.txt
 13M    2003-October.txt.gz
 48M    portsidelist.mbox

Scott Lambert                    KC5MLE                       Unix SysAdmin
lambert at lambertfam.org      
