[Mailman-Developers]
Re: [Mailman-Users] speeding up archiver in mailman 2.1b3?
Barry A. Warsaw
barry@python.org
Fri Oct 25 21:15:06 2002
>>>>> "ADC" == Andrew D Clark <andrew.clark@ucsb.edu> writes:
ADC> Since I'm hopelessly backlogged in my archive queue (1644
ADC> files), does anyone have any suggestions for speeding up
ADC> archiving? The qrunner process is certainly eating up CPU
ADC> and memory, but is only archiving about 1 msg per minute.
ADC> All the other queues move at a decent pace.
Here's a thought, if you're interested in hacking some code.
In ArchiveMail(), in Mailman/Archiver/Archiver.py, we create a new
HyperArchive instance each time we want to add a new message to the
archive. That in turn creates a new HyperDatabase instance, which in
turn un-marshals all the state of the archiver.
I wonder if it wouldn't make more sense if the archiver stored the
HyperArchive instance on self and re-used it. That might save a lot
of i/o, although I don't know if it would help much with overall
performance, and I don't know if Pipermail would still operate
correctly. It's worth a shot.
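
To make the caching idea concrete, here is a minimal sketch. It does not use
the real Mailman classes: StubDatabase, StubArchive and CachingArchiver are
hypothetical stand-ins, and the load counter exists only to show how often
the expensive state load would happen. Whether Pipermail tolerates a
long-lived instance is still the open question.

```python
class StubDatabase:
    """Hypothetical stand-in for the archiver's on-disk state."""
    loads = 0  # how many times the state was "un-marshaled"

    def __init__(self):
        # In the real archiver this is where the expensive load would occur.
        StubDatabase.loads += 1
        self.articles = []


class StubArchive:
    """Hypothetical stand-in for the per-list archive object."""

    def __init__(self):
        self.database = StubDatabase()

    def add_article(self, msg):
        self.database.articles.append(msg)


class CachingArchiver:
    """Keeps one archive object on self instead of building one per message."""

    def __init__(self):
        self._archive = None

    def _get_archive(self):
        # Create the archive lazily, then reuse it for later messages.
        if self._archive is None:
            self._archive = StubArchive()
        return self._archive

    def archive_mail(self, msg):
        self._get_archive().add_article(msg)

    def close(self):
        # Drop the cached instance so the state is re-read next time.
        self._archive = None


archiver = CachingArchiver()
for i in range(5):
    archiver.archive_mail("message %d" % i)
print(StubDatabase.loads)  # state was loaded once for five messages
```

The close() method is there because a cached instance can go stale if
something else rewrites the archive state; a real change would need to
decide when to invalidate it.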
Another idea would be to change the scheme the Pipermail archiver uses
from a one-file-per-message scheme to a Unix mailbox scheme. The
basic idea would be for ToArchive.py to append the message to a Unix
mbox, and then have ArchRunner.py slurp a multi-message mbox into the
archive instead of doing one message at a time.
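
A rough sketch of the mbox idea, using only Python's standard mailbox
module rather than any Mailman code. The function names queue_message and
drain_mbox, the file name pending.mbox, and the add_article callback are
all made up for illustration; the real ToArchive.py and ArchRunner.py
would look different.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def queue_message(mbox_path, msg):
    """Append one message to the pending mbox (the ToArchive.py side)."""
    box = mailbox.mbox(mbox_path)
    box.lock()  # guard against concurrent writers
    try:
        box.add(msg)
        box.flush()
    finally:
        box.unlock()
        box.close()


def drain_mbox(mbox_path, add_article):
    """Feed every pending message to add_article in one pass, then empty
    the mbox (the ArchRunner.py side). Returns the number processed."""
    box = mailbox.mbox(mbox_path)
    box.lock()
    try:
        count = 0
        for msg in box:
            add_article(msg)
            count += 1
        box.clear()  # remove the messages we just archived
        box.flush()
    finally:
        box.unlock()
        box.close()
    return count


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pending.mbox")
    for i in range(3):
        m = EmailMessage()
        m["From"] = "sender@example.com"
        m["Subject"] = "test %d" % i
        m.set_content("body %d\n" % i)
        queue_message(path, m)

    subjects = []
    drained = drain_mbox(path, lambda m: subjects.append(m["Subject"]))

    leftover = mailbox.mbox(path)
    remaining = len(leftover)
    leftover.close()
```

The point of the batch is that whatever per-run setup the archiver needs
would be paid once per drain rather than once per message.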
I don't have time to play with these ideas, so I'm cc'ing
mailman-developers, in case anyone wants to do some hacking and
profiling.
-Barry