[Mailman-Developers] (2.0.6) pipermail takes >1 minute to rebuild indexes on large lists
Ben Gertzfield
che@debian.org
Wed, 10 Oct 2001 14:44:08 +0900
>>>>> "BAW" == Barry A Warsaw <barry@zope.com> writes:
>>>>> "BG" == Ben Gertzfield <che@debian.org> writes:
BG> The problem was when the mbox got up to about 200-300 megs; I
BG> can send you the traces of the function calls with timestamps,
BG> and you can see exactly how slow things get.
BAW> My biggest lists are python-list at ~280MB followed by the
BAW> zope mailing list which is at about 150MB, and I've got a
BAW> dozen in the 10-100MB range.
BAW> You're sure you're not gzipping on the fly, right?
Absolutely.
[ben@yuubin:/usr/lib/mailman/Mailman]% grep -i gzip Defaults.py
# Set this to 1 to enable gzipping of the downloadable archive .txt file.
# night to generate the txt.gz file. See cron/nightly_gzip for details.
GZIP_ARCHIVE_TXT_FILES = 0
[ben@yuubin:/usr/lib/mailman/Mailman]% grep -i gzip mm_cfg.py
BAW> It would be interesting to see some profiler output.
Here's an example. There are megs and megs more where this came from.
Sep 13 19:38:02 2001 (29454) pipelining: ToArchive
Sep 13 19:38:02 2001 (29454) forking...
Sep 13 19:38:02 2001 (29454) forked, pid 29454. calling handler func ToArchive...
Sep 13 19:38:04 2001 (29458) in Message.enqueue() now
Sep 13 19:38:04 2001 (29458) opening file: 733417dfede9cc5f09bf35f40d6c3d279830f653
Sep 13 19:38:04 2001 (29458) opening db /var/lib/mailman/qfiles/733417dfede9cc5f09bf35f40d6c3d279830f653.db
Sep 13 19:38:04 2001 (29458) exception in msg
Sep 13 19:38:04 2001 (29458) msgdata.update newdata
Sep 13 19:38:04 2001 (29458) msgdata.update kws
Sep 13 19:38:04 2001 (29458) writing data file
Sep 13 19:38:04 2001 (29458) done writing data file
Sep 13 19:38:04 2001 (29458) writing dirty/new msg to disk
Sep 13 19:38:04 2001 (29458) done writing dirty/new msg to disk
Sep 13 19:38:06 2001 (29462) in Message.enqueue() now
Sep 13 19:38:06 2001 (29462) opening file: 4a2589b46405fdf1691bb83cba6d638e718b932a
Sep 13 19:38:06 2001 (29462) opening db /var/lib/mailman/qfiles/4a2589b46405fdf1691bb83cba6d638e718b932a.db
Sep 13 19:38:06 2001 (29462) exception in msg
Sep 13 19:38:06 2001 (29462) msgdata.update newdata
Sep 13 19:38:06 2001 (29462) msgdata.update kws
Sep 13 19:38:06 2001 (29462) writing data file
Sep 13 19:38:06 2001 (29462) done writing data file
Sep 13 19:38:06 2001 (29462) writing dirty/new msg to disk
Sep 13 19:38:06 2001 (29462) done writing dirty/new msg to disk
Sep 13 19:38:59 2001 (29454) done with handler func ToArchive.
I can explain in more detail, but the trace makes it pretty clear:
ToArchive (pid 29454) starts at 19:38:02 and doesn't finish until
19:38:59, so it thrashes for almost a minute with a big mbox file.
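
For anyone wading through a longer trace, the handler timings can be pulled out mechanically. The following is only a minimal sketch, assuming every line has the "Mon DD HH:MM:SS YYYY (pid) message" shape shown above and that the "calling handler func" / "done with handler func" messages appear exactly as in this trace. The function name handler_durations and the SAMPLE list are hypothetical, not part of Mailman, and parsing the month name with %b assumes an English/C locale.

```python
import re
from datetime import datetime

# Matches the trace lines pasted above: timestamp, (pid), free-form message.
LINE_RE = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4}) \((\d+)\) (.*)$")
START_RE = re.compile(r"calling handler func (\w+)")
DONE_RE = re.compile(r"done with handler func (\w+)")


def handler_durations(lines):
    """Return (pid, handler, seconds) for each start/done pair found."""
    started = {}   # (pid, handler) -> datetime when the handler was called
    results = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a trace line
        stamp = datetime.strptime(match.group(1), "%b %d %H:%M:%S %Y")
        pid, text = match.group(2), match.group(3)
        start = START_RE.search(text)
        if start:
            started[(pid, start.group(1))] = stamp
            continue
        done = DONE_RE.search(text)
        if done:
            key = (pid, done.group(1))
            if key in started:
                elapsed = (stamp - started.pop(key)).total_seconds()
                results.append((pid, done.group(1), elapsed))
    return results


# Three lines copied from the trace above.
SAMPLE = [
    "Sep 13 19:38:02 2001 (29454) forked, pid 29454. calling handler func ToArchive...",
    "Sep 13 19:38:04 2001 (29458) in Message.enqueue() now",
    "Sep 13 19:38:59 2001 (29454) done with handler func ToArchive.",
]

for pid, handler, seconds in handler_durations(SAMPLE):
    print(f"pid {pid} {handler}: {seconds:.0f}s")  # pid 29454 ToArchive: 57s
```

Run against the excerpt above, this reports 57 seconds for ToArchive, while two other messages were enqueued in the meantime.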
BAW> I feel it'll tie us too closely to some other project, with
BAW> its own agenda, schedule, compatibility issues, tool chain,
BAW> etc. etc. I'm under no illusions about making Pipermail a
BAW> killer archiver, but I also don't think that most sites need
BAW> much more. I'd rather give folks a moderately useful,
BAW> bundled archiver and tell them where to go if they're running
BAW> a high traffic site.
If we go this route, we must do a big overhaul on pipermail. It
tries to do way too much as it is, and fails spectacularly on
systems other than mine when the mbox file gets too big.
Ben
--
Brought to you by the letters Y and P and the number 12.
"Porcoga daisuki!"
Debian GNU/Linux maintainer of Gimp and GTK+ -- http://www.debian.org/