[ mailman-Bugs-835332 ] Stops bloat in pipermail article databases

Bugs item #835332, was opened at 2003-11-03 17:04
Message generated for change (Comment added) made by bwarsaw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=100103&aid=835332&group_id=103

Category: Pipermail
Group: 2.1 (stable)
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Richard Barrett (ppsys)
Assigned to: Nobody/Anonymous (nobody)
Summary: Stops bloat in pipermail article databases
Initial Comment:
The standard pipermail archiving code saves the body text of every article, in HTML format, in the -article database of each archived list. This bloats these databases. Because they are pickled data structures that are loaded into memory in their entirety whenever archiving operations for a list are handled, the bloat can substantially degrade archiver performance and, in the limit, for lists carrying heavy traffic or receiving large text postings, bring archiving to a grinding halt.

This patch changes HyperArch.py and pipermail.py so that the data stored in the pipermail $archives/private/<listname>/database/<period>-article database no longer includes the HTML body text of each article. This reduces the size of the -article database for each list. The benefits are most pronounced for high-traffic lists and for lists that receive large text postings.

The patch also adds a script, $prefix/bin/rb-arch, which removes any HTML body text from existing -article databases. With the patch applied, this junk HTML is no longer stored when new articles are added to the databases, but HTML already stored there is not deleted unless this script is run. The alternative is to run $prefix/bin/arch for a list.

Apply the patch from within the Mailman build directory using the command:

    patch -p1 < path-to-patch-file

You use this patch at your own risk. I would appreciate feedback on whether it works for you, and on any problems you encounter with it.

----------------------------------------------------------------------
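As an illustration of the effect described in the initial comment, here is a minimal Python sketch of how dropping a stored HTML body shrinks a pickled article database. It does not reproduce the patch or the real pipermail data structures: the dict layout, the "html_body" key, and the function names build_demo_db and strip_bodies are assumptions made for this example only.

```python
import pickle


def build_demo_db(count=100, body_size=10_000):
    """Build a pickled stand-in for an -article database.

    Each article is a plain dict; "html_body" is a hypothetical key
    holding the rendered HTML text that causes the bloat.
    """
    articles = {}
    for i in range(count):
        msgid = f"<{i}@example.invalid>"
        articles[msgid] = {
            "msgid": msgid,
            "subject": f"Subject {i}",
            "html_body": "x" * body_size,
        }
    return pickle.dumps(articles)


def strip_bodies(pickled_db):
    """Return a re-pickled copy of the database without any HTML bodies."""
    # The whole structure is unpickled into memory in one go, which is
    # why oversized records hurt every archiving operation.
    articles = pickle.loads(pickled_db)
    for article in articles.values():
        article.pop("html_body", None)
    return pickle.dumps(articles)


if __name__ == "__main__":
    before = build_demo_db()
    after = strip_bodies(before)
    print("bytes before:", len(before))
    print("bytes after: ", len(after))
```

Running the script prints the pickled size before and after stripping. Since the whole pickle is read into memory at once, the remaining per-article metadata is what bounds the load cost once the bodies are gone.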
Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-12-24 11:59

Message:
Logged In: YES
user_id=12800

Accepted for MM2.1.4.

----------------------------------------------------------------------