[Mailman-Users] Understanding Archiving

Mark Sapiro mark at msapiro.net
Fri May 21 17:05:46 CEST 2010

Patricia A Moss wrote:

>I am running mailman, version, on a RedHat, version 4.0, 
>I am trying to understand how archiving works and/or is set up and 
>I need to understand the difference between the directory and the .txt 
>file (i.e. Directory: "2009-December" and File: "2009-December.txt") 
>located within my .../archives/private/mailman/ subdirectory.
>My partition, that houses the archives, is running out of space.  I am 
>trying to figure out what, if anything, I can clean up while I wait for 
>approval for my new server.
>Can someone please assist.  I have been searching the threads on the 
>mailing list but can not seem to find the answer I seek.  Thanks, in 

If you look at the overall archive TOC for a list, you will see entries like these:

May 2010:  [ Thread ][ Subject ][ Author ][ Date ] [ Text xx KB ]
April 2010:[ Thread ][ Subject ][ Author ][ Date ] [ Gzip'd Text xx KB ]

In the above, the [ Text xx KB ] link points to the 2010-May.txt file, which
is a mailbox-like file containing that month's messages. It is not the
archives/private/LISTNAME.mbox/LISTNAME.mbox cumulative mailbox, which
contains all list posts and can be used to rebuild everything in
the archives/private/LISTNAME directory.

The [ Gzip'd Text xx KB ] link is to the 2010-April.txt.gz file if
there is one.

The actual pipermail archive with the thread, subject, author and date
indices and all the nnnnnn.html message files, etc. is in the various
yyyy-Month/ directories.
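
To see where the space in one list's archive directory is actually going, a
rough sketch along these lines may help. It assumes monthly archive volumes
named like "2010-May" (as in the examples in this thread); the function names
are only illustrative and are not part of Mailman.

```python
import os
import re

# Monthly pipermail names look like "2010-May". This pattern is an assumption
# based on the names quoted in this thread; other archive volume settings
# would produce different names.
MONTH_RE = re.compile(r"^\d{4}-[A-Z][a-z]+$")

def classify(name):
    """Guess what an entry in archives/private/LISTNAME is from its name."""
    if name.endswith(".txt.gz") and MONTH_RE.match(name[:-7]):
        return "gzip'd text"
    if name.endswith(".txt") and MONTH_RE.match(name[:-4]):
        return "monthly text"
    if MONTH_RE.match(name):
        return "monthly HTML directory"
    if name == "attachments":
        return "attachments"
    return "other"

def tree_size(path):
    """Total bytes of a file, or of all regular files under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for f in files:
            fp = os.path.join(root, f)
            if not os.path.islink(fp):
                total += os.path.getsize(fp)
    return total

def usage_by_kind(archive_dir):
    """Sum disk usage per kind of entry in one list's archive directory."""
    totals = {}
    for name in os.listdir(archive_dir):
        kind = classify(name)
        size = tree_size(os.path.join(archive_dir, name))
        totals[kind] = totals.get(kind, 0) + size
    return totals
```

Calling usage_by_kind() on a list's archives/private/LISTNAME directory
returns a dict of byte totals, which shows whether the gzip'd text files
or the attachments directory are worth cleaning up first.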

If you are short on space, I recommend the following.

Comment out or remove from Mailman's crontab the cron/nightly_gzip
entry that makes the .txt.gz files, and make sure you do NOT have
GZIP_ARCHIVE_TXT_FILES = Yes in mm_cfg.py.

Then you can remove all the .txt.gz files. They just take extra space
because the corresponding .txt files are there anyway. They may save a
little bandwidth, but it's insignificant.
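
As a cautious way to do that removal, here is a minimal sketch that only
touches a .txt.gz file when the matching .txt file is sitting next to it, and
does nothing destructive unless asked. The function names and the dry_run
parameter are my own invention for illustration.

```python
import os

def redundant_gzips(archive_dir):
    """Return paths of NAME.txt.gz files whose NAME.txt sits beside them."""
    found = []
    for name in sorted(os.listdir(archive_dir)):
        if name.endswith(".txt.gz"):
            plain = os.path.join(archive_dir, name[:-3])  # strip ".gz"
            if os.path.isfile(plain):
                found.append(os.path.join(archive_dir, name))
    return found

def remove_redundant_gzips(archive_dir, dry_run=True):
    """Delete redundant .txt.gz files; return bytes freed (or that would be)."""
    freed = 0
    for path in redundant_gzips(archive_dir):
        freed += os.path.getsize(path)
        if not dry_run:
            os.remove(path)
    return freed
```

Run it with dry_run=True first to see how much space would be recovered, and
only then with dry_run=False.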

If the archives/private/LISTNAME/attachments directory is large,
consider rebuilding the entire archive with bin/arch --wipe. It is a
good idea to check the LISTNAME.mbox file with bin/cleanarch first.
Note that a rebuild may renumber archived messages and thus invalidate
any saved archive URLs. This may not happen, but if it is a concern,
back up and test first.

Rebuilding the archive may help because in the case of digestable
lists, each 'attachment' is scrubbed and stored twice, once when the
message is archived and once when the plain digest is produced.

For a temporary situation, you can remove anything other than the
archives/private/LISTNAME.mbox/LISTNAME.mbox file, and reconstruct the
archive later with bin/arch --wipe.
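
For that temporary cleanup, the important thing is never to touch the
cumulative mbox. A small sketch that lists the candidates for removal, and
refuses to list anything if the mbox is missing, could look like this. The
function name is hypothetical, and it assumes the standard layout described
above (archives/private/LISTNAME and archives/private/LISTNAME.mbox).

```python
import os

def removable_paths(private_dir, listname):
    """List entries that could be removed temporarily, keeping the mbox.

    private_dir is the archives/private directory. Only the contents of
    private_dir/LISTNAME are returned; LISTNAME.mbox is never included.
    """
    mbox = os.path.join(private_dir, listname + ".mbox", listname + ".mbox")
    if not os.path.isfile(mbox):
        # Without the cumulative mbox the archive cannot be rebuilt.
        raise RuntimeError("no cumulative mbox at %s" % mbox)
    html_dir = os.path.join(private_dir, listname)
    if not os.path.isdir(html_dir):
        return []
    return [os.path.join(html_dir, n) for n in sorted(os.listdir(html_dir))]
```

This only reports paths; the actual removal, and the later rebuild with
bin/arch --wipe, are left as manual steps.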

Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
