[Mailman-Users] Archives disk space: are .txt files needed?
Mark Sapiro
msapiro at value.net
Sat Feb 12 02:54:01 CET 2005
Mike Alberghini wrote:
>
>The archive directories contain each month's mail in three formats:
>
>1. a plaintext file: 2004-November.txt
>2. a gzipped file: 2004-November.txt.gz
>3. a directory: 2004-November - contains individual HTML messages.
>
>The web archive uses the files in the directory, and links to the gzipped
>file. Does anything use the plaintext file? It seems like it's wasting a
>ton of disk space having the same file gzipped and unzipped in the same space.
How the .txt file is used depends on the setting of
GZIP_ARCHIVE_TXT_FILES in mm_cfg.py. If it is set to Yes, the .txt
file exists only temporarily: for each message, the archiver unzips
the .txt.gz, appends the new message, and rezips the result into a new
.txt.gz. With this setting there are no permanent .txt files, but the
process is very inefficient (see the comments in Defaults.py).
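To see why the per-message unzip/rezip cycle is costly, here is a minimal sketch in modern Python. It is not Mailman's actual code; the function name append_via_recompress is hypothetical and only illustrates the pattern of rewriting the whole compressed file for every new message.

```python
import gzip
import os
import tempfile


def append_via_recompress(gz_path, message_text):
    """Append text to a gzipped archive by rewriting the whole file.

    Illustrative only: a hypothetical helper, not Mailman's archiver.
    """
    # Step 1: decompress everything accumulated so far.
    existing = ""
    if os.path.exists(gz_path):
        with gzip.open(gz_path, "rt", encoding="utf-8") as f:
            existing = f.read()
    # Step 2: write the old contents plus the new message to a temporary
    # file in the same directory, then swap it into place.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(gz_path) or ".")
    os.close(fd)
    with gzip.open(tmp_path, "wt", encoding="utf-8") as f:
        f.write(existing)
        f.write(message_text)
    os.replace(tmp_path, gz_path)
```

Every call reads and recompresses the full month, so the work per message grows with the size of the archive. Accumulating in a plain .txt and compressing once a night avoids that.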
If GZIP_ARCHIVE_TXT_FILES is No, the archive accumulates in the .txt
file, which is gzipped by a nightly cron job. In this case, the .txt
file for a prior month can be deleted as long as no new messages
arrive for that month. That can't be guaranteed, since a message could
be delayed in transit or carry a bad date. In general, though, old
.txt files can be deleted, and if a "late" message did arrive and
cause loss of the .txt.gz information, the archive could be rebuilt
from the <list>.mbox/<list>.mbox file with bin/arch.
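As a rough sketch of the cleanup described above, the following Python lists .txt files for past months that already have a gzipped copy next to them. The function name stale_txt_files is hypothetical, and the code assumes monthly archive volumes named like 2004-November.txt; it does not call any Mailman API.

```python
import datetime
import os
import re

# Monthly volume names such as "2004-November.txt" (assumed naming).
MONTHLY_TXT = re.compile(r"^(\d{4})-([A-Z][a-z]+)\.txt$")
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def stale_txt_files(archive_dir, today=None):
    """Return .txt files for past months that also have a .txt.gz copy."""
    today = today or datetime.date.today()
    current = (today.year, today.month)
    stale = []
    for name in sorted(os.listdir(archive_dir)):
        match = MONTHLY_TXT.match(name)
        if not match or match.group(2) not in MONTHS:
            continue
        period = (int(match.group(1)), MONTHS.index(match.group(2)) + 1)
        path = os.path.join(archive_dir, name)
        # Report a file only if its month is over and a gzipped copy exists.
        if period < current and os.path.exists(path + ".gz"):
            stale.append(path)
    return stale
```

Review the returned list before passing each path to os.remove(). The current month's file is never listed, since the archiver is still appending to it.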
>So, first off, can I delete the year-month.txt files without causing harm?
Generally yes, once the month is over.
>Second, once the current month is over, can I prevent the non-zipped files
>from ever existing?
You can set
GZIP_ARCHIVE_TXT_FILES = Yes
in mm_cfg.py if you're willing to live with the additional processing
to unzip/rezip the .txt.gz file for each message.
>Finally, is there a way to prevent the archiving of
>attachments?
If you don't want to use content filtering to keep them off the list
entirely, then I think it would require a somewhat tricky hack. You
could modify the code in Mailman/Handlers/Scrubber.py, but this would
also affect digests - that's where it gets tricky.
--
Mark Sapiro <msapiro at value.net>       The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan