[Mailman-Users] Suggestions for handling archive growth

Mark Sapiro mark at msapiro.net
Fri Apr 15 00:18:27 EDT 2016

On 04/14/2016 10:36 AM, Gretchen R Beck wrote:
> As our archives approach a terabyte in size, I was wondering if anyone had suggestions or tips for handling archive growth and storage. I've got some ideas, but am wondering what others might be doing.  Just as background, we have a few thousand lists, and support a mid-sized university population, with list creation open to faculty, staff, and students.

It won't help a lot, but remove all the periodic .txt.gz files and
remove the cron/nightly_gzip job from Mailman's crontab. While the
.txt.gz files conceivably save bandwidth when the files are downloaded,
they serve no other useful purpose. The .txt files they come from are
all still there.
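The cleanup above can be sketched as a small script. The archive root below is an assumption — the usual prefix is /usr/local/mailman or /var/lib/mailman depending on how Mailman was installed — and the dry-run default is there so you can see what would go before anything is deleted:

```python
#!/usr/bin/env python
"""Remove the periodic .txt.gz files from the pipermail archives.

ARCHIVE_ROOT is an assumption; adjust it to your installation's prefix.
Remember to also remove the cron/nightly_gzip job so the files are not
recreated.
"""
import os

ARCHIVE_ROOT = "/usr/local/mailman/archives/private"  # assumed prefix

def remove_txt_gz(root, dry_run=True):
    """Walk the private archives and delete every *.txt.gz file.

    With dry_run=True only report what would be deleted.
    Returns the list of matching paths either way.
    """
    matched = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".txt.gz"):
                path = os.path.join(dirpath, name)
                matched.append(path)
                if dry_run:
                    print("would remove:", path)
                else:
                    os.remove(path)
    return matched

if __name__ == "__main__":
    remove_txt_gz(ARCHIVE_ROOT, dry_run=True)
```

The plain .txt files are left alone, so pipermail's "Downloadable version" links keep working; only the pre-compressed copies go away.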

If you want to 'prune' older messages from the archives, there is a
script at <https://www.msapiro.net/scripts/prune_arch> (mirrored at
https://fog.ccsf.edu/~msapiro/scripts/prune_arch) that can help with that.

Depending on list configuration, but with normal defaults, there will be
two copies of each scrubbed attachment in the
archives/private/LISTNAME/attachments/ directory. This is because when
scrub_nondigest is No and the list is digestable, the non-plain-text
attachments are scrubbed both from the archive and from the plain text
digest. After a while, the copies whose links were in the plain text
digest are probably no longer needed: few if any copies of the original
digests still exist, and each attachment can still be reached via the
archive link.

The trick here is to identify which attachments were scrubbed from a
digest and can therefore be removed.

On the other hand, these days you can buy a couple of terabytes' worth
of HDD for $100 US, so maybe that's an easier way to go.

Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
