[Mailman-Users] archive/attachment preening

Savoy, Jim savoy at uleth.ca
Fri May 23 00:05:05 CEST 2008


 
Hi all,
 
  We have 870+ mailing lists and about 20 gigs worth of stuff in the
archives/private directory. I thought I would do a little spring cleaning,
such as sending mail to list owners to see if they still want their lists.
But I would also like to use cron jobs to pare down the size of the
archives directory, which has been allowed to grow wild. Before I do that,
I want assurance from this list that I fully understand how the archives
are written.
 
We are currently running Mailman v2.1.5 and we'll upgrade to v2.1.10 later
this summer.
 
Please confirm the assumptions I am about to make before I write the
preening scripts for cron. I also want to write a bunch of scripts that
look for certain information.
 
1) If a list is digestable and archiving was never turned on, then
   archives/private/listname/index.html will be just the originally-created
   file (which basically says "No messages have been posted to this list
   yet, so the archives are currently empty").
 
2) If there is no mbox file in the /archives/private/listname.mbox
   directory, then the list has never had archiving turned on.
 
3) If a list is digestable but there is no attachments directory in
   /archives/private/listname, then the list has never had a message
   posted to it.
 
4) If the list is digestable and archiving has never been turned on, then
   files in the archives/private/listname/attachments directory are only
   useful to already-existing subscribers who have digesting turned on
   (i.e. if I poll a list and it has no members subscribed as digest users,
   then it is safe to delete all files in the attachments tree).
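 
To gather the on-disk facts that assumptions 2 and 3 rely on, a small
Python helper along these lines might do. It is only a sketch: the function
name inspect_list and the private_root argument are made up for
illustration, and it assumes the usual layout of listname.mbox/listname.mbox
for the mbox file. It reports what exists on disk; it does not prove that
the assumptions themselves are right.

```python
import os


def inspect_list(private_root, listname):
    """Report which archive pieces exist on disk for one list.

    private_root is the archives/private directory (hypothetical argument).
    """
    # Assumed layout: archives/private/listname.mbox/listname.mbox
    mbox = os.path.join(private_root, listname + ".mbox", listname + ".mbox")
    # Assumed layout: archives/private/listname/attachments
    attachments = os.path.join(private_root, listname, "attachments")
    return {
        "has_mbox": os.path.isfile(mbox),
        "has_attachments": os.path.isdir(attachments),
    }
```

Running inspect_list over every directory name under archives/private
would give a quick table of which lists have an mbox file and which have an
attachments directory.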
 
If all of my above assumptions are correct, my pseudo-code would do
something like this:
 
  if (list is not archived and has no digest members)
    keep stuff in attachments dir for 1 month;
 
  if (list is not archived but does have digest members)
    keep stuff in attachments dir for 1 year;
 
  if (list is archived)
    keep stuff in attachments dir for 3 years;
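 
The pseudo-code above could be sketched in Python roughly as follows. This
is only a sketch: the names retention_days and stale_files are invented for
illustration, the "1 month / 1 year / 3 years" limits are approximated as
30, 365 and 3 * 365 days, and a file's age is taken from its modification
time, which may or may not be the right measure for attachments.

```python
import os
import time

DAY = 24 * 60 * 60  # seconds in one day


def retention_days(archived, has_digest_members):
    """Return how many days attachments are kept, per the three rules."""
    if archived:
        return 3 * 365
    if has_digest_members:
        return 365
    return 30


def stale_files(attachments_dir, max_age_days, now=None):
    """Return a sorted list of files older than max_age_days (by mtime)."""
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * DAY
    stale = []
    for dirpath, _dirnames, filenames in os.walk(attachments_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                stale.append(path)
    return sorted(stale)
```

The sketch only lists candidates and deletes nothing, so a cron job could
start as a dry run that prints the result of stale_files for each list.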
 
For the archived lists (we have about 150 of them) I will contact the
owners first, to warn them that I plan to pare their archive down to 3
years max. If they protest, I will add them as an exception to the rule and
skip over them during the cron job run. I know that there is more to be
done with regard to reducing the size of archives (i.e. running arch --wipe
on the edited, pared-down .mbox file), but I will do that manually. For now
I am mostly interested in keeping the stuff in the attachments directories
to a minimum. I realize that deleting stuff in /attachments breaks links in
the archive and digest messages, but I think that is reasonable for the
really old messages (provided the list owner concurs).
 
One final question. I know that you can change a list's settings with
/bin/config_list, but can you poll a list for settings? For example, you
can use "/bin/list_members -d" to see which members of a list read in
digest mode, but how can I find out which lists have archiving turned on?
Or do I have to examine the archives/private tree to garner that kind of
info?
Thanks!
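 
One possible approach to the polling question, sketched under assumptions:
config_list has a -o option that prints a list's settings as Python-style
assignments, and "-o -" sends them to standard output. The sketch assumes
the archiving flag appears as a line such as "archive = 1" or
"archive = True", and the CONFIG_LIST path is a guess; both should be
checked against the installed version. The helper names are invented. It
is written as a standalone Python 3 script that only calls the command.

```python
import re
import subprocess

# Assumed install location; adjust to wherever bin/config_list lives.
CONFIG_LIST = "/usr/local/mailman/bin/config_list"


def parse_setting(config_text, name):
    """Return the raw value of 'name = value' from config_list output."""
    pattern = re.compile(r"^%s\s*=\s*(.+?)\s*$" % re.escape(name), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


def is_truthy(value):
    """Treat '1' and 'True' as on; anything else (or missing) as off."""
    return value in ("1", "True")


def list_is_archived(listname):
    """Dump one list's settings and report its 'archive' flag."""
    result = subprocess.run(
        [CONFIG_LIST, "-o", "-", listname],
        check=True, capture_output=True, text=True,
    )
    return is_truthy(parse_setting(result.stdout, "archive"))
```

Looping list_is_archived over the names printed by bin/list_lists would
then give the set of lists with archiving turned on.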
 
 - jim -
 
 
 

