[Mailman-Users] Problems regenerating archives from mbox files

Hank van Cleef vancleef at lostwells.net
Thu Mar 4 18:29:45 CET 2010

I'm trying to rebuild the Pipermail archives for a list.  The archive
base runs from 1998 to present, about 25,000 mails/year.  The mbox
files are segmented by year.  What documentation there is for the
bin/arch utility appears to be in its --help printout and in the
Python code.  

Thus far, it appears that we've gotten a complete build, but have
problems with year 2002.  About 250 posts have their headers archived
properly, but without the message text.  That text is archived instead
as a separate "no subject" post dated the day arch was run.  I've done
a quick check, just
to identify the problem period, and to see what's in the mbox files
for the affected posts.  The posts themselves look correct, and I 
haven't spotted (yet) something in common such as one MUA being used
for the original posts.  Anyway, research proceeds.  I'm using the elm
mailreader as a check and diagnostic program on the assumption that if
elm will read the mail, arch should be able to.  Also using vi to
examine the files.
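
One thing worth checking, as an assumption rather than a confirmed cause
for this mbox: a body line that begins with "From " and was stored
unescaped can be taken as the start of a new message, which would split
a post into headers and an orphaned body.  The sketch below flags lines
that start with "From " but do not look like an mbox separator.  The
separator pattern and the function name are illustrative guesses, and a
body line that happens to look exactly like a separator would not be
caught.

```python
import re

# Assumed shape of a separator line, e.g.
#   From someone@example.com Thu Mar  4 18:29:45 2010
# Real separators vary, so treat this pattern as a rough filter only.
SEPARATOR = re.compile(
    r"^From \S+ +(Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +\d{1,2} "
    r"\d{2}:\d{2}(:\d{2})? (\S+ )?\d{4}\s*$"
)


def suspicious_from_lines(lines):
    """Return (line_number, text) for every line that starts with
    'From ' but does not match the assumed separator pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("From ") and not SEPARATOR.match(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

Passing it an open file for the 2002 mbox (opened with errors="replace"
to survive odd encodings) would give a list of line numbers to inspect
in vi.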

Question number one, of course, is why arch is behaving this way.
It is affecting the build from only one year's mbox file.  That mbox (year
2002) was generated under Mailman, and I think it was an early 2.x rev
that was replaced over New Year's 2003.  

Question: Does the --wipe option delete all the archives, or does it
see what period the selected mbox covers?  I've assumed it's all
archives, so doing an incremental rebuild would require storing
anything previously built elsewhere.  Is that right?

Question: Is there a way to do an incremental rebuild and have it
replace already-built archives for a given time period?  Or is that a
manual rm job?

Other things being equal, my plan of attack right now is to set up 
an empty archive directory, do incremental builds in it, then move
each increment to a backup directory, clean out the build directory,
and do the next increment.  When all increments are built clean, I
can then move the backup to production.  If there's a better way,
I'm all ears.
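
The staging plan above can be sketched roughly as follows.  This is only
an outline under stated assumptions: build_one is a hypothetical hook
standing in for however arch gets run against one mbox, and the
directory layout (one backup subdirectory per mbox, named after the
file) is my own choice for illustration.

```python
import shutil
from pathlib import Path


def build_in_increments(mboxes, build_dir, backup_dir, build_one):
    """Build each mbox in a clean build directory, then move the
    result to backup_dir/<mbox name without extension>."""
    build_dir, backup_dir = Path(build_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    finished = []
    for mbox in mboxes:
        target = backup_dir / Path(mbox).stem
        if target.exists():
            # Refuse to mix a new build into an older increment.
            raise FileExistsError(str(target))
        # Start every increment from an empty build directory.
        if build_dir.exists():
            shutil.rmtree(build_dir)
        build_dir.mkdir(parents=True)
        build_one(mbox, build_dir)  # hypothetical: wraps the arch run
        # Moving the whole directory also clears the build area.
        shutil.move(str(build_dir), str(target))
        finished.append(target)
    return finished
```

If any increment fails, the earlier ones are already sitting in the
backup directory, so only the failed year needs to be redone.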

