[Mailman-Users] archiving partial duplicates
dragon at crimson-dragon.com
Fri Apr 7 17:41:26 CEST 2006
Patrick Bogen sent the message below at 08:18 4/7/2006:
>On 4/6/06, Dragon <dragon at crimson-dragon.com> wrote:
> > So if you decided 500 was a good number to index, for the first chunk
> > you would do:
> > bin/arch --wipe --start=1 --end=500 listname
> > For subsequent chunks you would do (adjusting the start and end
> > indexes of course...):
> > bin/arch --start=501 --end=1000 listname
>Bash can do this for you:
>for i in `seq 0 10`; do
> bin/arch --start=$(( 1+$i*500 )) --end=$(( 500+$i*500 )) listname
>done
>...Assuming that bin/arch --wipe alone does what I think it does.
>Also, if you have more than 5500 messages in the mbox file, you'll
>need to adjust the second argument in the 'seq' upwards. And, of
>course, replace 'listname' with the list name.
---------------- End original message. ---------------------
Or Perl or Python or whatever your favorite scripting language might be...
But that's just icing on the cake and not really a necessity.
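For anyone who does want to script it, here is a minimal sketch of the chunked run quoted above, with --wipe applied only to the first chunk. The function name arch_chunks and its three arguments are made up for illustration. It only prints the commands, so nothing is touched until you pipe the output to sh. The numbering starting at 1 is taken from the commands quoted above; check bin/arch --help for how your version counts articles.

```shell
# Sketch: emit one bin/arch command per chunk of messages.
# Usage: arch_chunks TOTAL CHUNK LISTNAME   (hypothetical helper)
#   TOTAL    - number of messages in the mbox (you supply this)
#   CHUNK    - messages to index per run, e.g. 500
#   LISTNAME - the list name
arch_chunks() {
    total=$1
    chunk=$2
    list=$3
    start=1
    wipe="--wipe "          # only the first chunk wipes the old archive
    while [ "$start" -le "$total" ]; do
        end=$(( start + chunk - 1 ))
        # Clamp the last chunk to the real message count.
        [ "$end" -gt "$total" ] && end=$total
        echo "bin/arch ${wipe}--start=$start --end=$end $list"
        wipe=""
        start=$(( end + 1 ))
    done
}
```

Running "arch_chunks 1200 500 listname" prints three commands; once they look right, "arch_chunks 1200 500 listname | sh" from the Mailman directory would run them in order.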
The --wipe argument deletes all the old files and builds new ones.
Which is great, but it did have one unfortunate consequence for me. I
am using htdig to allow search of my archives and when I rebuilt the
archives after editing the templates and installing htdig, all of the
file dates on the messages were set to the date I rebuilt the
archive. This destroyed the date context for the archive and the file
dates displayed by the htdig search were not reflecting when the
message was originally posted.
So, I wrote a small Perl script to "fix" the file dates to match the
message date in each of the message files. If anyone is interested in
that script, I would be happy to share it with you. Just e-mail me
directly and I will send it to you.
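In the meantime, the idea can be sketched in a few lines of shell. This is not the Perl script mentioned above, just a rough illustration under some assumptions: GNU touch (for the -d option), message files named with leading digits, and the posting date sitting on its own line inside <I>...</I> as in the stock article template. The function name fix_dates is made up, and if your templates differ the sed pattern will need adjusting.

```shell
# Sketch only: reset each archived message's mtime to its posting date.
# Usage: fix_dates DIR   (hypothetical helper; DIR holds the message files)
fix_dates() {
    dir=$1
    # Message files start with digits; index pages such as thread.html
    # or date.html are skipped by the glob.
    for f in "$dir"/[0-9]*.html; do
        [ -f "$f" ] || continue
        # Assumption: the first line of the form <I>date</I> is the post date.
        stamp=$(sed -n 's:^ *<[Ii]>\(.*\)</[Ii]> *$:\1:p' "$f" | head -n 1)
        [ -n "$stamp" ] || continue
        # GNU touch parses free-form dates; report files it cannot handle.
        touch -d "$stamp" "$f" || echo "could not parse date in $f" >&2
    done
}
```

You would call it once per archive period directory, for example "fix_dates archives/private/listname/2006-April", and try it on a copy first.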
Venimus, Saltavimus, Bibimus (et naribus canium capti sumus)