[Mailman-Users] mbox files

Paul Tomblin ptomblin at xcski.com
Mon Dec 4 02:10:11 CET 2006

Quoting Mark Sapiro (msapiro at value.net):
> Paul Tomblin wrote:
> >slowed my computer down to a crawl.  I gave up and used the mbox splitter
> >awk program I found in the list archives and I'm now building the archives
> >500 messages at a time.  Hope that works.
> It should.
> Also, you can effectively do the same thing without breaking up the
> mbox by using the --start= and --end= options on bin/arch. See
>  bin/arch --help

Is there any way to make arch smarter about "^From " lines?  On my first
pass through the archive, I ended up with a bazillion messages in the
archive for today, all with "No subject", because it was treating any line
matching "^From " as the start of a message.  It would be nice if it
recognized the difference between a real mbox start-of-message "^From "
line and just a random line from some list member.

I suspect that this probably isn't generically possible.  I made use of
the fact that real mbox start-of-message lines have the listname in the
"^From " line, and made my mbox splitter awk script put a ">" in front of
any "^From " line that didn't have the listname in it.
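
For anyone who wants to try the same trick, here is a rough sketch of that
filtering step in Python rather than awk.  It is not the script from the
list archives; the function name and the sample listname "mylist" are made
up for illustration, and it assumes (as above) that every genuine
start-of-message line contains the listname.

```python
def escape_stray_from_lines(lines, listname):
    """Yield each line, prefixing '>' to 'From ' lines that lack listname.

    Assumption: a genuine mbox separator line always contains the
    listname, so any other line starting with 'From ' is body text.
    """
    for line in lines:
        if line.startswith("From ") and listname not in line:
            # Looks like a separator but isn't one: quote it mbox-style.
            yield ">" + line
        else:
            yield line


if __name__ == "__main__":
    # Hypothetical sample: one real separator, one body line.
    sample = [
        "From mylist-bounces@example.com Mon Dec  4 02:10:11 2006\n",
        "Subject: hello\n",
        "\n",
        "From what I can tell, this works.\n",
    ]
    for out in escape_stray_from_lines(sample, "mylist"):
        print(out, end="")
```

A check like this is only a heuristic: a body line that starts with "From "
and happens to mention the listname will still pass through untouched.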

So far (on message 30,000), it has only given me one "No subject" line in
the archive, but that's because somebody actually put a real
start-of-message line in his message because he was complaining about slow
message delivery and copied the whole message header.

Paul Tomblin <ptomblin at xcski.com> http://blog.xcski.com/
"Once you have flown, you will walk the earth with your eyes turned skyward,
for there you have been, there you long to return." -- Leonardo da Vinci.
