Streamlining email archives?

Dinu Gherman gherman at
Wed Jun 4 10:02:00 CEST 2003


Having some trouble with my emailer, which chokes on single
mailboxes containing 20,000 messages and more, I'm looking for a
more deterministic method of downloading parts of the stored
archives for lists such as this one. Then I'd be able to
periodically download only the most recent two months or so into
my emailer. I know there's also Netnews, but that's not an
option in this case.
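
To make the "recent two months only" idea concrete, here is a
minimal sketch, assuming the archive has already been downloaded
as a plain mbox file. It uses Python's standard mailbox module;
the helper names (message_date, copy_recent) and the 60-day
default are made up for illustration and are not part of Mailman
or of any existing tool.

```python
import mailbox
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def message_date(msg):
    """Return the Date header as an aware UTC datetime, or None if unusable."""
    raw = msg.get("Date")
    if raw is None:
        return None
    try:
        parsed = parsedate_to_datetime(str(raw))
    except (TypeError, ValueError):
        # Malformed Date headers are common in old archives; skip them.
        return None
    if parsed.tzinfo is None:
        # Assumption: treat dates without a usable zone as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def copy_recent(src_path, dst_path, days=60, now=None):
    """Copy messages newer than `days` days from one mbox to another.

    Hypothetical helper: returns the number of messages copied.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    kept = 0
    try:
        for msg in src:
            when = message_date(msg)
            if when is not None and when >= cutoff:
                dst.add(msg)
                kept += 1
        dst.flush()
    finally:
        src.close()
        dst.close()
    return kept
```

Messages without a parseable Date header are dropped here; a
more careful version might fall back to the mbox "From " line.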

Looking at the archives stored e.g. on this page I find that
they could lend themselves better to automated processing:

Specifically, I wonder why the full archive is downloadable
only as a non-compressed 576 MB chunk. And why do the other
non-compressed files have a .txt extension, while the huge
576 MB one has a .mbox extension? Also, it isn't obvious to me
which rule decides when a compressed archive is stored there
and when the uncompressed raw mailbox is. I assumed mailboxes
for the last 12 months or so are uncompressed, but this is not
exactly true. And it might be different for other lists...
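
Until there is a clear rule, a script can sidestep the question
by not trusting the file extension at all. The following is only
a sketch, assuming the compressed files are gzip (recognisable by
their two leading magic bytes); open_archive is a hypothetical
helper name, not something the archive software provides.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def open_archive(path):
    """Open a downloaded archive for binary reading, compressed or not.

    Hypothetical helper: sniffs the content instead of relying on
    a .txt, .gz or .mbox extension.
    """
    with open(path, "rb") as probe:
        head = probe.read(2)
    if head == GZIP_MAGIC:
        return gzip.open(path, "rb")
    return open(path, "rb")
```

Either way the caller gets a file object yielding the raw
mailbox bytes, so the rest of the processing stays the same.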

And then there is the phenomenon of postings from the future,
which would make sense for some of the bots on the list, but
they mostly come from non-bots. Well, I agree, this is certainly
something completely different... but perhaps an interesting
issue for Mailman?
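
Spotting such postings is easy enough on the reader's side. A
minimal sketch, assuming the Date header is the only clue and
allowing a day of slack for time zones and skewed clocks; the
function name is_from_future and the 24-hour tolerance are my
own choices, not anything Mailman does.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def is_from_future(msg, now=None, slack=timedelta(hours=24)):
    """Return True if the message's Date header lies beyond now + slack."""
    now = now or datetime.now(timezone.utc)
    raw = msg.get("Date")
    if raw is None:
        return False
    try:
        sent = parsedate_to_datetime(str(raw))
    except (TypeError, ValueError):
        # An unparseable date cannot be judged; do not flag it.
        return False
    if sent.tzinfo is None:
        # Assumption: treat dates without a usable zone as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    return sent > now + slack


# Usage with a made-up message and a fixed reference time.
sample = message_from_string("Date: Fri, 01 Jan 2038 00:00:00 +0000\n\nhi\n")
reference = datetime(2003, 6, 4, tzinfo=timezone.utc)
flagged = is_from_future(sample, now=reference)
```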

Hence, is there some common policy behind storing these
archives, and is it specified somewhere? If this is not the case,
wouldn't it be useful to have one, if only because then things
would be more predictable and people could further automate
their processes?



Dinu C. Gherman
"Consistency is the last refuge of the unimaginative." (Oscar Wilde)

More information about the Python-list mailing list