
- Mark Sapiro <mark@msapiro.net> [121017 16:57]:
If the list's archive is public and you are not a subscriber, your script is probably fine (I didn't look in detail). But if you are willing to subscribe first, you can get the list's entire cumulative mbox archive, whether the archives are private or public, with something like
wget 'http://www.example.com/mailman/private/LIST.mbox/LIST.mbox?username=U&password=P'
where LIST is the list name, U is a list member's address, and P is that member's list password. This has the advantage of getting all of each message's headers as processed by Mailman, except those added by SMTPDirect.py (Sender: and Errors-To:), not just the few that appear in the periodic .txt or .txt.gz files.
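A member address or password containing characters such as "+" or "&" would need escaping before it goes into that query string. A minimal Python sketch of building the URL, assuming the URL layout shown in the wget example above; the function name archive_url, the sample address, and the sample password are made up for illustration:

```python
from urllib.parse import quote, urlencode


def archive_url(base, list_name, user, password):
    """Build the private-archive mbox URL for one list.

    The path layout is copied from the wget example; nothing here
    contacts the server.
    """
    name = quote(list_name, safe="")
    # urlencode percent-escapes characters such as "+", "@" and "&".
    query = urlencode({"username": user, "password": password})
    return f"{base}/mailman/private/{name}.mbox/{name}.mbox?{query}"


# Hypothetical credentials, for illustration only.
url = archive_url("http://www.example.com", "LIST",
                  "user+lists@example.com", "p&ss")
print(url)
```

The resulting string can be handed to wget in place of the hand-written URL, or to urllib.request.urlretrieve(url, "LIST.mbox") to stay in Python.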
Thanks, this is great for catching up on subscribed-to lists. I just used this to download the entire history of mailman-users into one 247MB mbox file. The only post-processing required was removing the first line of the file, which was blank.
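For anyone repeating this, the blank-line cleanup can be scripted so that a file of this size is never loaded into memory. A rough sketch in Python; the function name and the file names in the usage note are placeholders, and it assumes the only problem is one or more blank lines ahead of the first "From " separator:

```python
import shutil


def strip_leading_blank_lines(src_path, dst_path):
    """Copy src to dst, dropping blank lines before the first content line."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            if line.strip():
                # First non-blank line: keep it, then stop scanning.
                dst.write(line)
                break
        # Stream the remainder of the file unchanged.
        shutil.copyfileobj(src, dst)
```

Usage would be something like strip_leading_blank_lines("LIST.mbox.raw", "LIST.mbox"), after which the result should open in any mbox-aware mail client.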
Question -- does that comprehensive mbox file exist on the server somewhere (i.e., is it a stored file rather than one generated per request)? I'm wondering if it'd be possible to set up rsync to do incremental updates and mirror backups of an archive to other locations. I'm guessing rsync's delta-transfer algorithm would use roughly the same amount of bandwidth as SMTP, though it would rewrite the entire mbox file at the destination with each sync.
I was also thinking this could be used to fill gaps in list traffic (when away from the net for extended periods and the inbox exceeds its allowed number of messages, when the mail server goes down for some reason, etc.), offering a way to sync up without re-downloading a potentially huge file. In that case, though, a scheme for limiting the download to a certain date range might make more sense, similar to how Gmane allows setting a range of message numbers in a download URL [1]. Is there such functionality in Mailman?
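In the meantime, the gap can at least be filled on the client side by filtering an already-downloaded mbox by date. This is only a local workaround sketch in Python, not a Mailman feature, and it does nothing to reduce the download size; the function name is made up, and treating a Date: header without a timezone as UTC is an assumption:

```python
import mailbox
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def messages_in_range(mbox_path, start, end):
    """Yield messages whose Date: header falls in [start, end).

    start and end must be timezone-aware datetimes.
    """
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            raw = msg["Date"]
            if raw is None:
                continue  # no Date: header, skip
            try:
                when = parsedate_to_datetime(str(raw))
            except (TypeError, ValueError):
                continue  # unparseable Date: header, skip
            if when.tzinfo is None:
                # Assumption: treat timezone-less dates as UTC.
                when = when.replace(tzinfo=timezone.utc)
            if start <= when < end:
                yield msg
    finally:
        box.close()
```

For example, passing datetime(2012, 10, 1, tzinfo=timezone.utc) and datetime(2012, 11, 1, tzinfo=timezone.utc) would yield only the October 2012 messages, which could then be added to a second mailbox.mbox with its add() method.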
[1] http://gmane.org/export.php
Regards,
John
-- John Magolske http://B79.net/contact