Deleting old msgs?
I've seen a post describing one method for removing old msgs: copy the mbox to another location, open it with a mail client, review and delete msgs, then replace the mbox and run a process.
I'm very very new to mailman, so forgive me.
Is there any easier way? I just took over our mailman server and we have several years worth of messages in 190+ mboxs totaling approx 130 gig and a few 100K msgs.
Thanks, Frank
A few hopefully simple questions for upgrade to a new server. 2.1.5 --> 2.1.12
- Do I even have to move the Archives folder contents for a migration?
- Can't I just regenerate Archives if we need them?
- Since Archives are just a set of folders, can I just write a script to selectively copy newer items based on date?
Thanks!
Frank Bell wrote:
> A few hopefully simple questions for upgrade to a new server. 2.1.5 --> 2.1.12
> - Do I even have to move the Archives folder contents for a migration?
See the answer to your second question, below.
> - Can't I just regenerate Archives if we need them?
This is the recommended thing. Just copy the archives/private/LISTNAME.mbox/LISTNAME.mbox files and run 'bin/arch --wipe' to regenerate the pipermail archive.
> - Since Archives are just a set of folders, can I just write a script to selectively copy newer items based on date?
It's tricky, because that won't remove the old entries from archives/private/LISTNAME/index.html. Even if you manually edit this file, the information is also in the archives/private/LISTNAME/pipermail.pck file, and the removed entries will return.
It's much better to remove the unwanted messages from the archives/private/LISTNAME.mbox/LISTNAME.mbox file and run 'bin/arch --wipe' to regenerate the pipermail archive.
See the FAQ at http://wiki.list.org/x/2YA9 for links to a pruning script.
See the first two paragraphs of the FAQ at http://wiki.list.org/x/2oA9 for info on moving lists.
--
Mark Sapiro mark@msapiro.net
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
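For readers who want to see the idea of "remove the unwanted messages from the .mbox file" in concrete terms, here is a minimal sketch in Python 3 using the standard mailbox module. It is not the pruning script linked from the FAQ; the function names (message_date, prune_mbox) and the choice to keep messages whose Date header is missing or unparseable are assumptions made for illustration. It writes the kept messages to a new file rather than modifying the original, and it appends if the destination already exists.

```python
import mailbox
from datetime import timezone
from email.utils import parsedate_to_datetime


def message_date(msg):
    """Return the message's Date header as an aware datetime, or None."""
    raw = msg.get("Date")
    if not raw:
        return None
    try:
        parsed = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        # Treat a date without a usable zone as UTC (an assumption).
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def prune_mbox(src_path, dst_path, cutoff):
    """Copy messages dated on or after cutoff from src_path to dst_path.

    Messages with a missing or unparseable Date header are kept.
    Returns a (kept, dropped) tuple of message counts.
    """
    kept = dropped = 0
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    try:
        for msg in src:
            when = message_date(msg)
            if when is not None and when < cutoff:
                dropped += 1
            else:
                dst.add(msg)
                kept += 1
        dst.flush()
    finally:
        dst.close()
        src.close()
    return kept, dropped
```

After checking the result (and with a backup of the original in place), the pruned file would replace archives/private/LISTNAME.mbox/LISTNAME.mbox and the pipermail archive would be regenerated with 'bin/arch --wipe' as described above.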
Frank Bell wrote:
> Is there any easier way? I just took over our mailman server and we have several years worth of messages in 190+ mboxs totaling approx 130 gig and a few 100K msgs.
You inspired me. I've created a script for pruning archives. See "NOTE ON PRUNING OLD MESSAGES:" in the FAQ at http://wiki.list.org/x/2YA9 for links.
Since this is a brand new process, I suggest you make backup copies of the LISTNAME.mbox files before starting. The script has a --backup option, but I would make separate backups to be sure until you've run the script successfully.
--
Mark Sapiro mark@msapiro.net
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
Mark Sapiro writes:
> Frank Bell wrote:
> > Is there any easier way? I just took over our mailman server and we have several years worth of messages in 190+ mboxs totaling approx 130 gig and a few 100K msgs.
> You inspired me. I've created a script for pruning archives. See "NOTE ON PRUNING OLD MESSAGES:" in the FAQ at http://wiki.list.org/x/2YA9 for links.
> Since this is a brand new process, I suggest you make backup copies of the LISTNAME.mbox files before starting. The script has a --backup option, but I would make separate backups to be sure until you've run the script successfully.
:-) Timing is everything ... just finished integrating mbox-purge.pl (http://www.argon.org/~roderick/mbox-purge.html) with a withlist callable module to do the same thing.
Having fewer layers would be welcome though, so thanks for this script, Mark.
One idea/request: Would you be willing to add the logic to write the pruned message data to a supplied path+filename? That would let the script dump the pruned data where it could be retained or aged via another scheme.
Our site (and maybe this is more common) periodically prunes the archived .mbox messages when they are a year old, rebuilding the pipermail hierarchy, but keeps a compressed copy of the pruned data for another year (or longer), in case it is needed. The compressed pruned .mbox text is considerably smaller (like 1/20th or better on average) when compared to the uncompressed .mbox plus the associated pipermail HTML hierarchy -- so keeping a copy is a relatively trivial insurance policy or "nice to have" for our lists.
We've found that pruning is essential once archives become multi-gigabyte, not for the .mbox archives themselves, but due to the pipermail HTML files that result (particularly for archives with many small messages). We've seen 3 GB .mbox archives with approaching 1 million files in the pipermail hierarchy. Traversing that many files or rebuilding them, particularly for hundreds of such lists, is non-trivial even on modern hardware and file systems.
Thanks in advance for considering adding a way to save the pruned data.
Richard
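To put numbers on the "approaching 1 million files" observation for a given list, a short walk over the pipermail directory is enough. The following is a rough Python 3 sketch; the function name tree_stats is hypothetical, and the path passed to it would be whatever your installation uses for archives/private/LISTNAME.

```python
import os


def tree_stats(root):
    """Return (file_count, total_bytes) for non-directory entries under root."""
    count = total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            count += 1
    return count, total
```

Running this before and after a prune-and-rebuild cycle gives a simple measure of how much the HTML hierarchy shrank.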
Richard Haas wrote:
> Having fewer layers would be welcome though, so thanks for this script, Mark.
You're welcome.
> One idea/request: Would you be willing to add the logic to write the pruned message data to a supplied path+filename? That would let the script dump the pruned data where it could be retained or aged via another scheme.
I have added a -p/--preserve option to collect the pruned messages in archives/private/LISTNAME.mbox/LISTNAME.mbox.pruned, appending to that file if it exists. You could then compress that and save it anywhere you want. If you feel it is important to provide the path rather than using the fixed path above, I can do that, but since my script can process multiple lists in one invocation, it is more complicated. I.e., should all the pruned messages from multiple lists be saved in a single file, or should the list name be inserted into the path name somehow?
--
Mark Sapiro mark@msapiro.net
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
Mark Sapiro writes:
> I have added a -p/--preserve option to collect the pruned messages in archives/private/LISTNAME.mbox/LISTNAME.mbox.pruned, appending to that file if it exists. You could then compress that and save it anywhere you want. If you feel it is important to provide the path rather than using the fixed path above, I can do that, but since my script can process multiple lists in one invocation, it is more complicated. I.e., should all the pruned messages from multiple lists be saved in a single file, or should the list name be inserted into the path name somehow?
Nope, this will more than suffice. Keeping the script's operations in the Mailman hierarchy prevents dependencies on permissions, and follows the precedent of the other scripts. There's no point in getting your Python code involved in all the possible local variations; let the calling script (if there is one) handle that logic. Nice.
Thanks again. My mailman-owner non-denominational holiday gift receptacle runneth over. :-)
Richard
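As an illustration of the "calling script" side of this arrangement, the following Python 3 sketch compresses a LISTNAME.mbox.pruned file into a separate retention directory. It is only one possible local policy: the function name archive_pruned, the destination directory, and the date-stamped file name are assumptions, not part of the pruning script. Note that it overwrites an existing target with the same stamp and removes the source file once the compressed copy has been written.

```python
import gzip
import os
import shutil
from datetime import date


def archive_pruned(pruned_path, dest_dir, stamp=None):
    """Gzip pruned_path into dest_dir and return the new file's path.

    The source file is removed only after the compressed copy is written.
    """
    stamp = stamp or date.today().isoformat()
    os.makedirs(dest_dir, exist_ok=True)
    base = os.path.basename(pruned_path)
    target = os.path.join(dest_dir, "%s.%s.gz" % (base, stamp))
    with open(pruned_path, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(pruned_path)
    return target
```

Aging out the compressed copies after a year (or longer) could then be handled by whatever retention scheme the site already uses.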
Hi Mark,
> You inspired me. I've created a script for pruning archives. See "NOTE ON PRUNING OLD MESSAGES:" in the FAQ at http://wiki.list.org/x/2YA9 for links.
this script is very welcome, but I'm having trouble getting it to work:
/usr/lib/mailman/bin/prune_arch -v -l test -d 1700 -n
Traceback (most recent call last):
  File "/usr/lib/mailman/bin/prune_arch", line 190, in ?
    main()
  File "/usr/lib/mailman/bin/prune_arch", line 142, in main
    except (IOError, mailbox.NoSuchMailboxError), e:
AttributeError: 'module' object has no attribute 'NoSuchMailboxError'
The list exists. We're running Red Hat Enterprise Linux, which comes with Python 2.4 ... I've seen that newer versions have that attribute. Mailman 2.1.14 officially supports Python 2.4, so perhaps the script should reflect that?
Update: I noticed that there are so many dependencies on new features of the mailbox module that it doesn't seem feasible to use Python 2.4. Do you know what the minimum required Python version would be? I suggest you add that info to the script.
Cheers, Sebastian
.:.Sebastian Hagedorn - RZKR-R1 (Gebäude 52), Zimmer 18.:.
.:.Regionales Rechenzentrum (RRZK).:.
.:.Universität zu Köln / Cologne University - ✆ +49-221-478-5587.:.
On 12/8/2011 8:31 AM, Sebastian Hagedorn wrote:
> > You inspired me. I've created a script for pruning archives. See "NOTE ON PRUNING OLD MESSAGES:" in the FAQ at http://wiki.list.org/x/2YA9 for links.
> this script is very welcome, but I'm having trouble getting it to work:
> /usr/lib/mailman/bin/prune_arch -v -l test -d 1700 -n
> Traceback (most recent call last):
>   File "/usr/lib/mailman/bin/prune_arch", line 190, in ?
>     main()
>   File "/usr/lib/mailman/bin/prune_arch", line 142, in main
>     except (IOError, mailbox.NoSuchMailboxError), e:
> AttributeError: 'module' object has no attribute 'NoSuchMailboxError'
> The list exists. We're running Red Hat Enterprise Linux, which comes with Python 2.4 ... I've seen that newer versions have that attribute. Mailman 2.1.14 officially supports Python 2.4, so perhaps the script should reflect that?
I can look into revising the script to use the mailbox.UnixMailbox class instead, which will work with Python 2.4.
Note that I run a production CentOS 5 server on which I have installed Python 2.6.5 from source. The default Python on this server is still 2.4.3 as changing the default breaks yum and possibly other things, but I use python2.6 for things I install from source like Mailman, bzr and mod_wsgi.
> Update: I noticed that there are so many dependencies on new features of the mailbox module that it doesn't seem feasible to use Python 2.4. Do you know what the minimum required Python version would be? I suggest you add that info to the script.
It should work with Python 2.5. I will either fix the script to work with Python 2.4 or add a "requires at least Python 2.5" note to it.
--
Mark Sapiro mark@msapiro.net
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
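The AttributeError in the traceback above comes from naming mailbox.NoSuchMailboxError in an except clause on a Python whose mailbox module does not define it. One general way to guard against a missing module attribute is to look it up once with a fallback. The sketch below shows that technique only; it is written for Python 3, is not a port of the script to Python 2.4, and is not the approach described in the thread (which was to switch to mailbox.UnixMailbox). The helper name open_mbox is hypothetical.

```python
import mailbox

# Use mailbox.NoSuchMailboxError when the running Python provides it;
# otherwise fall back to IOError so the except clause still evaluates.
NoSuchMailboxError = getattr(mailbox, "NoSuchMailboxError", IOError)


def open_mbox(path):
    """Return an open mailbox.mbox for path, or None if it cannot be opened."""
    try:
        return mailbox.mbox(path, create=False)
    except (IOError, NoSuchMailboxError):
        return None
```

A script that genuinely needs newer library features can instead check sys.version_info at startup and exit with a clear "requires at least Python X.Y" message, which is the other option mentioned above.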
Mark Sapiro wrote:
> On 12/8/2011 8:31 AM, Sebastian Hagedorn wrote:
> > > You inspired me. I've created a script for pruning archives. See "NOTE ON PRUNING OLD MESSAGES:" in the FAQ at http://wiki.list.org/x/2YA9 for links.
> > this script is very welcome, but I'm having trouble getting it to work:
> > /usr/lib/mailman/bin/prune_arch -v -l test -d 1700 -n
> > Traceback (most recent call last):
> >   File "/usr/lib/mailman/bin/prune_arch", line 190, in ?
> >     main()
> >   File "/usr/lib/mailman/bin/prune_arch", line 142, in main
> >     except (IOError, mailbox.NoSuchMailboxError), e:
> > AttributeError: 'module' object has no attribute 'NoSuchMailboxError'
> > The list exists. We're running Red Hat Enterprise Linux, which comes with Python 2.4 ... I've seen that newer versions have that attribute. Mailman 2.1.14 officially supports Python 2.4, so perhaps the script should reflect that?
I have revised the script and it should now work with all Python versions that are acceptable for Mailman itself.
--
Mark Sapiro mark@msapiro.net
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
--On 8 December 2011 16:52:06 -0800 Mark Sapiro mark@msapiro.net wrote:
> > The list exists. We're running Red Hat Enterprise Linux, which comes with Python 2.4 ... I've seen that newer versions have that attribute. Mailman 2.1.14 officially supports Python 2.4, so perhaps the script should reflect that?
> I have revised the script and it should now work with all Python versions that are acceptable for Mailman itself.
Thanks a lot!
.:.Sebastian Hagedorn - RZKR-R1 (Gebäude 52), Zimmer 18.:.
.:.Regionales Rechenzentrum (RRZK).:.
.:.Universität zu Köln / Cologne University - ✆ +49-221-478-5587.:.
You did the work, so I'll just thank you very, very much!!
Have a merry Christmas or Hanukkah or .........
Thank you, Frank Bell Application Systems Admin Information Systems and Services Washburn University Topeka, KS 66621 785-670-2334
On 12/2/2011 5:25 PM, Mark Sapiro wrote:
> Frank Bell wrote:
> > Is there any easier way? I just took over our mailman server and we have several years worth of messages in 190+ mboxs totaling approx 130 gig and a few 100K msgs.
> You inspired me. I've created a script for pruning archives. See "NOTE ON PRUNING OLD MESSAGES:" in the FAQ at http://wiki.list.org/x/2YA9 for links.
> Since this is a brand new process, I suggest you make backup copies of the LISTNAME.mbox files before starting. The script has a --backup option, but I would make separate backups to be sure until you've run the script successfully.
participants (4)
- Frank Bell
- Mark Sapiro
- Richard Haas
- Sebastian Hagedorn