Deleting messages from archive....
Has anyone ever developed a point-n-click tool to delete a message from an archive, preferably by just eliminating the From/To/Subject and body, but leaving the links intact, with a placeholder message that says "message content removed", etc?
I found lots of FAQ's and posts on the 'manual' way to do it. Just surprised there were no references to a quick-n-easy tool....
It would be really nice if this could integrate with the mailman admin pages, so that list owners could manage their own archives.... Alternately, an e-mail control (which I could key to sender address) would suffice.
- Charles
Charles Gregory wrote:
Has anyone ever developed a point-n-click tool to delete a message from an archive, preferably by just eliminating the From/To/Subject and body, but leaving the links intact, with a placeholder message that says "message content removed", etc?
Not as far as I know.
I found lots of FAQ's and posts on the 'manual' way to do it. Just surprised there were no references to a quick-n-easy tool....
It would be really nice if this could integrate with the mailman admin pages, so that list owners could manage their own archives.... Alternately, an e-mail control (which I could key to sender address) would suffice.
One issue is the tool wouldn't be so simple. You can't just edit the HTML article because if that's all you do, the content will return if you ever rebuild the archive. The content is in 3 or 4 places: the LISTNAME.mbox/LISTNAME.mbox raw archive, the HTML article, the periodic .txt file and the periodic .txt.gz file, if any.
The cleanest way to do it is to edit the LISTNAME.mbox/LISTNAME.mbox and rebuild the pipermail archive, but that is an expensive process and can have unintended side effects.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
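For the raw-archive part of the manual procedure described above, a minimal sketch follows. It uses Python's standard `mailbox` module to blank one message in the mbox while leaving it in its slot, so a later rebuild would still number the articles the same way. The function name `scrub_message`, the placeholder text and the `removed@example.invalid` address are made up for illustration; this is not an existing Mailman tool, and it touches only the mbox, not the HTML or .txt copies.

```python
import mailbox

PLACEHOLDER = "[message content removed]\n"  # hypothetical wording


def scrub_message(mbox_path, message_id):
    """Blank one message in an mbox in place, keeping its position.

    Returns True if a message with the given Message-ID was found.
    """
    box = mailbox.mbox(mbox_path)
    box.lock()
    try:
        for key in box.keys():
            msg = box[key]
            if (msg.get("Message-ID") or "").strip() != message_id:
                continue
            # Replace identifying headers; Message-ID and Date are kept
            # so threading and ordering are not disturbed.
            for header in ("From", "To", "Cc", "Subject"):
                if header in msg:
                    del msg[header]
                    msg[header] = "removed"
            # The "From " envelope line also carries the sender address;
            # keep its date part and swap the address.
            parts = msg.get_from().split(" ", 1)
            date_part = parts[1] if len(parts) > 1 else ""
            msg.set_from(("removed@example.invalid " + date_part).strip())
            # Drop MIME structure so a multipart message becomes plain text.
            del msg["Content-Type"]
            del msg["Content-Transfer-Encoding"]
            msg.set_payload(PLACEHOLDER, charset="us-ascii")
            box[key] = msg
            box.flush()
            return True
        return False
    finally:
        box.unlock()
        box.close()
```

Work on a copy of the mbox first, for example `scrub_message("LISTNAME.mbox.copy", "<some-id@example.com>")`, and only swap it in once the result looks right.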
On Fri, 15 May 2009, Mark Sapiro wrote:
One issue is the tool wouldn't be so simple. You can't just edit the HTML article because if that's all you do, the content will return if you ever rebuild the archive. The content is in 3 or 4 places: the LISTNAME.mbox/LISTNAME.mbox raw archive, the HTML article, the periodic .txt file and the periodic .txt.gz file, if any.
(nod) And for what I want, it would have to edit each of those spots, to avoid any link-breaking due to numbering issues....
The cleanest way to do it is to edit the LISTNAME.mbox/LISTNAME.mbox and rebuild the pipermail archive, but that is an expensive process and can have unintended side effects.
As noted in the (excellent) FAQ's, this disturbs external links. So I'm going for the nitty-gritty version....
- Charles
- Mark Sapiro <mark@msapiro.net>:
Charles Gregory wrote:
Has anyone ever developed a point-n-click tool to delete a message from an archive, preferably by just eliminating the From/To/Subject and body, but leaving the links intact, with a placeholder message that says "message content removed", etc?
Not as far as I know.
It would totally rule. Especially on python.org, where people regret posting their full name because "employers tend to google them", and then come running to us, asking "Please remove my name".
Fucktards.
But I like the idea of such a function.
--
Ralf Hildebrandt
Geschäftsbereich IT | Abteilung Netzwerk
Charité - Universitätsmedizin Berlin
Campus Benjamin Franklin
Hindenburgdamm 30 | D-12200 Berlin
Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962
Ralf.Hildebrandt@charite.de | http://www.charite.de
On May 15, 2009, at 5:04 PM, Ralf Hildebrandt wrote:
- Mark Sapiro <mark@msapiro.net>:
Charles Gregory wrote:
Has anyone ever developed a point-n-click tool to delete a message from an archive, preferably by just eliminating the From/To/Subject and body, but leaving the links intact, with a placeholder message that says "message content removed", etc?
Not as far as I know.
It would totally rule. Especially on python.org, where people regret posting their full name because "employers tend to google them", and then come running to us, asking "Please remove my name".
Fucktards.
But I like the idea of such a function.
I've long thought that all archives should be vended dynamically
rather than statically, of course with a cache to improve performance
as necessary. This would allow you to do lots of interesting things,
such as add links dynamically (e.g. "bug 12345" pointing to your bug
tracker), or on-the-fly modification of archive style or anti-spam
obfuscation, with the proper cache invalidation.
This would also allow you to "delete" a message from the archive, by
laying down a marker that causes the program to instead return a "not
available" or cleansed message without affecting the underlying data.
-Barry
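The dynamic-vending idea above can be sketched in a few lines. This is only an illustration of the concept, not Mailman or Pipermail code: the class `DynamicArchive`, the placeholder text and the bug-tracker URL `https://bugs.example.org/` are all assumptions. Articles are rendered on request, results are cached, and "deleting" a message just records a marker and invalidates the cache entry, leaving the stored text untouched.

```python
import html
import re

NOT_AVAILABLE = "[message not available]"   # hypothetical placeholder
BUG_RE = re.compile(r"\bbug (\d+)\b", re.IGNORECASE)


class DynamicArchive:
    """Render articles on demand; a marker hides one without deleting it."""

    def __init__(self, articles, bug_url="https://bugs.example.org/"):
        self._articles = dict(articles)  # article id -> stored text
        self._bug_url = bug_url          # assumed tracker URL prefix
        self._removed = set()            # markers for "deleted" articles
        self._cache = {}                 # article id -> rendered HTML

    def _render(self, article_id):
        if article_id in self._removed:
            return NOT_AVAILABLE
        text = html.escape(self._articles[article_id], quote=False)
        # Add links dynamically, e.g. "bug 12345" -> tracker link.
        return BUG_RE.sub(
            lambda m: '<a href="%s%s">%s</a>'
            % (self._bug_url, m.group(1), m.group(0)),
            text,
        )

    def get(self, article_id):
        if article_id not in self._cache:
            self._cache[article_id] = self._render(article_id)
        return self._cache[article_id]

    def remove(self, article_id):
        """Lay down a marker and invalidate the cached rendering."""
        self._removed.add(article_id)
        self._cache.pop(article_id, None)

    def restore(self, article_id):
        """Lift the marker; the underlying data was never changed."""
        self._removed.discard(article_id)
        self._cache.pop(article_id, None)
```

Because the marker lives beside the data rather than in it, article numbering and external links stay valid, and a removal can be undone.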
Removing mailman-developers because I'm not talking about implementation.
Barry Warsaw writes:
I've long thought that all archives should be vended dynamically
rather than statically, of course with a cache to improve performance
as necessary.
I think it's a great option. But first Mailman would have to support archives, which it doesn't, really. pipermail is fine for small projects and some larger projects that are willing to put up with its limitations, but it's pretty creaky. Anything else (eg MHonArc) is "you do it yourself, see the tracker".
IMO, trying to support archiving is mission creep the project should avoid, except for providing hooks to make it easier to use 3rd party archivers.
N.B. That doesn't mean Barry and Mark shouldn't contribute to archivers, if they want to.
On May 16, 2009, at 3:42 AM, Stephen J. Turnbull wrote:
Removing mailman-developers because I'm not talking about implementation.
Barry Warsaw writes:
I've long thought that all archives should be vended dynamically rather than statically, of course with a cache to improve performance as necessary.
I think it's a great option. But first Mailman would have to support archives, which it doesn't, really. pipermail is fine for small projects and some larger projects that are willing to put up with its limitations, but it's pretty creaky. Anything else (eg MHonArc) is "you do it yourself, see the tracker".
IMO, trying to support archiving is mission creep the project should avoid, except for providing hooks to make it easier to use 3rd party archivers.
Mailman 3 makes it very easy to add archivers. I already have
implementations of hooks for Pipermail, MHonArc, and
mail-archive.com. They're not mutually exclusive btw.
N.B. That doesn't mean Barry and Mark shouldn't contribute to archivers, if they want to.
I've long thought that Pipermail should be split off from Mailman as a
project, perhaps still bundled in whatever sumo distribution we
provide. It would be very cool if a group of people worked together
to make Pipermail not suck.
-Barry
On Sat, 16 May 2009, Barry Warsaw wrote:
I've long thought that Pipermail should be split off from Mailman as a project, perhaps still bundled in whatever sumo distribution we provide. It would be very cool if a group of people worked together to make Pipermail not suck.
As a side note, is there a limit to how large the archive mbox can grow? I've got several years in an archive now, and it's pushing past 64MB... Am I going to hit a limit soon? (CentOS 4, if that matters)
- Charles
Charles Gregory wrote:
As a side note, is there a limit to how large the archive mbox can grow? I've got several years in an archive now, and it's pushing past 64MB... Am I going to hit a limit soon? (CentOS 4, if that matters)
Many older *nix OSs limit files to 2GB (32 bit integer). Newer OS versions don't have this limit. Mailman imposes no limit beyond those of your OS and available storage.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
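To put the 2GB figure in perspective: a signed 32-bit file offset tops out at 2**31 - 1 bytes, so a 64MB mbox is roughly 3% of the way there. A small sketch for checking headroom follows; the function name `mbox_headroom` is made up, and whether the limit applies at all depends on your OS and filesystem.

```python
import os

# Largest size addressable with a signed 32-bit offset: 2 GiB minus 1 byte.
LIMIT_32BIT = 2**31 - 1


def mbox_headroom(path, limit=LIMIT_32BIT):
    """Return (size, bytes_remaining, fraction_used) for a file."""
    size = os.path.getsize(path)
    return size, limit - size, size / limit
```

For example, `mbox_headroom("LISTNAME.mbox")` reports how many bytes remain before the file would reach the 32-bit boundary.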
On Fri, 15 May 2009, Ralf Hildebrandt wrote:
Has anyone ever developed a point-n-click tool to delete a message from an archive, preferably by just eliminating the From/To/Subject and body, but leaving the links intact, with a placeholder message that says "message content removed", etc?
It would totally rule. Especially on python.org, where people regret posting their full name because "employers tend to google them", and then come running to us, asking "Please remove my name".
Notwithstanding your opinion of people who do this (which I cannot entirely disagree with - grin) you have brought to mind a number of scenarios where it might be beneficial to be able to quickly edit a message, rather than delete it completely. For instance, someone posts a message with someone else's (unapproved) web link. The link needs to be removed, but the rest of the posting can stay as it is....
As I think about this more deeply, I realize there are serious programming considerations. For example, the message subject appears in the pages that 'index' the archive, as well as in the 'previous' and 'next' links of neighboring messages, sometimes not even in the same week/directory.... I think if we permit editing of the subject it will have to, by default, change ALL references to that subject in all messages, just to keep things simple.... Presumably the only time a subject would be changed would be if it is offensive/abusive, so you would *want* to change all of them. If some moron posts messages under the wrong subject, that is life on the list. We're not fixing that..... LOL
Of course, there are some ethical issues to visit. Any editor should be certain to insert a disclaimer in the message body which says "THIS MESSAGE HAS BEEN EDITED BY LIST ADMINS SUBSEQUENT TO POSTING", just so that original authors do not feel they've been misrepresented, etc, etc.
I can presume that if someone is comfortable with regenerating their archive from the mbox at ANY time, they wouldn't mind editing it for occasions like these. So I'm going to presume that this utility that is forming in my mind is for people who have already decided that they will *never* regenerate their archives from mboxes. Numbering and links will always be preserved.... So we need only be concerned with a utility that will update the html and txt archives...
- Charles
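A rough sketch of the HTML side of such a utility is below. It rests on assumptions that should be checked against a real archive: that each article page wraps its body in `<!--beginarticle-->` and `<!--endarticle-->` comments, that the pages are UTF-8, and that the subject appears HTML-escaped in index pages and in neighbors' previous/next links. The function names are invented, and the periodic .txt and .txt.gz files are not handled here.

```python
import html
import re
from pathlib import Path

NOTICE = "THIS MESSAGE HAS BEEN EDITED BY LIST ADMINS SUBSEQUENT TO POSTING"
# Assumed markers around the article body in each HTML page.
BODY_RE = re.compile(r"(<!--beginarticle-->).*?(<!--endarticle-->)", re.DOTALL)


def scrub_article(path, placeholder="message content removed"):
    """Replace one article's body with a notice; return True if changed."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    body = "\n<PRE>%s\n\n%s\n</PRE>\n" % (
        NOTICE, html.escape(placeholder, quote=False))
    new_text, count = BODY_RE.subn(
        lambda m: m.group(1) + body + m.group(2), text, count=1)
    if count:
        path.write_text(new_text, encoding="utf-8")
    return bool(count)


def replace_subject(archive_dir, old_subject, new_subject):
    """Change ALL references to a subject in every HTML page under
    archive_dir (articles, indexes, previous/next links).

    Returns the list of pages that were rewritten.
    """
    old = html.escape(old_subject, quote=False)
    new = html.escape(new_subject, quote=False)
    changed = []
    for page in sorted(Path(archive_dir).rglob("*.html")):
        text = page.read_text(encoding="utf-8")
        if old in text:
            page.write_text(text.replace(old, new), encoding="utf-8")
            changed.append(page)
    return changed
```

File names and article numbers are never touched, so existing links keep working. Note that a plain string replace will also hit any other message that legitimately uses the same subject text, which matches the "change ALL references" default discussed above.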
participants (5)
- Barry Warsaw
- Charles Gregory
- Mark Sapiro
- Ralf Hildebrandt
- Stephen J. Turnbull