Re: [Mailman-Developers] Re: 2006 archives already online!
Marc MERLIN <marc_news@valinux.com> writes:
On Mon, Apr 30, 2001 at 09:28:51PM +0200, Fil wrote:
@ Darrell Fuhriman (darrell@grumblesmurf.net) :
That's why under "archival options" you'll find:
"Set date in archive to when the mail is claimed to have been sent, or to the time we resend it?"
If you leave it on 'when sent', you deserve the mess you'll get in your archives.
"When sent" is the default. How do you change the default to "When Resent"? Shouldn't it be changed in the mailman's Defaults.py ?
I very firmly believe this, and so do all the people who have archives showing messages with dates of 2004 or 1990. http://lists.svlug.org/pipermail/svlug/
I've asked the same thing in the past, but it didn't go through.
What I did for the gnome.org archives (using mhonarc plus custom perl) is to use the Received: header for the date.
Which is, almost always, quite close to the time the person actually sent it, and assuming that your local server's time isn't screwed up (which is a much bigger problem...) does not have the 2004 problem.
And it has the advantage over clobber_date of:
- Not munging the mail
- Not being skewed by moderation delays
- Being independent of the archiving process, so if you import a bunch of old mail with incorrect Date: lines into the archiving process you still get the 2004 protection.
I haven't looked at the clobber_date implementation recently, so I don't know if these problems have been otherwise addressed.
This approach has, anyways, worked well for the 140k messages on mail.gnome.org. I wouldn't be surprised if it has defects though.
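For readers who want to see the idea concretely, here is a minimal Python sketch of taking the archive date from a Received: header instead of Date:. It is not the perl used on gnome.org; the function name `received_date` is hypothetical, and it assumes the timestamp is whatever follows the last ';' in the topmost (most recently added) Received: header.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def received_date(raw_message):
    """Return the timestamp of the topmost Received: header, or None.

    Hypothetical helper: assumes the topmost header was added by the
    local server, so its clock is the one being trusted.
    """
    msg = message_from_string(raw_message)
    received = msg.get_all("Received", [])
    if not received:
        return None
    # The date-time is the part after the last ';' in the header value.
    _, sep, stamp = received[0].rpartition(";")
    if not sep:
        return None
    try:
        return parsedate_to_datetime(stamp.strip())
    except (TypeError, ValueError):
        # Unparseable timestamp: let the caller fall back to something else.
        return None
```

The message itself is left untouched; only the archiver's notion of the date changes.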
Regards, Owen
"OT" == Owen Taylor <otaylor@redhat.com> writes:
OT> What I did for the gnome.org archives (using mhonarc plus
OT> custom perl) is to use the Received: header for the date.
Ah, but which one? :) There's going to be a Received: header for each hop that message takes. By the time your message got to me, it had 7 Received: headers, and 3 (I think) by the time it reached Mailman.
OT> Which is, almost always, quite close to the time the person
OT> actually sent it, and assuming that your local server's time
OT> isn't screwed up (which is a much bigger problem...) does
OT> not have the 2004 problem.
OT> And it has the advantage over clobber_date of:
| - Not munging the mail
True, with the disadvantage that if you use an external archiver, it'll have to handle checking for outrageous dates. clobber_date munges the message before it hits either archiver (Pipermail or external). If I was smart, I'd also count as a major disadvantage the fact that I'll have to track down all the places where the Date: header is used in Pipermail, and I /hate/ diving into that code. ;(
| - Not being skewed by moderation delays
Dang, yep, but fixable.
| - Being independent of the archiving process, so if you
| import a bunch of old mail with incorrect Date: lines
| into the archiving process you still get the 2004
| protection.
True, with the caveat above.
This would be a reasonable option; however, if you use the most recent Received: header, won't you still be subject to local server clock skew? And if you use the earliest Received: you'll be subject to the same bogosity as in the Date: header. Or do you just start parsing the Received:'s back from the most recent and take the first sane one you find?
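To make that last alternative concrete, here is a rough Python sketch of walking the Received: headers from the most recent hop backwards and taking the first one with a sane date. Everything named here is an assumption for illustration: `first_sane_received`, the `earliest` floor, and the `max_future` slack are hypothetical, and "sane" is taken to mean "not before the floor and not later than the reference time plus the slack". It does nothing about skew on the local server's own clock.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def first_sane_received(raw_message, now=None,
                        earliest=datetime(1995, 1, 1, tzinfo=timezone.utc),
                        max_future=timedelta(hours=1)):
    """Return the first plausible Received: timestamp, newest hop first.

    Hypothetical helper; 'earliest' and 'max_future' are arbitrary
    illustrative bounds, not values taken from Mailman.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    msg = message_from_string(raw_message)
    # get_all() preserves header order, and each hop prepends its header,
    # so this iterates from the most recent hop to the earliest.
    for header in msg.get_all("Received", []):
        _, sep, stamp = header.rpartition(";")
        if not sep:
            continue
        try:
            when = parsedate_to_datetime(stamp.strip())
        except (TypeError, ValueError):
            continue
        if when.tzinfo is None:
            # No usable zone in the header: assume UTC for the comparison.
            when = when.replace(tzinfo=timezone.utc)
        if earliest <= when <= now + max_future:
            return when
    return None
```

If nothing passes the check, the function returns None and the caller has to fall back to something else, such as the time the archiver saw the message.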
-Barry