[Mailman-Users] attachements question
Stephen J. Turnbull
stephen at xemacs.org
Thu Sep 29 07:29:58 CEST 2005
>>>>> "Brad" == Brad Knowles <brad at stop.mail-abuse.org> writes:
Brad> At 2:03 PM +0900 2005-09-28, Stephen J. Turnbull wrote:
>> Why archivers don't use Message-Id for the URL, I don't know.
Brad> Because some MUAs generate message-ids that are likely
Brad> to collide.
Can we stop pandering to the broken mailers, please? Are we not
hackers? We know how to handle collisions. Here's the algorithm:
1) Look for a unique ID in the (X-)List-Archive-Message-ID field. If
not found:
a) Generate a unique ID according to the usual algorithm as if
the post were about to be sent from the archive host.
b) Add it to the header in the (X-)List-Archive-Message-ID field.
2) Extract the message ID from the message. If none, set the program
variable equal to the ID generated in 1, and (optionally) add it as
Message-ID to the message's header.
4) Generate the URL for the archived message based on *Message-ID*.
5) Check for collision.
6) If there is a collision, make a directory (could be a file-system
directory, could be just an HTML file, could be a digest message)
with the URL generated in 4. Generate URLs for the colliding
messages based on Message-ID plus List-Archive-Message-ID, and
include them in the directory. Conforming implementations MAY also
extract MUA information and make nasty comments about the broken
MUAs, their implementers, and their users to go with the directory.
If Message-ID == List-Archive-Message-ID, go to 1a. At this point
a conforming implementation MAY mail /vmunix to its implementer,
who obviously snafu'ed.
7) PUT the colliding messages at those URLs.
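For concreteness, here is a rough Python sketch of the steps above. It is only an illustration under assumptions, not a reference implementation: the in-memory `store` dict, the `http://` URL layout, the helper names, and the use of the unprefixed-or-not field name X-List-Archive-Message-ID are all made up for the example, and the "nasty comments" of step 6 are left out.

```python
import email.utils
from email.message import Message
from urllib.parse import quote

ARCHIVE_HOST = "list-archive.your.org"        # host taken from the URN example
ARCHIVE_FIELD = "X-List-Archive-Message-ID"   # assumed spelling of the field


def _token(msgid):
    """Turn '<id@host>' into a URL-safe path component."""
    return quote(msgid.strip().strip("<>"), safe="")


def archive(store, msg):
    """Archive msg and return its URL.

    store is a hypothetical stand-in for the archive: it maps the
    Message-ID based URL to a dict {archive_id: message}.
    """
    # Step 1: make sure the message carries a unique archive ID.
    if msg[ARCHIVE_FIELD] is None:
        msg[ARCHIVE_FIELD] = email.utils.make_msgid(domain=ARCHIVE_HOST)
    archive_id = msg[ARCHIVE_FIELD]

    # Step 2: fall back to the archive ID if there is no Message-ID.
    if msg["Message-ID"] is None:
        msg["Message-ID"] = archive_id
    message_id = msg["Message-ID"]

    # Step 4: the URL is derived from Message-ID alone.
    url = "http://%s/%s" % (ARCHIVE_HOST, _token(message_id))

    # Step 5: a collision is a *different* message already at that URL.
    bucket = store.setdefault(url, {})
    collides = any(key != archive_id for key in bucket)

    if collides and message_id == archive_id:
        # End of step 6: our own generated ID collided, so go back to 1a.
        del msg[ARCHIVE_FIELD]
        del msg["Message-ID"]
        return archive(store, msg)

    # Steps 6 and 7: on collision the base URL becomes a directory and each
    # colliding message lives at Message-ID plus List-Archive-Message-ID.
    bucket[archive_id] = msg
    if len(bucket) == 1:
        return url
    return url + "/" + _token(archive_id)
```

Feeding it two posts with the same Message-ID gives the first one the plain URL and puts the second one underneath it; archiving the same post again returns the same URL, which is the persistence property claimed in the rationale below.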
Rationale:
1. You could actually derive a URN from this:
archived-message://list-archive.your.org/MESSAGE-ID.
2. The URL is unique and will persist across regeneration of the
archive as long as the message is present.
3. People who use conformant software implemented competently should
be given precedence.
4. Users who don't subscribe to the archiver's client but somehow get
their hands on a message ID can use Google to find it (and the
rest of the thread).
5. People who use software that doesn't conform will suffer.
Brad> For some time now, I've been arguing that they should use a
Brad> hash of the relevant information (maybe all the headers,
Brad> maybe just selected headers, maybe the entire message,
Brad> whatever is reasonable to assume will survive), making sure
Brad> to at least include the value of the "Date:", "Message-ID:",
Brad> and "Received:" headers as part of that input.
This gives 1 and 2, but not 3, 4, and 5. (No, you can't generate a
Google search item from knowledge of the algorithm because you don't
necessarily have the Received headers.) Seems like overkill for Step
1 of the algorithm, too.
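To make the Received problem concrete, here is a small Python sketch of a header hash along the lines Brad describes. The function name, the choice of SHA-256, and the way the fields are serialized are my assumptions for illustration; Brad's proposal does not specify any of them.

```python
import hashlib
from email.message import Message


def header_hash(msg, fields=("Date", "Message-ID", "Received")):
    """Hash the selected header fields of msg (hypothetical scheme)."""
    digest = hashlib.sha256()
    for name in fields:
        # get_all returns every occurrence, which matters for Received.
        for value in msg.get_all(name, []):
            line = "%s:%s\n" % (name.lower(), str(value).strip())
            digest.update(line.encode("utf-8", "surrogateescape"))
    return digest.hexdigest()
```

Two copies of the same post that travelled different routes carry different Received fields and so hash differently, and a reader who has only the Message-ID and Date cannot reproduce the archive's hash at all.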
--
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business;
ask what your business can "do for" free software.