Barry Warsaw writes:
First, I want to avoid talking about file system layout. To me,
that's an implementation detail we needn't worry about right now.
How likely is it that two messages with the same message-id and date are /not/ duplicates?
For message id generators that include a time-stamp in the generated id, approximately the same as the probability that two messages with the same message-id are not duplicates, no?
Heck, at that point, I'd feel justified in simply automatically rejecting the duplicate and chucking it from the archive.
I'd rather not go there. There may be applications for the archiver that require that all mail received be filed.
Counterproposal: have a "collisions" namespace, and provide an interface for the list owner to decide what to do with the messages that land in it. They could be thrown away, they could be given an alternative global ID somehow and added (e.g., the archive page could add a "See probable duplicates too" link), or they could be put into a moderation-like queue for list admins to decide about.
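To make the counterproposal a bit more concrete, here is a minimal sketch, not a proposed API. The names `CollisionStore`, `Policy`, and `add` are all hypothetical, and the key is assumed to be simply the (Message-ID, Date) pair under discussion.

```python
from enum import Enum


class Policy(Enum):
    """What the list owner wants done with a colliding message (hypothetical)."""
    DISCARD = "discard"      # throw the colliding message away
    ALTERNATE = "alternate"  # file it under an alternative global ID
    HOLD = "hold"            # queue it for the list admins to decide


class CollisionStore:
    """Hypothetical message store with a separate "collisions" namespace."""

    def __init__(self, policy):
        self.policy = policy
        self.messages = {}    # key -> message text
        self.collisions = {}  # key -> list of colliding message texts
        self.held = []        # (key, text) pairs awaiting an admin decision

    def add(self, message_id, date, text):
        """Store a message; return the ID it was filed under, or None."""
        key = (message_id, date)
        if key not in self.messages:
            self.messages[key] = text
            return key
        if self.policy is Policy.DISCARD:
            return None
        if self.policy is Policy.ALTERNATE:
            dups = self.collisions.setdefault(key, [])
            dups.append(text)
            # Alternative ID: the original key plus a sequence number.
            return key + (len(dups),)
        # Policy.HOLD: nothing is filed until a list admin decides.
        self.held.append((key, text))
        return None
```

Note that under every policy the first message stays where it is; only the later arrival is discarded, filed separately, or held, so an application that must file all mail would pick `ALTERNATE` or `HOLD`.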
So now, think of the interface to a message store that supports this
addressing scheme. Well, it's something like:
I don't understand how the calling application is supposed to deal with a DuplicateMessageError exception, since it should change neither the Message-ID nor the Date if they are present.
I see this as a major problem with any proposal to use only author headers in computing the "global id".
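A small sketch of why this bites. The function `global_id` below is hypothetical, and hashing Message-ID plus Date with SHA-1 and base32-encoding the digest is only an assumption made for illustration; the point holds for any ID computed from author-supplied headers alone.

```python
import base64
import hashlib
from email import message_from_string


def global_id(msg):
    """Hypothetical global id: base32 of SHA-1 over Message-ID and Date."""
    h = hashlib.sha1()
    h.update(str(msg.get("Message-ID", "")).encode("utf-8"))
    h.update(b"\0")  # separator so the two fields cannot run together
    h.update(str(msg.get("Date", "")).encode("utf-8"))
    return base64.b32encode(h.digest()).decode("ascii")


HEADERS = (
    "Message-ID: <1234@example.org>\n"
    "Date: Mon, 01 Jan 2007 12:00:00 +0000\n"
    "\n"
)

# Same author headers, different content: a mistaken send, then a resend.
first = message_from_string(HEADERS + "Oops, sent too early.\n")
second = message_from_string(HEADERS + "Here is what I meant to say.\n")

# The bodies differ, but the two messages get the same global id.
same_id = global_id(first) == global_id(second)
```

The archiver cannot tell these two apart by ID, and it is not allowed to rewrite either header to make them distinct.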
Or by using the global id, or by rejecting messages with duplicate
Er, the MTA has already accepted it. Do you plan to generate a list manager bounce to the poster? That has the unpleasant misfeature that it could be used to bounce spam off the list manager: the bounce has to include content, because the poster needs to see it to determine whether this is a multiple send or actually the "intended version" after a "fat-finger" send. We already know the message-id isn't good enough for that.