Jeff Breidenbach writes:
Notice that of 325146 total messages, 624 of them had no message-id header. Even if you aggregate dup+col, you're still looking at a total duplicate rate of 0.29%.
Message-IDs are supposed to be unique.
Fortunately, a rule more honored in the observance than the breach. Nonetheless, it *is* breached. The Postel Principle applies here, IMO.
better to go ahead and use the message-id, rather than concoct yet another "this time we mean it!" unique identifier.
That's not the point. We're not going to impose this on senders; that's what Message-ID is for, as you say. If a sender won't provide a proper Message-ID, third parties who get a CC are just out of luck.
I simply think we should be prepared for applications where relying on the sender to supply a UUID is not acceptable; we need to be able to provide one ourselves. Creating UUIDs is a solved problem, after all. So we just specify a header to put it in, and subscribers will be able to use it, per definition of a canonical URL.
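To make that concrete, here is a minimal sketch of an archiver stamping its own identifier on a message, assuming Python's standard uuid and email modules. The header name X-Archive-UUID and the function stamp_archive_id are made up for illustration; nothing in this thread has settled on a name.

```python
import uuid
from email.message import EmailMessage

# Hypothetical header name, chosen only for this sketch.
ARCHIVE_ID_HEADER = "X-Archive-UUID"


def stamp_archive_id(msg):
    """Add a locally generated UUID unless the message already carries one.

    Returns the identifier as a string either way.
    """
    if msg[ARCHIVE_ID_HEADER] is None:
        # uuid4() is random-based, so the list server needs no
        # cooperation from the sender to produce it.
        msg[ARCHIVE_ID_HEADER] = str(uuid.uuid4())
    return str(msg[ARCHIVE_ID_HEADER])


# A message that arrived without any Message-ID header.
msg = EmailMessage()
msg["From"] = "sender@example.org"
msg["Subject"] = "no Message-ID here"
msg.set_content("body")

archive_id = stamp_archive_id(msg)
```

Calling stamp_archive_id a second time on the same message returns the same value, so the identifier stays stable once the list server has assigned it.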
Then we say that an archive SHOULD provide access to the resource via Message-ID if available, and define how to construct that URL from the List-Archive and Message-ID headers.
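One possible construction, as a sketch only: strip the angle brackets from both headers, percent-encode the Message-ID, and append it to the List-Archive base. The path layout (base URL plus "/" plus the encoded ID) and the name archive_url are my assumptions, not anything that has been specified, and the sketch ignores comments that may follow the bracketed URL in List-Archive.

```python
from email.message import EmailMessage
from urllib.parse import quote


def archive_url(msg):
    """Build an archive URL from List-Archive and Message-ID, if both exist.

    Assumed layout (hypothetical): <List-Archive base>/<percent-encoded id>.
    Returns None when either header is missing.
    """
    list_archive = msg["List-Archive"]
    message_id = msg["Message-ID"]
    if list_archive is None or message_id is None:
        return None
    # Both headers conventionally wrap their value in angle brackets.
    base = str(list_archive).strip().strip("<>").rstrip("/")
    bare_id = str(message_id).strip().strip("<>")
    # safe="" so that "@" and "/" inside the ID are encoded too.
    return base + "/" + quote(bare_id, safe="")


msg = EmailMessage()
msg["List-Archive"] = "<https://lists.example.org/archive/mylist>"
msg["Message-ID"] = "<abc123@mail.example.org>"

url = archive_url(msg)
# url == "https://lists.example.org/archive/mylist/abc123%40mail.example.org"
```

Encoding everything outside the unreserved set keeps a Message-ID containing "/" or "?" from being misread as URL structure.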
Which brings me to suggestion #2, which is to go ahead and write an RFC on how list servers should embed archival links in messages.
I think Barry already suggested that? Anyway, +1. But remember, a standards-track RFC should have a working implementation to point to.