Re: [Mailman-Developers] Google Summer of Code: Integration of Search Code

28 Mar 2012


On Mar 28, 2012, at 10:29 AM, Stephen J. Turnbull wrote:

>The only tricky issue is that we *do* have to worry about message-ID
>collisions of truly different messages and about messages without message
>IDs, especially for converted historical archives.  So the API needs to be
>able to deal with these issues, probably by returning a set or sequence of
>messages.

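To make the "set or sequence" idea concrete, here is a minimal sketch of a lookup that always returns a list, so that colliding Message-IDs and messages with no Message-ID at all are both representable. The `MessageIndex` class and its method names are hypothetical, invented for illustration; they are not part of Mailman or of any proposed archiver API.

```python
from collections import defaultdict
from email import message_from_string


class MessageIndex:
    """Hypothetical in-memory index keyed by Message-ID (illustration only)."""

    def __init__(self):
        # Each key maps to a list, so two different messages that share
        # a Message-ID are both kept instead of one overwriting the other.
        self._by_id = defaultdict(list)

    def add(self, msg):
        # Messages without a Message-ID header are filed under None.
        self._by_id[msg.get('Message-ID')].append(msg)

    def lookup(self, message_id):
        # Always return a (possibly empty) list, never a single message.
        return list(self._by_id.get(message_id, []))


index = MessageIndex()
index.add(message_from_string("Message-ID: <x@example.com>\n\nfirst"))
index.add(message_from_string("Message-ID: <x@example.com>\n\nsecond"))
index.add(message_from_string("Subject: no id\n\nthird"))
```

With this shape, a caller that expects exactly one message has to handle the zero-match and multi-match cases explicitly.
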
Mailman 3 itself requires unique Message-IDs.  IIRC, the Mail Archive guys
found a very very low collision rate over millions of messages, and I think
all such cases were basically spam.  The LMTP runner doesn't yet reject
duplicates, but it should (LP: #967951).

>I would guess she'll probably store messages in YY-MM/MSGID, or as git does
>in "unpacked" XX/YYYYYYYY... format, where XX are the first two digits of the
>hash ID, and YY... are the remaining ones).  But it could easily be backed by
>an IMAP store or something more specialized; we don't really care as long as
>it's object-ID-addressable.

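As a rough sketch of the git-style "unpacked" layout mentioned above, the function below splits an object ID into a two-character directory and a file name made of the remaining characters. The name `object_path` is hypothetical, and nothing here says which hash or ID the store would actually use; it only shows the XX/YYYY... split.

```python
from pathlib import PurePosixPath


def object_path(object_id: str) -> PurePosixPath:
    """Map an object ID to an XX/YYYY... relative path (illustrative helper)."""
    if len(object_id) < 3:
        raise ValueError("object ID too short to split into XX/YYYY...")
    # First two characters name the directory, the rest name the file.
    return PurePosixPath(object_id[:2]) / object_id[2:]
```

Splitting on the first two characters keeps any single directory from accumulating every object, which is the usual motivation for this layout.
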
Don't forget too that the LMTP runner automatically adds the X-Message-ID-Hash
header, which is a Base32 encoding of the SHA1 hash of the Message-ID contents
(without the angle brackets).  This hash could be used as well.
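For reference, the hash described above can be reproduced with the standard library along these lines. This is a sketch based only on the description in this message, not a copy of Mailman's code: the function name `message_id_hash` is made up, and encoding the Message-ID as UTF-8 before hashing is an assumption.

```python
import hashlib
from base64 import b32encode


def message_id_hash(message_id: str) -> str:
    """Base32-encoded SHA1 of a Message-ID without its angle brackets."""
    mid = message_id.strip()
    # Drop the surrounding angle brackets if they are present.
    if mid.startswith('<') and mid.endswith('>'):
        mid = mid[1:-1]
    # Assumption: the Message-ID is encoded as UTF-8 before hashing.
    digest = hashlib.sha1(mid.encode('utf-8')).digest()
    return b32encode(digest).decode('ascii')
```

A SHA1 digest is 20 bytes, so the Base32 form is always 32 characters with no padding, which makes it convenient as a file name or URL component.
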
-Barry


Barry Warsaw
