Re: [Mailman-Developers] Need more information about the GSoC idea 'GitLab/development tools integration'
On 3/12/16 4:19 PM, Wasim Thabraze wrote:
Hello everyone,
The GSoC idea mentions 'a tool that will take a thread from a Mailman mailing list and..'. I would like to know the different ways to extract a thread from a mailing list.
I have been trying to understand the idea for a couple of days, but in vain. I couldn't even find documentation on thread extraction.
Does a thread have a unique ID? If so, how should I extract the thread using that ID?
Are there any links that would help me understand this better?
Awaiting reply.
Thank You
Regards, Wasim
The normal definition of a 'Thread' is a chain of messages linked by In-Reply-To: and References: headers to Reference-Id: headers.
Every email message has a unique Reference-Id: header to identify it.
When a person replies to that message, their email program is supposed to add an In-Reply-To: header with the Reference-Id of the message being replied to, or a References: header containing a copy of the References: header of that message with its Reference-Id appended at the end (possibly with older entries trimmed if the header gets too long). This chain defines a 'Thread'.
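As a rough illustration of the header chain described above, here is a minimal Python sketch that finds the parent of a single message using only the standard library `email` package. The helper name `parent_id` and the example.org addresses are made up for this example. Note that the header carrying a message's own identifier is spelled Message-ID in RFC 5322, and that RFC 5322 puts the ID of the replied-to message last in References:.

```python
import re
from email import message_from_string

# Message IDs are written in angle brackets, e.g. <abc@example.org>.
MSG_ID = re.compile(r"<[^<>\s]+>")

def parent_id(msg):
    """Return the ID of the message this one replies to, or None."""
    # References: lists ancestors oldest first, so the direct parent is last.
    refs = MSG_ID.findall(msg.get("References", ""))
    if refs:
        return refs[-1]
    # Fall back to In-Reply-To: when References: is absent or empty.
    irt = MSG_ID.findall(msg.get("In-Reply-To", ""))
    return irt[0] if irt else None

raw = (
    "Message-ID: <b@example.org>\n"
    "In-Reply-To: <a@example.org>\n"
    "References: <root@example.org> <a@example.org>\n"
    "Subject: Re: hello\n"
    "\n"
    "body\n"
)
reply = message_from_string(raw)
print(parent_id(reply))  # <a@example.org>
```

Following `parent_id` from message to message until it returns None walks up the chain to the start of the thread.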
One wrinkle is plain text digest users. They tend not to have the Reference-Id of the message they are replying to (one of the limitations of the plain text digest), so they tend to break threads. The bigger issue is that they often send a message by replying to the digest itself, so two messages that refer to the same message in their In-Reply-To: or References: headers should probably not be linked together as a thread if the common Message-Id isn't present in the archive (it is likely the digest). I think this is what the current Mailman 2 archives do.
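One possible reading of the digest caveat is to follow a parent link only when the parent is a message the archive actually holds. The sketch below assumes the parent IDs have already been extracted into a plain dict; the function name `thread_roots` and the sample IDs are hypothetical, and this is not a description of how the Mailman 2 archiver is implemented.

```python
def thread_roots(messages):
    """Map each message ID to the ID of the first message in its thread.

    `messages` maps message ID -> parent ID (or None for no parent).
    """
    roots = {}
    for mid in messages:
        seen = {mid}
        cur = mid
        while True:
            parent = messages.get(cur)
            # Stop at a missing parent (e.g. a digest) or on a header loop.
            if parent is None or parent not in messages or parent in seen:
                break
            seen.add(parent)
            cur = parent
        roots[mid] = cur
    return roots

archive = {
    "<a@x>": None,
    "<b@x>": "<a@x>",       # ordinary reply to <a@x>
    "<c@x>": "<digest@x>",  # reply to a digest that is not in the archive
    "<d@x>": "<digest@x>",  # another reply to the same digest
}
roots = thread_roots(archive)
print(roots["<b@x>"])                   # <a@x>
print(roots["<c@x>"], roots["<d@x>"])   # <c@x> <d@x>
```

The two digest replies share a parent ID, but because that ID matches no held message, each one starts its own thread.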
On 03/12/2016 01:44 PM, Richard Damon wrote:
The normal definition of a 'Thread' is a chain of messages linked by In-Reply-To: and References: headers to Reference-Id: headers.
Every email message has a unique Reference-Id: header to identify it.
Richard's answer is generally good, but the name of the header is Message-Id:, not Reference-Id:.
The definitive reference on threading is <https://www.jwz.org/doc/threading.html>
Some mail clients (mostly Microsoft ones I think) put a Thread-Index: header in messages, but for various reasons this is not useful for determining threading on an email list with posts from many sources.
-- 
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
Mark Sapiro writes:
The definitive reference on threading is <https://www.jwz.org/doc/threading.html>
There's also RFC 5256, based on Jamie's page, which defines threading in the context of IMAP, and gives a concise account of the algorithm.
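For a feel of what those references describe, here is a much simplified Python sketch of only the first stage: linking messages through their References: chains. Both jwz's page and the REFERENCES algorithm in RFC 5256 go further, pruning empty placeholders and grouping by subject, and none of that is attempted here. The name `build_tree`, the input shape, and the sample IDs are assumptions made for illustration.

```python
def build_tree(messages):
    """Link messages by their References: chains.

    `messages` is a list of (message_id, [reference IDs, oldest first]).
    Returns a dict mapping parent ID -> list of child IDs; the key None
    holds the roots. IDs that are referenced but never seen still get
    an entry, acting as placeholders for missing messages.
    """
    parent = {}

    def reachable(start, target):
        # True if target is start itself or one of its ancestors.
        while start is not None:
            if start == target:
                return True
            start = parent.get(start)
        return False

    for mid, refs in messages:
        parent.setdefault(mid, None)
        chain = list(refs) + [mid]
        for older, newer in zip(chain, chain[1:]):
            parent.setdefault(older, None)
            parent.setdefault(newer, None)
            # Keep an existing link, and never create a loop.
            if parent[newer] is None and not reachable(older, newer):
                parent[newer] = older

    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)
    return children

# <b@x> is referenced but was never received.
tree = build_tree([
    ("<c@x>", ["<a@x>", "<b@x>"]),
    ("<a@x>", []),
])
print(tree[None])     # ['<a@x>']
print(tree["<a@x>"])  # ['<b@x>']
print(tree["<b@x>"])  # ['<c@x>']
```

Because the missing `<b@x>` gets a placeholder, `<c@x>` still ends up in the thread rooted at `<a@x>` even though its direct parent never arrived.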