[Mailman-Developers] From the creation of a ThreadID

Stephen J. Turnbull stephen at xemacs.org
Thu Apr 5 17:10:22 CEST 2012


On Thu, Apr 5, 2012 at 10:41 PM, Pierre-Yves Chibon <pingou at pingoured.fr> wrote:

> In HyperKitty to be able to easily retrieve from the database all the
> threads of a given month or just all the emails of a thread, I created a
> Field in the database called ThreadID.
> When I load the archives from mailman into mongo, I look for the absence
> of the headers 'References' or 'In-Reply-To' to define an email that
> starts a new thread.

This fails when a thread crosses channels.  Eg,

To: Pierre
From: Steve
Message-Id: <x at y.z>

is followed by

To: Steve
From: Pierre
Cc: SomeList
References: <x at y.z>
Message-Id: <a at b.c>
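
To make the failure concrete, here is a minimal sketch in Python.  The
function name starts_thread is hypothetical and only mirrors the
heuristic described in the quoted text; the addresses are written with
"@" instead of the archive's " at " so they parse as real Message-IDs.
The list only ever sees the second message, which carries References,
so the heuristic finds no thread root at all.

```python
from email import message_from_string


def starts_thread(msg):
    """Heuristic from the quoted text: no References and no In-Reply-To."""
    return msg["References"] is None and msg["In-Reply-To"] is None


# The only message of the exchange that the list archive receives.
reply_text = (
    "To: Steve\n"
    "From: Pierre\n"
    "Cc: SomeList\n"
    "References: <x@y.z>\n"
    "Message-Id: <a@b.c>\n"
    "\n"
    "body\n"
)

archive = [message_from_string(reply_text)]

# Collect the Message-IDs that the heuristic would treat as thread roots.
roots = [m["Message-Id"] for m in archive if starts_thread(m)]
```

Here roots ends up empty, even though the archive clearly contains a
thread.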

> Would anyone have an idea on how to generate a stable and delete/reload
> proof ThreadID?

I don't see how this can be possible.  Eg, in the above scenario you
construct a thread based on your reply to me.  Then I go, "oh, really
I should have posted to mm-dev" and repost the thread.  Now the root
of the thread is a different message, so "Message-ID of root message"
fails as a ThreadID, and I don't see an alternative that can be
predicted.  So it may as well be arbitrary (eg, any message in the
thread) and stored in the database with appropriate linkage from
thread IDs to message IDs (one-to-many), and vice versa (many-to-one).
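
A rough sketch of that linkage, assuming plain in-memory dicts rather
than the real database.  The class ThreadIndex and its attribute names
are made up for illustration and are not HyperKitty code.  The thread
ID is an arbitrary UUID, and a message that arrives later joins
whichever thread already knows one of the IDs involved.

```python
import uuid


class ThreadIndex:
    """Hypothetical two-way mapping between thread IDs and message IDs."""

    def __init__(self):
        self.thread_of = {}  # message ID -> thread ID (many-to-one)
        self.members = {}    # thread ID -> set of message IDs (one-to-many)

    def add(self, message_id, references=()):
        ids = [message_id, *references]
        # Threads that already contain this message or one it references.
        known = sorted({self.thread_of[i] for i in ids if i in self.thread_of})
        # Reuse an existing thread ID if there is one, else mint an arbitrary one.
        tid = known[0] if known else uuid.uuid4().hex
        # If the message links several threads, fold the others into tid.
        for other in known[1:]:
            for mid in self.members.pop(other):
                self.thread_of[mid] = tid
                self.members[tid].add(mid)
        group = self.members.setdefault(tid, set())
        for i in ids:
            self.thread_of[i] = tid
            group.add(i)
        return tid


index = ThreadIndex()
first = index.add("<a@b.c>", references=["<x@y.z>"])  # the reply arrives first
second = index.add("<x@y.z>")                         # the root is reposted later
```

Both calls return the same thread ID, so the late arrival of the root
does not change it.  The ID only survives a delete/reload if the
mapping itself is kept, which is why it has to live in the database.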

> The other solution of course being that I regenerate the thread on the
> fly based on the first email (which is still easy to find), but that
> will be a lot of db querying.

I haven't thought about it deeply, but I would say just give the
thread an arbitrary ID in the database.  Message-IDs are supposed to
be universally unique, so what's wrong with keeping the thread in the
database as a tree of message IDs?  Some Message-IDs will not have
corresponding messages but that's always a problem with threading (see
http://www.jwz.org/doc/threading.html, and RFC 5256).
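
As a sketch of what such a tree could look like, the following Python
builds a parent map from (Message-ID, References) pairs.  It is only
loosely modeled on the idea in the jwz document and is not an
implementation of that algorithm or of RFC 5256; the function names
are hypothetical, and References are assumed to be ordered oldest
first.  IDs that are referenced but never seen become placeholders.

```python
def _is_ancestor(parent, candidate, node):
    """Return True if candidate is node itself or one of node's ancestors."""
    while node is not None:
        if node == candidate:
            return True
        node = parent[node]
    return False


def build_tree(messages):
    """Build a parent map from (message_id, references) pairs."""
    parent = {}      # message ID -> parent message ID, or None for a root
    present = set()  # message IDs for which we actually hold a message
    for message_id, references in messages:
        present.add(message_id)
        chain = [*references, message_id]
        for mid in chain:
            parent.setdefault(mid, None)
        # Link each ID under the one before it, keeping any existing link
        # and refusing links that would create a loop.
        for older, newer in zip(chain, chain[1:]):
            if parent[newer] is None and not _is_ancestor(parent, newer, older):
                parent[newer] = older
    # Referenced IDs with no corresponding message in the archive.
    placeholders = set(parent) - present
    return parent, placeholders


messages = [
    ("<a@b.c>", ["<x@y.z>"]),
    ("<d@e.f>", ["<x@y.z>", "<a@b.c>"]),
]
parent, placeholders = build_tree(messages)
```

In this example <x@y.z> is the root of the tree but appears only as a
placeholder, because the archive never received that message.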

There are other problems with threading that need to be dealt with as
well, such as References being inconsistent across messages in the
same thread and people who continue a thread with a new message, etc.

