[Email-SIG] API for email threading library?
janssen at parc.com
Thu Jan 5 18:55:41 CET 2012
Folks, I'm working on an implementation of RFC 5256 email threading,
designed so that it could fit as a submodule in the "email" package, if
such a thing were ever seen to be useful.
I'd like to ask "the wisdom of the crowd" what an appropriate
interface to such a thing would be. The basic operation is that you
create a collection (type C) of email threads (type T) by passing a set
of messages (type M) to the constructor.
* Should M be required to be "email.message.Message", or perhaps some
less restrictive type, say "ThreadableMessageAPI"? All that's
strictly required is the ability to retrieve the Message-ID, Subject,
Date, References, and In-Reply-To fields.
* What operations should be possible on C? Some that come to mind:
    * retrieve_thread (M or message-id) => T
    * add_message (M) => T
    * add_messages (set of M) => None
    * remove_message (M or message-id) => T (or None)?
* What's the interface for T? It's a tree with possible dummy nodes, so
a tuple of messages plus nested tuples would do it. What should the
nodes in the tree be? Normalized (see RFC 5256) Message-IDs?
* For large sets of threads (millions of messages) a persistence
mechanism would be useful. Should there be a standard interface to
such a mechanism, perhaps as class methods on C? If so, what should
it look like? Should the implementation contain a default persistent
subclass of C, based on sqlite3? What side-effects would persistence
requirements have on the other design considerations? For instance,
would you have to save the entire text of a message for each node?
Just the headers? Just some of the headers? Just the Message-ID?
Have at it! Advise away!