[Tutor] How to parse a mailing list thread?
Cameron Simpson
cs at zip.com.au
Sun Sep 20 07:41:31 CEST 2015
On 19Sep2015 21:46, chandan kumar <chandankumar.093047 at gmail.com> wrote:
>I am looking for a python module which i can use to parse mailing thread
>and extract some information from it.
>
>Any pointer regarding that would be helpful.
You should describe where the email messages are stored. I'll presume you have
obtained the messages.
Construct a Message object from each message text. See the email.message
module:
https://docs.python.org/3/library/email.message.html#module-email.message
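As a rough sketch of that first step, assuming each message is already available as a string (the sample text and the example.org addresses below are invented for illustration), the standard library's email.message_from_string() will build the Message object:

```python
import email

# A made-up message; real text would come from wherever your archive is stored.
raw = (
    "From: Alice Example <alice@example.org>\n"
    "To: tutor@example.org\n"
    "Subject: How to parse a mailing list thread?\n"
    "Message-ID: <1@example.org>\n"
    "\n"
    "Body text.\n"
)

# Parse the text into an email.message.Message object.
msg = email.message_from_string(raw)

# Headers are looked up by name, case-insensitively.
print(msg["Subject"])
print(msg["message-id"])
print(msg.get_payload())
```

If your messages are bytes rather than text, email.message_from_bytes() does the same job.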
Every message has a Message-ID: header which uniquely identifies it. Replies to
that message have that id in the In-Reply-To: header. (If you're parsing Usenet
newsgroup messages, you want the References: header - personally I consult
both.)
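Something like the following could collect the candidate parent ids from both headers. It is only a sketch: parent_ids is a name I made up, and it assumes the ids are whitespace-separated and that References: lists ancestors oldest first, which is the usual convention but not something every mailer honours.

```python
import email

def parent_ids(msg):
    """Return candidate parent message ids, most likely parent first.

    Hypothetical helper: In-Reply-To is tried first, then References
    read from the end (the last id is normally the direct parent).
    """
    in_reply_to = msg.get("In-Reply-To", "").split()
    references = msg.get("References", "").split()
    ids = []
    for mid in in_reply_to + references[::-1]:
        if mid not in ids:  # drop duplicates, keep order
            ids.append(mid)
    return ids

# Made-up reply for illustration.
reply = email.message_from_string(
    "Message-ID: <3@example.org>\n"
    "In-Reply-To: <2@example.org>\n"
    "References: <1@example.org> <2@example.org>\n"
    "\n"
    "Reply body.\n"
)
print(parent_ids(reply))
```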
The complete specification of an email message is here:
http://tools.ietf.org/html/rfc2822
and the email.message module (and the other email.* modules) makes most of it
easily available. If you need to parse email addresses, import the
"getaddresses" function from the "email.utils" module.
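For example, a minimal sketch of pulling the recipients out of the To: and Cc: headers (the message and addresses are invented):

```python
import email
from email.utils import getaddresses

# Made-up message for illustration.
raw = (
    "From: Alice Example <alice@example.org>\n"
    "To: tutor@example.org, Bob <bob@example.org>\n"
    "Cc: carol@example.org\n"
    "\n"
    "Hello.\n"
)
msg = email.message_from_string(raw)

# get_all() returns every occurrence of a header (or the default if none);
# getaddresses() turns the header values into (real name, address) pairs.
recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
for name, addr in recipients:
    print(repr(name), addr)
```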
Construct a graph connecting messages with the replies. You're done!
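One possible shape for that graph is a dict mapping each message id to the ids of its direct replies. This is a sketch under assumptions, not the only way to do it: build_thread_graph and the sample messages are invented here, and a message whose parent is not in your collection is simply treated as the start of a thread.

```python
import email
from collections import defaultdict

def build_thread_graph(messages):
    """Return (roots, replies): thread-starting ids and parent -> [reply ids]."""
    by_id = {m["Message-ID"].strip(): m for m in messages if m["Message-ID"]}
    replies = defaultdict(list)
    roots = []
    for mid, m in by_id.items():
        # Prefer In-Reply-To, then fall back to References, newest first.
        candidates = (m.get("In-Reply-To", "").split()
                      + m.get("References", "").split()[::-1])
        parent = next((c for c in candidates if c in by_id and c != mid), None)
        if parent is None:
            roots.append(mid)  # no known parent: starts a thread
        else:
            replies[parent].append(mid)
    return roots, dict(replies)

# Three made-up messages forming one chain.
texts = [
    "Message-ID: <1@example.org>\n\nQuestion.\n",
    "Message-ID: <2@example.org>\nIn-Reply-To: <1@example.org>\n\nAnswer.\n",
    "Message-ID: <3@example.org>\nIn-Reply-To: <2@example.org>\n"
    "References: <1@example.org> <2@example.org>\n\nThanks.\n",
]
roots, replies = build_thread_graph([email.message_from_string(t) for t in texts])
print(roots)
print(replies)
```

From there you can walk the dict recursively from each root to print or process a thread in reply order.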
Cheers,
Cameron Simpson <cs at zip.com.au>