[Tutor] Re: Parsing the contents of an email message.

Jorge Godoy godoy at ieee.org
Wed Apr 21 15:35:43 EDT 2004


On Wed 21 Apr 2004 15:59, tpc at csua.berkeley.edu wrote:

> 
> hi Jorge, I am finishing up an application that grabs emails, passes them
> through a filter, and parses them.  I ended up using:
> 
> import MySQLdb, re
> from imaplib import IMAP4_SSL
> from email import message_from_string
> from email.Utils import parsedate_tz, mktime_tz
> from time import strftime, strptime, localtime, time
> 
> As you will see, dateTime is very important when parsing emails,
> especially when debugging your application by writing an errorlog of
> emails that didn't parse correctly.
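
A rough sketch of the date handling those imports point at. It assumes Python 3, where the module is spelled email.utils rather than email.Utils; the sample message and the helper name date_to_timestamp are made up for illustration and are not taken from the application described above.

```python
from email import message_from_string
from email.utils import parsedate_tz, mktime_tz

# Hypothetical sample message, used only for illustration.
RAW = (
    "From: someone@example.com\n"
    "Date: Wed, 21 Apr 2004 15:35:43 -0400\n"
    "Subject: test\n"
    "\n"
    "body\n"
)


def date_to_timestamp(msg):
    """Return the Date header as seconds since the epoch (UTC), or None."""
    header = msg.get("Date")
    if not header:
        return None          # no Date header at all
    parsed = parsedate_tz(header)
    if parsed is None:
        return None          # header present but not parseable
    return mktime_tz(parsed)  # applies the timezone offset from the header


msg = message_from_string(RAW)
stamp = date_to_timestamp(msg)
```

Returning None instead of raising makes it easy to write the offending message to an error log and carry on.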

Thanks.

I'm more interested in the contents of the messages, as I said, and I plan
to process them sequentially as they arrive (using a Maildir or something
similar and sorting the list of files by creation date). Each file will be
removed after it is processed successfully; otherwise processing stops at
the defective file, so that the problem can be debugged.
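
The loop described above could be sketched roughly as follows. This assumes Python 3 and a single flat directory of message files (for example the new/ subdirectory of a Maildir); the function name process_directory and the process_one callback are hypothetical, and modification time stands in for "creation date".

```python
import email
import os


def process_directory(directory, process_one):
    """Process message files oldest first; stop at the first failure.

    Returns the path of the file that failed, or None if all succeeded.
    """
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    paths = [p for p in paths if os.path.isfile(p)]
    # Assumption: modification time approximates arrival order.
    # The path is a tie-breaker so the order is deterministic.
    paths.sort(key=lambda p: (os.path.getmtime(p), p))
    for path in paths:
        try:
            with open(path, "rb") as fh:
                message = email.message_from_binary_file(fh)
            process_one(message)
        except Exception:
            # Leave the defective file (and everything newer) in place
            # so the problem can be debugged.
            return path
        os.remove(path)  # processed successfully
    return None
```

A caller would pass its own parsing function as process_one and inspect the returned path when it is not None.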

I ended up with something like this to get the contents of the five samples
I have here:

---------------------------------------------------------------------
import email
import sys

# Parse the message file named on the command line.
message = open(sys.argv[1], 'r')
parsed_message = email.message_from_file(message)
message.close()

if parsed_message.is_multipart():
    # Print the first MIME part (its headers and body).
    print(parsed_message.get_payload(0))
else:
    print("It is not a multipart message")
    print(parsed_message.get_payload())
---------------------------------------------------------------------

The "else" clause is there only for safety, since I found out that the
messages I'm interested in are all MIME-wrapped (but it also works with
other messages; I tested it on a Maildir of 154 messages).
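
One possible way to treat both cases uniformly is to walk the message and take the first text/plain part. This is only a sketch, assuming Python 3 and assuming the interesting content is plain text; the helper name first_text_part and the sample message are hypothetical.

```python
import email

# Hypothetical two-part message, used only for illustration.
SAMPLE = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    'Content-Type: text/plain; charset="us-ascii"\n'
    "\n"
    "hello from part one\n"
    "--XYZ\n"
    "Content-Type: application/octet-stream\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "AAEC\n"
    "--XYZ--\n"
)


def first_text_part(message):
    """Return the decoded text of the first text/plain part, or None."""
    # walk() yields the message itself first, so a non-multipart
    # text message is handled by the same loop.
    for part in message.walk():
        if part.get_content_type() == "text/plain":
            raw = part.get_payload(decode=True)  # undo base64 / quoted-printable
            charset = part.get_content_charset() or "us-ascii"
            return raw.decode(charset, "replace")
    return None


text = first_text_part(email.message_from_string(SAMPLE))
```

Unlike printing get_payload(0), which shows the part together with its own headers, this returns only the body text of the part.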


I'll adapt it to the program I'm writing. 


Now it's just a matter of parsing the contents of the payload. ;-)


Thanks,
-- 
Godoy.      <godoy at ieee.org>



