[Tutor] parsing email

Karl Pflästerer sigurd at 12move.de
Wed Jan 28 13:41:37 EST 2004


On 28 Jan 2004, Mike Hansen <- mhansen at cso.atmel.com wrote:

> Getting attacked by MyDoom at our site, our NT Admin asked if we could
> write a script to delete messages. In the time it would have taken us

It's nice to have an admin who gives you such freedom.

> thought I'd try to write a script for when this happens in the
> future.(Another virus, another mess)

Right.  And don't forget the fun of learning how to do the task.

> Is there a method of the email parser that can retrieve the subject? 

I think you are mixing up two different tasks: parsing the e-mail and
fetching it.

> Is there a dictionary that has a key with the subject? Is using the
> email parser the best way to do this?

IMO no.  If you fetch the e-mails via POP3, use poplib.  Fetch the
headers of each e-mail and look at their values; you could use the TOP
command to retrieve just the headers plus the first few lines of the
body and filter on those.  That helps to reduce traffic.  Alternatively,
use imaplib.
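
A rough sketch of the TOP approach, written for Python 3.  The function
names, the is_suspect callback and the host/user/password arguments are
made up for illustration; only list(), top(), dele(), user(), pass_()
and quit() are poplib methods.  Treat it as a starting point, not a
tested mail filter.

```python
import email
import poplib


def find_suspect_messages(conn, is_suspect):
    """Return the message numbers whose Subject satisfies is_suspect.

    conn is anything offering the list() and top() methods of
    poplib.POP3; is_suspect is a hypothetical callback taking the
    subject string and returning True or False.
    """
    suspects = []
    _resp, listing, _octets = conn.list()
    for entry in listing:
        # Each entry looks like b"1 2048": message number, then size.
        number = int(entry.split()[0])
        # TOP with 0 body lines transfers only the headers.
        _resp, lines, _octets = conn.top(number, 0)
        headers = email.message_from_bytes(b"\r\n".join(lines))
        if is_suspect(headers.get("Subject", "")):
            suspects.append(number)
    return suspects


def delete_suspects(host, user, password, is_suspect):
    """Hypothetical wrapper: log in, mark suspects for deletion, log out."""
    conn = poplib.POP3(host)
    conn.user(user)
    conn.pass_(password)
    for number in find_suspect_messages(conn, is_suspect):
        conn.dele(number)
    # The server removes the marked messages when the session ends.
    conn.quit()
```

Since find_suspect_messages only needs list() and top(), you can try it
against a small stand-in object before pointing it at a real mailbox.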

If you already have the complete e-mails in your spool and just want to
decide whether they should be deleted, you may want to look at
http://spambayes.sourceforge.net/

But if you want to use the email parser, you can read the files; the
parsed message does indeed behave like a dictionary with the header
field names as keys.

Suppose I had a message in the file 64856.msg; then I could parse it
with:

>>> import email
>>> msg = email.message_from_file(open("64856.msg"))
>>> msg.keys()
['Received', 'X-Hamster-Info', 'From', 'To', 'Date', 'Subject', 'Message-ID', 'Return-Path']
>>> 

Now I could use these keys to retrieve the values and decide if the
e-mail is spam.
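
For instance, the subject can be read with ordinary dictionary-style
access.  The sample message and the looks_unwanted helper below are
invented for illustration (Python 3); a real filter would need better
rules than a substring match.

```python
import email

# Invented sample message; in practice this comes from a file or server.
raw = (
    "From: sender@example.com\n"
    "To: you@example.com\n"
    "Subject: Test\n"
    "\n"
    "body text\n"
)
msg = email.message_from_string(raw)

# Header lookup is case-insensitive; a missing header gives None.
subject = msg["Subject"]
sender = msg.get("From", "")


def looks_unwanted(message, phrases):
    """Hypothetical check: True if any phrase occurs in the subject."""
    text = (message["Subject"] or "").lower()
    return any(phrase.lower() in text for phrase in phrases)
```

Calling looks_unwanted(msg, ["test"]) then tells you whether the message
matches one of your phrases.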


   Karl
-- 
Please do *not* send copies of replies to me.
I read the list



