[Tutor] parsing email
Karl Pflästerer
sigurd at 12move.de
Wed Jan 28 13:41:37 EST 2004
On 28 Jan 2004, Mike Hansen <- mhansen at cso.atmel.com wrote:
> Getting attacked by MyDoom at our site, our NT Admin asked if we could
> write a script to delete messages. In the time it would have taken us
It's nice to have an admin who gives you such freedom.
> thought I'd try to write a script for when this happens in the
> future. (Another virus, another mess)
Right. And don't forget the fun of learning how to do the task.
> Is there a method of the email parser that can retrieve the subject?
I think you are mixing up two different things: parsing the e-mail and
fetching it.
> Is there a dictionary that has a key with the subject? Is using the
> email parser the best way to do this?
IMO no. If you use POP3 to fetch the e-mails, use poplib. Then fetch
the headers of each e-mail and look at the values; you could use
TOP to fetch just the headers plus the first few lines of an e-mail and
filter on those. That will help reduce traffic. Alternatively, use
imaplib.
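A minimal sketch of the TOP approach, assuming Python 3. The helper
names find_suspect_messages and subject_contains, the host, and the
login details are made up for illustration; only the poplib and email
calls are from the standard library.

```python
import email
import poplib


def find_suspect_messages(conn, is_suspect, body_lines=0):
    """Return the numbers of messages whose headers satisfy is_suspect.

    conn is a logged-in poplib.POP3 (or POP3_SSL) object.
    is_suspect is a hypothetical callable taking an email Message.
    """
    count, _size = conn.stat()
    suspects = []
    for number in range(1, count + 1):
        # TOP returns the headers plus body_lines lines of the body,
        # so the full message is never downloaded.
        _resp, lines, _octets = conn.top(number, body_lines)
        msg = email.message_from_bytes(b"\r\n".join(lines))
        if is_suspect(msg):
            suspects.append(number)
    return suspects


def subject_contains(needles):
    """Build a predicate that looks for any needle in the Subject header."""
    lowered = [n.lower() for n in needles]

    def check(msg):
        subject = str(msg["Subject"] or "").lower()
        return any(n in subject for n in lowered)

    return check


# Usage against a real server (placeholder host and credentials):
# conn = poplib.POP3("mail.example.com")
# conn.user("name")
# conn.pass_("secret")
# for n in find_suspect_messages(conn, subject_contains(["some phrase"])):
#     conn.dele(n)   # marked for deletion, removed when quit() is called
# conn.quit()
```

Keeping the filter as a separate function means the same test can be
reused if you later switch to imaplib or to files in a spool.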
If you already have the whole e-mails in your spool and you just want to
decide whether they should be deleted, you may want to look at
http://spambayes.sourceforge.net/
But if you want to use the email parser, you can read the files; the
resulting message object then behaves like a dictionary with the header
field names as keys. Suppose I had a message in the file 64856.msg; then
I could parse it with:
>>> import email
>>> msg = email.message_from_file(open("64856.msg"))
>>> msg.keys()
['Received', 'X-Hamster-Info', 'From', 'To', 'Date', 'Subject', 'Message-ID', 'Return-Path']
>>>
Now I could use these keys to retrieve the values and decide whether
the e-mail is spam.
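A small sketch of that lookup, assuming Python 3. The message text and
the looks_like_spam helper are invented for illustration; a real rule
would depend on what the virus mails actually look like.

```python
import email

# A made-up message standing in for the contents of a spool file.
raw = (
    "From: someone@example.com\n"
    "To: you@example.com\n"
    "Subject: Test message\n"
    "\n"
    "Body text.\n"
)
msg = email.message_from_string(raw)

subject = msg["Subject"]        # header names are matched case-insensitively
missing = msg["X-Not-There"]    # a missing header gives None, not KeyError
sender = msg.get("From", "")    # get() lets you supply a default instead


def looks_like_spam(msg, bad_subjects):
    """Hypothetical rule: flag a message whose subject is on a blocklist."""
    subject = str(msg["Subject"] or "").strip().lower()
    return subject in {s.lower() for s in bad_subjects}
```

With the message above, looks_like_spam(msg, ["test message"]) is true
and looks_like_spam(msg, ["hello"]) is false.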
Karl
--
Please do *not* send copies of replies to me.
I read the list