[Tutor] parsing email

Mike Hansen mhansen at cso.atmel.com
Wed Jan 28 14:03:32 EST 2004


Thanks Karl.

That's what I was looking for.

I'll also look at spambayes.

Mike

Karl Pflästerer wrote:

>On 28 Jan 2004, Mike Hansen <- mhansen at cso.atmel.com wrote:
>
>>Getting attacked by MyDoom at our site, our NT Admin asked if we could
>>write a script to delete messages. In the time it would have taken us
>
>It's nice to have an admin who gives you such freedom.
>
>>thought I'd try to write a script for when this happens in the
>>future. (Another virus, another mess)
>
>Right.  And don't forget the fun of learning how to do the task.
>
>>Is there a method of the email parser that can retrieve the subject? 
>
>I think you are mixing two different tasks: parsing the e-mail and
>fetching it.
>
>>Is there a dictionary that has a key with the subject? Is using the
>>email parser the best way to do this?
>
>IMO no.  If you use POP3 to fetch the e-mails, use poplib.  Then fetch
>the header of each e-mail and look at the values; you could use TOP to
>fetch just the header plus the first few lines of an e-mail and filter
>on those.  That will help decrease traffic.  Alternatively, use
>imaplib.
>
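
A minimal sketch of the poplib/TOP idea above, assuming Python 3 and a
POP3 server that supports the optional TOP command. The helper names
(subjects_via_top, mark_for_deletion), the host name and the phrase list
are made up for illustration; they are not part of poplib or of this
thread. Only stat(), top(), dele() and quit() are real poplib.POP3
methods.

```python
import email


def subjects_via_top(conn):
    """Yield (message_number, subject) for every message in the mailbox.

    conn is assumed to be a logged-in poplib.POP3 (or POP3_SSL) object.
    """
    count, _size = conn.stat()
    for num in range(1, count + 1):
        # TOP <num> 0 returns the headers and zero lines of the body.
        _resp, lines, _octets = conn.top(num, 0)
        headers = email.message_from_bytes(b"\r\n".join(lines))
        # str() guards against a Header object for non-ASCII subjects.
        yield num, str(headers.get("Subject", ""))


def mark_for_deletion(conn, phrases):
    """Mark messages whose subject contains one of the given phrases.

    phrases is a hypothetical list of lower-case strings to match.
    Returns the message numbers that were marked.
    """
    marked = []
    for num, subject in subjects_via_top(conn):
        if any(phrase in subject.lower() for phrase in phrases):
            conn.dele(num)
            marked.append(num)
    return marked


# Possible usage (host, user and password are placeholders):
#
#   import poplib
#   conn = poplib.POP3("mail.example.com")
#   conn.user("someone")
#   conn.pass_("secret")
#   print(mark_for_deletion(conn, ["some suspicious phrase"]))
#   conn.quit()
```

dele() only marks a message; the server removes it when the session is
closed with quit(), so it is worth printing the subjects first and
checking them before enabling the deletion.
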
>If you already have the whole e-mails in your spool and you just want
>to decide whether they are to be deleted, you may want to look at
>http://spambayes.sourceforge.net/
>
>But if you want to use the email parser, you can read the files, and
>the resulting message object can indeed be used like a dictionary with
>the header names as keys.
>
>Suppose I had a message in the file 64856.msg; then I could parse it
>with:
>
>>>> import email
>>>> msg = email.message_from_file(open("64856.msg"))
>>>> msg.keys()
>['Received', 'X-Hamster-Info', 'From', 'To', 'Date', 'Subject', 'Message-ID', 'Return-Path']
>
>Now I could use these keys to retrieve the values and decide if the
>e-mail is spam.
>
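
To illustrate the dictionary-style access described above, here is a
small sketch, assuming Python 3. The message text and addresses are
invented sample data, and message_from_string() is used instead of a
file so that the example runs without 64856.msg on disk.

```python
import email

# Invented sample message; a real one would come from a file or from POP3.
RAW = """\
From: sender@example.com
To: recipient@example.com
Subject: test message

Body text.
"""

msg = email.message_from_string(RAW)

print(msg.keys())        # ['From', 'To', 'Subject']
print(msg["Subject"])    # test message
print(msg["subject"])    # header lookup is case-insensitive: test message
print(msg["X-Missing"])  # a missing header gives None, not a KeyError
```

Because a missing header returns None, msg.get("Subject", "") is handy
when the value is going to be passed to string methods.
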
>
>   Karl