[Tutor] Script performance: Filtering email

Sheila King sheila@thinkspot.net
Tue, 16 Jan 2001 01:18:19 -0800


I'd appreciate any analysis or tips on this task I'm trying to accomplish, and
my actual Python code (which is at the bottom of this message).

So, I'm writing a script to filter my e-mail. I'm using Python, but I'd assume
performance would be more or less equivalent to a script in Perl. I tested it
yesterday afternoon. Seemed fine.

My host uses qmail as an MTA. I can invoke my python script using .qmail files
each time an e-mail arrives. My plan: replace my .qmail default file with a
new version that calls my script. A few well-chosen aliases keep their own
.qmail files, which skip over the script.

I started using it for a while tonight. Further testing. It all seemed fine.
What it does: checks for certain headers in the e-mail to determine whether it
is a "good" email or a "bad" email. If good, it sends it to a private e-mail
address that I have never published or shared with anyone. Otherwise, the mail
is sent to a spam-bucket address. Like I say, seemed fine for a while.

Then, late last night (after a few hours of using the script), my e-mails
seemed to be disappearing into the void. I panicked, and took the script down.
Before taking it down, I sent myself several test mails, both ones that would
qualify as "good" mails and ones that would qualify as "bad". None seemed to
be delivered. So, I
took my filter script down. Tested if mails could now go through. They did.
Just fine.

Strangely enough, these missing e-mails eventually appeared. Some of them took
over an hour from the time they hit my web host's mail servers until they hit
my mailbox, some about half an hour, and some about seven minutes.

Could my little 26-line Python script be causing this? A delay of 45 minutes
to an hour doesn't seem normal for an e-mail going through my web host.
Performance there is usually very good, and I never saw it take that long
before.

Please examine the script below, if you would, and let me know if I'm
overlooking some performance issues. I'm a real Python newbie, so maybe I'm
doing something really dumb.

Thanks,

--
Sheila King
http://www.thinkspot.net/sheila/
http://www.k12groups.org/


Here is my script:
Code Sample:

 #! /big/dom/xdomain/Python-2.0/python

 import rfc822, sys, string, smtplib

 ##check if to: and cc: addresses are in my approved file
 def goodmessage(addresseeslist, rcptlist):
      for addr in addresseeslist:
           for rcpt in rcptlist:
                if (string.upper(addr[1]) == string.upper(rcpt)):
                     return 1
      return 0

 ## Retrieving data ##

 ##uncomment the below two lines if message data
 ##  is read from stdin
 origheaders=rfc822.Message(sys.stdin, 0)
 body = sys.stdin.read()

 ##uncomment the below three lines if message data
 ## is read from a file
 #infile=open("message6.txt", "r")
 #origheaders=rfc822.Message(infile)
 #body = infile.read()

 ## Setting forwarding e-mails, sender e-mail
 ## and file with list of good recipients
 fwdgood = "privateaddr@domain.com"
 fwdbad = "spambucket@spamcop.net"
 sender = "myemail@mydomain.com"
 goodrcptfile = open("GoodToList.txt", "r")   ##textfile with good address
                                              ## for To: and cc: fields
 goodrcptlist = string.split(goodrcptfile.read())
 goodrcptfile.close()

 addressees =[]
 addressees = addressees + origheaders.getaddrlist("To")
 addressees = addressees + origheaders.getaddrlist("Cc")

 mssg = string.join(origheaders.headers,"")
 mssg = mssg + "\n" + body    ## body is already a single string

 server=smtplib.SMTP("mysmtp.com")

 if goodmessage(addressees, goodrcptlist):
      server.sendmail(sender, fwdgood, mssg)
 else:
      server.sendmail(sender, fwdbad, mssg)
 server.quit()
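
For comparison, here is a minimal sketch of the same To:/Cc: check in current
Python, where the rfc822 module and the string-module functions used above no
longer exist. It is an illustration under assumptions, not the script as it ran
on the host: the function names good_recipients and is_good_message, the sample
addresses, and the use of a set for the approved list are all hypothetical
choices made for this example.

```python
from email import message_from_string
from email.utils import getaddresses


def good_recipients(text):
    """Return the approved addresses (whitespace-separated) as a lowercase set."""
    return {addr.lower() for addr in text.split()}


def is_good_message(raw_message, approved):
    """True if any To: or Cc: address of the message is in the approved set."""
    headers = message_from_string(raw_message)
    fields = headers.get_all("To", []) + headers.get_all("Cc", [])
    # getaddresses() yields (display name, address) pairs, much like getaddrlist().
    return any(addr.lower() in approved for _name, addr in getaddresses(fields))


# Hypothetical data standing in for GoodToList.txt and an incoming message.
approved = good_recipients("friend@example.com\nList@Example.org\n")
sample = (
    "From: someone@example.net\n"
    "To: Friend <FRIEND@example.com>\n"
    "Cc: other@example.net\n"
    "Subject: test\n"
    "\n"
    "body text\n"
)
print(is_good_message(sample, approved))
```

Lowercasing both sides once and testing set membership replaces the nested
loop, though with a short approved list the difference is unlikely to be
noticeable.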