[Tutor] Script performance: Filtering email

Remco Gerlich scarblac@pino.selwerd.nl
Tue, 16 Jan 2001 10:51:54 +0100


On Tue, Jan 16, 2001 at 01:18:19AM -0800, Sheila King wrote:
> Strangely enough, these missing e-mails eventually appeared. Some of them,
> over an hour from the time they hit my web host's mail servers until they hit
> my mailbox. Some of them about a half hour. Some about seven minutes.

Maybe the script failed with some exception, and qmail noticed and sent
the mail again later?

I'm quite sure this is not a performance issue, but there may be a bug
somewhere.

> Could my little 26 line Python script be causing this? It doesn't seem that 45
> minutes to an hour is about normal for an email to go through on my web host.
> Performance there is usually very good, and I never saw it take that long
> before.
> 
> Please examine the script below, if you would, and let me know if I'm
> overlooking some performance issues. I'm a real Python newbie, so maybe I'm
> doing something really dumb.

> Here is my script:
> Code Sample:
> 
>  #! /big/dom/xdomain/Python-2.0/python
> 
>  import rfc822, sys, string, smtplib
> 
>  ##check if to: and cc: addresses are in my approved file
>  def goodmessage(addresseeslist, rcptlist):
>       for addr in addresseeslist:
>            for rcpt in rcptlist:
>                 if (upper(addr[1]) == upper(rcpt)):
>                      return 1
>       return 0

Are you sure you meant addr[1] there? If addresseeslist is simply a list
of mail addresses, you are comparing the second character of each address
to the whole recipient. I don't know what rfc822 returns, though; maybe
addr is something else.

This could be written:

def goodmessage(addresseeslist, rcptlist):
   for rcpt in rcptlist:
      if rcpt in addresseeslist:
          return 1
   return 0
   
(if addr[1] was an error; note that this version also drops the
case-insensitive comparison)
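
For comparison, here is a minimal sketch of the other possibility, namely
that addr is a (name, address) pair, which as far as I can tell from the
library documentation is what rfc822's getaddrlist returns. In that case
addr[1] is the address and the indexing is fine. The sketch is written for
Python 3, where the rfc822 module no longer exists, so it uses
email.utils.getaddresses to build the same kind of pairs. The function name
good_message and the sample addresses are made up for illustration. Note
also that the script as quoted imports string but calls a bare upper(),
which would raise a NameError unless upper is defined somewhere else.

```python
from email.utils import getaddresses


def good_message(addressees, approved):
    """Return True if any (name, address) pair matches an approved address.

    The comparison ignores case, like the upper() calls in the original.
    """
    approved_lower = {a.lower() for a in approved}
    return any(address.lower() in approved_lower
               for _name, address in addressees)


# Hypothetical To:/Cc: header values, parsed into (name, address) pairs.
pairs = getaddresses(["Sheila King <Sheila@Example.com>", "list@example.org"])

print(pairs)
print(good_message(pairs, ["sheila@example.com"]))
```

Building the set of lowercased approved addresses once avoids the nested
loop, although with a handful of addresses either form is fast enough.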


>  ## Retrieving data ##
> 
>  ##uncomment the below two lines if message data
>  ##  is read from stdin
>  origheaders=rfc822.Message(sys.stdin, 0)
>  body = sys.stdin.read()
> 
>  ##uncomment the below three lines if message data
>  ## is read from a file
>  #infile=open("message6.txt", "r")
>  #origheaders=rfc822.Message(infile)
>  #body = infile.read()
> 
>  ## Setting forwarding e-mails, sender e-mail
>  ## and file with list of good recipients
>  fwdgood = "privateaddr@domain.com"
>  fwdbad = "spambucket@spamcop.net"
>  sender = "myemail@mydomain.com"
>  goodrcptfile = open("GoodToList.txt", "r")   ##textfile with good address
>                                               ## for To: and cc: fields
>  goodrcptlist = string.split(goodrcptfile.read())
>  goodrcptfile.close()
> 
>  addressees =[]
>  addressees = addressees + origheaders.getaddrlist("To")
>  addressees = addressees + origheaders.getaddrlist("Cc")
> 
>  mssg = string.join(origheaders.headers,"")
>  mssg = mssg + "\n" + string.join(body, "")
> 
>  server=smtplib.SMTP("mysmtp.com")
> 
>  if goodmessage(addressees, goodrcptlist):
>       server.sendmail(sender, fwdgood, mssg)
>  else:
>       server.sendmail(sender, fwdbad, mssg)
>  server.quit()

If the connection fails for some reason or other, an exception will
be raised.

Maybe you should wrap the whole thing in a try/except to log whatever
goes wrong, like

try:
   (the body of your script)
except:
   import sys
   f=open("somelogfile","a")  # append, so earlier errors are not overwritten
   print >> f, sys.exc_info()
   f.close()

   raise # Re-raise the exception so that the program doesn't quit silently
   
If that log file fills up, you know where to look. I can't really
say more from here.
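
The same idea as a self-contained sketch for Python 3, using the standard
traceback module so that the log gets the full stack trace rather than just
the exc_info tuple. The helper name run_logged and the log file name are
assumptions for illustration, not part of the original script.

```python
import traceback

LOG_PATH = "filter_errors.log"  # hypothetical log file name


def run_logged(main, log_path=LOG_PATH):
    """Run main(); on failure append the traceback to log_path and re-raise."""
    try:
        return main()
    except Exception:
        # Append mode keeps the record of earlier failures.
        with open(log_path, "a") as log:
            traceback.print_exc(file=log)
        # Re-raise so the failure is still visible to the caller
        # (here, the mail system that invoked the script).
        raise
```

To use it, put the body of the script in a function and call
run_logged(that_function) as the last line of the file.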

-- 
Remco Gerlich