[Spambayes] My adventures with Spambayes...

Don Chance chance at stsci.edu
Tue Aug 12 16:04:27 EDT 2003


Hi,

Yesterday, after reading about spambayes on slashdot, I downloaded it
and started playing around with it.  I have an IMAP account, so I tried
out imapfilter.py.  The following is a record of the hacks I had to
apply to get it to work.  I am posting this in the hope that it will be
of some use to those noble folks who are developing this code.

The first problem I ran into was:

> python2.2 imapfilter.py -b
SpamBayes IMAP Filter Alpha1, version 0.01 (May 2003),
using SpamBayes IMAP Filter Web Interface Alpha1, version 0.01
and engine SpamBayes Beta2, version 0.2 (July 2003).

Traceback (most recent call last):
  File "imapfilter.py", line 789, in ?
    run()
  File "imapfilter.py", line 740, in run
    pwd = options["imap", "password"][0]
IndexError: tuple index out of range

After adding the "-p" option, I was able to set things up from the web
page.
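
The traceback above comes from indexing an empty tuple.  A minimal
sketch of a guard, assuming the option value arrives as a possibly
empty tuple; first_or_prompt is a hypothetical helper, not part of
imapfilter.py, and it is written for current Python rather than 2.2:

```python
import getpass


def first_or_prompt(values, prompt="IMAP password: ", ask=getpass.getpass):
    """Return the first configured value, or ask the user if none is set.

    Hypothetical helper: `values` stands in for the tuple that
    options["imap", "password"] returned in the traceback.
    """
    if values:
        return values[0]
    # Nothing configured yet, so fall back to an interactive prompt
    # instead of raising IndexError.
    return ask(prompt)
```

With something like this, an unset password would lead to a prompt
rather than a crash on the first run.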

Next, I tried training the filter, but kept getting:

imaplib.error: APPEND command error: BAD ['Invalid date-time in Append 
command']

I added some print statements to see what was going on.  After staring
at the time string that caused the problem for a long time, I finally
figured out what was causing the error: the last part of the time
was "+000" instead of "+0000".  I added the following code to work
around the problem:

385,392d380
<         msg_time_list = str(msg_time).replace('"', '').strip().split()
<         if len(msg_time_list) > 2 and len(msg_time_list[2]) == 4:
<             msg_time = '"' + string.join(msg_time_list) + '0"'
<         if options["globals", "verbose"]:
<             print "folder name:", self.folder.name
<             print "flags:      ", flags
<             print "msg_time:   ", msg_time
<         
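
The same repair can be sketched as a standalone function.  This
assumes the short offset is missing a leading zero in the hours field,
which is a guess; this post does not establish where the string came
from.  For "+000" the result matches the hack above, but for a nonzero
offset such as "+100" it pads to "+0100" where appending a zero would
give "+1000".  pad_zone is a hypothetical name, and the code targets
current Python:

```python
import re

# A sign followed by exactly three digits at the very end of the
# date-time string, optionally followed by the closing double quote.
_SHORT_ZONE_RE = re.compile(r'([+-])(\d{3})(?="?$)')


def pad_zone(internaldate):
    """Pad a three-digit zone offset to the four digits IMAP expects.

    Assumption: the missing digit is a leading zero of the hours.
    Strings that already carry a four-digit offset are left alone.
    """
    return _SHORT_ZONE_RE.sub(
        lambda m: m.group(1) + m.group(2).zfill(4), internaldate)
```

For example, pad_zone('"12-Aug-2003 16:04:27 +000"') yields the same
string with "+0000" at the end.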

Next, I tried to classify my Inbox, but kept getting assertion errors
on "assert hamcount <= nham".  Added some print statements to see what
was going on:

> python2.2 imapfilter.py -c -p -v
.
. (verbose output deleted)
.
hamcount: 25
nham: 24.0
Traceback (most recent call last):
  File "imapfilter.py", line 786, in ?
    run()
  File "imapfilter.py", line 776, in run
    imap_filter.Filter()
  File "imapfilter.py", line 643, in Filter
    self.unsure_folder)
  File "imapfilter.py", line 565, in Filter
    evidence=True)
  File "/data/copland1/chance/python/site-packages/spambayes/classifier.py", line 223, in chi2_spamprob
    clues = self._getclues(wordstream)
  File "/data/copland1/chance/python/site-packages/spambayes/classifier.py", line 454, in _getclues
    prob = self.probability(record)
  File "/data/copland1/chance/python/site-packages/spambayes/classifier.py", line 310, in probability
    assert hamcount <= nham
AssertionError

I made the following modification to classifier.py to work around the problem:

307,313c307
<         if options["globals", "verbose"]:
<             print "hamcount:", hamcount
<             print "nham:", nham
<         try:
<             assert hamcount <= nham
<         except:
<             hamcount = nham
---
>         assert hamcount <= nham
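
One caveat with wrapping the assert in try/except: assert statements
are dropped when Python runs with -O, so the clamp would never fire
there.  A minimal sketch of an explicit check that behaves the same
either way; clamp_count is a hypothetical helper, not part of
classifier.py, and it only hides the inconsistency between the two
counts rather than explaining it:

```python
def clamp_count(count, total, label="ham"):
    """Return count capped at total, reporting when a cap was applied.

    Hypothetical helper: `count` plays the role of hamcount and
    `total` the role of nham from the traceback above.
    """
    if count > total:
        # The per-word count should never exceed the number of trained
        # messages; report it so the underlying problem stays visible.
        print("warning: %scount %r exceeds n%s %r; clamping"
              % (label, count, label, total))
        return total
    return count
```

With the values printed above, clamp_count(25, 24.0) returns 24.0.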


Tried to classify my Inbox again:

> python2.2 imapfilter.py -c -p -v
.
. (verbose output deleted)
.
Traceback (most recent call last):
  File "imapfilter.py", line 786, in ?
    run()
  File "imapfilter.py", line 776, in run
    imap_filter.Filter()
  File "imapfilter.py", line 643, in Filter
    self.unsure_folder)
  File "imapfilter.py", line 579, in Filter
    msg.Save()
  File "imapfilter.py", line 369, in Save
    data = _extract_fetch_data(response[1][0])
  File "imapfilter.py", line 157, in _extract_fetch_data
    mo = FETCH_RESPONSE_RE.match(response)
TypeError: expected string or buffer

Added code:
157,159d155
<     if options["globals", "verbose"]:
<         print "type(response):", type(response)
<         print "response:", response

and repeated the command, but the error did not recur.
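
The TypeError means the regular expression was handed something other
than a string.  A minimal sketch of a defensive wrapper, assuming the
caller can cope with a missing match; safe_match and EXAMPLE_RE are
hypothetical stand-ins (the real FETCH_RESPONSE_RE pattern is not
shown in this post), written for current Python:

```python
import re

# Hypothetical stand-in for FETCH_RESPONSE_RE: a sequence number
# followed by a parenthesised body.
EXAMPLE_RE = re.compile(r"(?P<seq>\d+) \((?P<body>.*)\)")


def safe_match(pattern, response):
    """Match response against pattern, or return None for non-strings."""
    if isinstance(response, bytes):
        # imaplib in current Python hands back bytes; decode first.
        response = response.decode("ascii", "replace")
    if not isinstance(response, str):
        # Record what actually arrived so an intermittent failure
        # leaves a trace, then let the caller decide what to do.
        print("unexpected response type:", type(response))
        return None
    return pattern.match(response)
```

A None result then has to be handled by the caller, for example by
skipping the message and retrying it on the next run.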


_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Don Chance
Computer Sciences Corp.
Space Telescope Science Institute
3700 San Martin Dr.
Baltimore, MD 21218
410-338-4941
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/


