[Spambayes] Beta status checklist

Tim Peters tim.one at comcast.net
Thu Mar 20 16:54:38 EST 2003


> ...
> It seems (via a grep for 'nham' or 'nspam') like the only things that
> use nham and nspam are:
>
> * testing code (the user wouldn't be using this)
> * experimental_ham_spam_imbalance (off by default)
>
> If this is correct,

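Nope, they enter into every probability calculation, via
Classifier.probability().  What's more, they have to:  a word's hamcount
and spamcount only mean something relative to how many ham and spam were
trained on.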
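
To make that concrete, here is a simplified sketch of how the two totals
enter a per-word probability.  It is not the Spambayes source:  the function
name, the parameter names, and the default values for s and x are
assumptions made for illustration only.

```python
def word_spamprob(spamcount, hamcount, nspam, nham, s=0.45, x=0.5):
    """Illustrative per-word spam probability (hypothetical helper).

    spamcount/hamcount: number of trained spam/ham containing the word.
    nspam/nham: total number of trained spam/ham messages.
    s, x: assumed strength and prior for rarely seen words.
    """
    n = spamcount + hamcount
    if n == 0:
        return x  # never-seen word: fall back on the prior

    # Guard against dividing by zero before any training has happened.
    nspam = float(nspam or 1)
    nham = float(nham or 1)

    # Raw counts are meaningless until scaled by the corpus sizes.
    spamratio = spamcount / nspam
    hamratio = hamcount / nham
    prob = spamratio / (hamratio + spamratio)

    # Pull the estimate toward the prior x when evidence (n) is thin.
    return (s * x + n * prob) / (s + n)
```

With identical word counts, swapping the totals flips the verdict:
word_spamprob(5, 5, nspam=10, nham=100) leans strongly spam, while
word_spamprob(5, 5, nspam=100, nham=10) leans strongly ham.  That's why
nham and nspam can't be dropped.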

I expect a real bug got hacked over instead of solved at the time these
int() calls got added to classifier.add_msg():

        if is_spam:
            self.nspam = int(self.nspam) + 1  # account for string nspam
        else:
            self.nham = int(self.nham) + 1   # account for string nham

That is, either the database was already hosed if these things were ever
strings, or someone hacked around a bad database integration in the wrong
place.

Note that it's easy to show that nham and nspam must be ints, provided that
only methods of Classifier muck with a Classifier's instance variables.
Under the same assumption, no word's hamcount can exceed nham, nor can its
spamcount exceed nspam.
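
A toy sketch of that argument follows.  ToyClassifier, learn() and
check_invariants() are hypothetical names, not Spambayes code, and the
sketch assumes a message bumps each distinct word's count at most once.

```python
class ToyClassifier:
    """Hypothetical stand-in used only to illustrate the invariants."""

    def __init__(self):
        self.nham = 0          # starts as an int ...
        self.nspam = 0
        self.wordinfo = {}     # word -> [spamcount, hamcount]

    def learn(self, words, is_spam):
        # ... and is only ever changed by adding 1, so it stays an int.
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1
        # Assumption: each distinct word counts once per message, so a
        # word's count grows no faster than the matching message total.
        for word in set(words):
            counts = self.wordinfo.setdefault(word, [0, 0])
            counts[0 if is_spam else 1] += 1

    def check_invariants(self):
        assert type(self.nham) is int and type(self.nspam) is int
        for spamcount, hamcount in self.wordinfo.values():
            assert spamcount <= self.nspam
            assert hamcount <= self.nham


c = ToyClassifier()
c.learn(["cheap", "pills", "cheap"], is_spam=True)
c.learn(["meeting", "cheap"], is_spam=False)
c.check_invariants()   # holds after any sequence of learn() calls
```

If something outside the class stores a string in nspam, check_invariants()
fails right away, which points at the real culprit instead of hiding it
behind an int() call.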



