[Spambayes] proposed changes to hammie & co.
Tim Stone - Four Stones Expressions
tim@fourstonesExpressions.com
Fri Nov 22 16:42:03 2002
Well, I've gone and done it... I've touched classifier code. Either my name
is now mud, or I really am a part of the community... lol
I added result caching to the _update_probability method in WordInfo (in the
hammie-playground branch). I suspect that this will save a lot of time, maybe
commensurate with what Adam Huff demonstrated. I don't have a large enough
corpus to really benchmark this, though, and you'll definitely want to take a
good look to make sure I haven't goofed anything up. I certainly didn't
change any calculations...
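
For anyone who wants the general idea without reading the diff, here is a
minimal sketch of that kind of result caching. It is not the code on the
branch: CachedWordInfo, probability(), _compute() and the recomputations
counter are all hypothetical names, and the formula in _compute() is only a
placeholder that assumes nonzero totals. The point is just that the result is
reused until one of its inputs changes.

```python
class CachedWordInfo:
    """Hypothetical stand-in for a word record; NOT the real WordInfo."""

    def __init__(self, spamcount=0, hamcount=0):
        self.spamcount = spamcount
        self.hamcount = hamcount
        self._cache_key = None     # inputs that produced the cached value
        self._cached_prob = None   # last computed probability
        self.recomputations = 0    # only here to make the caching visible

    def probability(self, nspam, nham):
        """Return the cached value unless any input has changed."""
        key = (self.spamcount, self.hamcount, nspam, nham)
        if key != self._cache_key:
            self._cached_prob = self._compute(nspam, nham)
            self._cache_key = key
            self.recomputations += 1
        return self._cached_prob

    def _compute(self, nspam, nham):
        # Placeholder calculation for illustration only (an assumption,
        # not the classifier's formula). Assumes nspam and nham are
        # nonzero and that the word has been seen at least once.
        spamratio = self.spamcount / nspam
        hamratio = self.hamcount / nham
        return spamratio / (spamratio + hamratio)


word = CachedWordInfo(spamcount=3, hamcount=1)
first = word.probability(10, 10)   # computed
second = word.probability(10, 10)  # served from the cache
word.spamcount += 1                # an input changed
third = word.probability(10, 10)   # recomputed
```

Keying the cache on the inputs means the calculation itself is untouched; a
stale value can only be returned if something outside the key affects the
result.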
On a related note... There ought to be some safeguard against division by zero
in the hamratio and spamratio calculations. The system shouldn't blow up with
a ZeroDivisionError, but just peacefully assume some default and go about its
business. This can happen if the classifier is run after training on spam
only, for example. A regular everyday user may very well make this mistake,
most likely immediately after installation. A blowup this early will probably
just result in them not using it, assuming that it doesn't work. I'd have
fixed it, but I have no idea what the peaceful default should be...
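
Something along these lines is what I have in mind, as a rough sketch only.
safe_ratio is a hypothetical helper, not anything in the classifier, and the
default is left as a required argument precisely because the right value is
still an open question; the 0.5 below is purely a placeholder.

```python
def safe_ratio(count, total, default):
    """Return count / total, or `default` when nothing has been trained.

    Hypothetical helper: `default` is supplied by the caller because the
    appropriate fallback value has not been decided.
    """
    if total == 0:
        return default
    return count / total


# Only spam has been trained on, so the ham total is still zero.
nspam, nham = 12, 0
spamratio = safe_ratio(3, nspam, default=0.5)  # normal division
hamratio = safe_ratio(0, nham, default=0.5)    # falls back, no exception
```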
- TimS
www.fourstonesExpressions.com