[Spambayes] FYI: Java implementation

Richard Jowsey richard at jowsey.com
Wed Jan 22 16:28:56 EST 2003

> > That chi2 test is definitely on
> > the drawing boards, even if only for comparison purposes...
> Anthony Baxter has some plots of score distributions for
> Graham-combining, Gary-combining and chi-combining here:
>   http://spambayes.sourceforge.net/background.html

Damn nice graphics! And a good explanation of the advantages of
the chi-squared "combining and scoring" treatment. OK, so I'm a
believer! :-)

> It's the sharpness and spread of the separation in chi- that's
> attractive.

Indeed! I've now mostly finished my core word-tokenization and
training logic, and am currently running sweeps across my
good/spam corpus to finish populating the database. Next I'll be
reworking the comparator classes to incorporate this chi-squared
math. Will keep everyone posted on progress...
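
For anyone following along, here is a rough Java sketch of what I
understand the chi-squared combining step to look like. It is only an
illustration, not the actual comparator code: the class name
ChiCombiner and the method names chi2Q and combine are made up for
this example, and it assumes every per-word spam probability has
already been clamped strictly between 0 and 1.

```java
// Hypothetical sketch of chi-squared combining; names are illustrative only.
final class ChiCombiner {

    // Returns P(chi-squared >= x2) for v degrees of freedom.
    // Uses the closed-form series that is valid only when v is even.
    static double chi2Q(double x2, int v) {
        if (v <= 0 || (v & 1) != 0) {
            throw new IllegalArgumentException("v must be a positive even number");
        }
        double m = x2 / 2.0;
        double term = Math.exp(-m);
        double sum = term;
        for (int i = 1; i < v / 2; i++) {
            term *= m / i;
            sum += term;
        }
        // Guard against rounding pushing the result just above 1.
        return Math.min(sum, 1.0);
    }

    // Combines per-word spam probabilities into one score in [0, 1].
    // Assumption: each element of probs lies strictly between 0 and 1.
    static double combine(double[] probs) {
        int n = probs.length;
        if (n == 0) {
            return 0.5; // no evidence either way
        }
        // Sum logarithms rather than multiplying, to avoid underflow.
        double lnS = 0.0;
        double lnH = 0.0;
        for (double p : probs) {
            lnS += Math.log(1.0 - p);
            lnH += Math.log(p);
        }
        double s = 1.0 - chi2Q(-2.0 * lnS, 2 * n); // spam indicator
        double h = 1.0 - chi2Q(-2.0 * lnH, 2 * n); // ham indicator
        return (s - h + 1.0) / 2.0;
    }
}
```

Scores near 1 suggest spam, scores near 0 suggest ham, and anything
near 0.5 is "unsure". With a very large number of strong clues the
Math.exp call can underflow to zero, so a real implementation would
probably want to cap the number of clues it feeds in.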


"Once the number three, being the third number, be reached, then 
lobbest thou thy Holy Hand Grenade of Antioch towards thy foe,
who being naughty in my sight, shall snuff it!"

