[spambayes-dev] NEWTRICKS

T. Alexander Popiel popiel at wolfskeep.com
Fri Dec 26 14:09:23 EST 2003


In message:  <uekur8mw8.fsf at boost-consulting.com>
             David Abrahams <dave at boost-consulting.com> writes:
>
>Since "this file is for ideas that have or have not yet been tried",
>I'd love to know what constitutes "trying".  Is there some official
>testing procedure or corpus we can test against?  I'd like to know
>whether any change I make is worth proposing.  Of course I can try it
>on my own databases of Ham and Spam first...

Heh.  We just went through this question with Seth Goodman.  The
short version of the last week or so of advice: grab the latest code
from CVS, then read README-DEVEL.txt and incremental.HOWTO.txt.  Lots
of good info in there.  Collect your own ham & spam corpora, put
them into the appropriate directory structure, then run the testing
tools over them with whatever options, classifiers, or tokenizers you
want to compare.  Post your results with enough explanation that
other people can try to replicate them on their own corpora.
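
For the "appropriate directory structure" step, here is a rough
sketch of how one might deal a folder of messages out into numbered
sets.  The Data/Ham/Set1 ... SetN naming in the usage note is an
assumption about the layout; README-DEVEL.txt is the authority, so
check it before relying on this.  The function name and the source
folder names are made up for illustration.

```python
import random
import shutil
from pathlib import Path


def split_into_sets(source_dir, dest_dir, n_sets=5, seed=0):
    """Copy each file in source_dir into dest_dir/Set1 .. dest_dir/SetN.

    Hypothetical helper, not part of SpamBayes.  Returns the number of
    files copied into each set, in order.
    """
    files = sorted(p for p in Path(source_dir).iterdir() if p.is_file())
    # Shuffle with a fixed seed so the same split can be reproduced later.
    random.Random(seed).shuffle(files)
    counts = []
    for i in range(n_sets):
        # Assumed naming: Set1, Set2, ... (verify against README-DEVEL.txt).
        set_dir = Path(dest_dir) / f"Set{i + 1}"
        set_dir.mkdir(parents=True, exist_ok=True)
        # Every n_sets-th file, starting at offset i, goes into this set.
        chunk = files[i::n_sets]
        for path in chunk:
            shutil.copy2(path, set_dir / path.name)
        counts.append(len(chunk))
    return counts
```

Something like split_into_sets("my_ham", "Data/Ham") followed by
split_into_sets("my_spam", "Data/Spam") would then give one set of
directories per corpus.  The fixed seed is there so that anyone
trying to replicate a result can rebuild the same split.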

- Alex


