[spambayes-dev] NEWTRICKS
T. Alexander Popiel
popiel at wolfskeep.com
Fri Dec 26 14:09:23 EST 2003
In message: <uekur8mw8.fsf at boost-consulting.com>
David Abrahams <dave at boost-consulting.com> writes:
>
>Since "this file is for ideas that have or have not yet been tried",
>I'd love to know what constitutes "trying". Is there some official
>testing procedure or corpus we can test against? I'd like to know
>whether any change I make is worth proposing. Of course I can try it
>on my own databases of Ham and Spam first...
Heh. We just went through this question with Seth Goodman. The short
version of the last week or so of advice: grab the latest code from
CVS, then read README-DEVEL.txt and incremental.HOWTO.txt. There is
lots of good info in there. Collect your own ham and spam corpora, put
them into the appropriate directory structure, then run the testing
tools over them with different options, classifiers, tokenizers, and
whatnot. Post your results with enough explanation that other people
can try to replicate them using their own corpora.
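
For the "put them into the appropriate directory structure" step, here is
a rough sketch of spreading one pile of messages across numbered sets.
The Data/Ham/SetN and Data/Spam/SetN layout, the function name, and the
default of five sets are all assumptions made for illustration;
README-DEVEL.txt is the authority on what the testing tools really expect.

```python
import random
import shutil
from pathlib import Path


def split_into_sets(source_dir, dest_root, kind, n_sets=5, seed=0):
    """Copy each file in source_dir into dest_root/<kind>/Set1..SetN.

    The <kind>/SetN layout is an assumption, not taken from the tools
    themselves; adjust it to match what README-DEVEL.txt describes.
    Returns the number of messages copied into each set.
    """
    files = sorted(p for p in Path(source_dir).iterdir() if p.is_file())
    # Shuffle with a fixed seed so the same split can be reproduced later.
    random.Random(seed).shuffle(files)

    counts = [0] * n_sets
    for i, src in enumerate(files):
        set_no = i % n_sets  # deal messages out round-robin
        target_dir = Path(dest_root) / kind / f"Set{set_no + 1}"
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target_dir / src.name)
        counts[set_no] += 1
    return counts
```

Something like split_into_sets("my_ham", "Data", "Ham") followed by
split_into_sets("my_spam", "Data", "Spam") would give roughly equal-sized
sets to run the testing tools over. Shuffling first keeps any one set from
holding only the oldest or newest mail, and the fixed seed means the split
itself can be described when posting results.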
- Alex