[Spambayes] How low can you go?
Tony Meyer
tameyer at ihug.co.nz
Thu Dec 18 03:08:43 EST 2003
[Tim]
> I see that it's a cruder approximation to the suggested
> scoring algorithm (which I implemented at one time).
[...]
> It's harder to code a tiling method;
Exactly <wink>.
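
For concreteness, here is a minimal sketch of what a tiling method might
look like. It is a greedy strongest-first version written for illustration
only, not the code either of us ran; the names tile and prob are
hypothetical, and unknown features are assumed to score a neutral 0.5.
Each token position is covered by exactly one chosen feature (a unigram or
a bigram), so the result never has more features than the message has
tokens.

```python
def tile(tokens, prob):
    """Greedily cover the token stream with non-overlapping unigrams and
    bigrams, picking the strongest features first.

    tokens: list of token strings, in message order.
    prob:   dict mapping a feature (a token, or two tokens joined by a
            space) to its spam probability; missing features count as 0.5.
    """
    n = len(tokens)

    # Every position contributes a unigram; every adjacent pair a bigram.
    candidates = []
    for i in range(n):
        candidates.append((i, 1, tokens[i]))
        if i + 1 < n:
            candidates.append((i, 2, tokens[i] + " " + tokens[i + 1]))

    # Strength is the distance of the spam probability from neutral.
    def strength(candidate):
        return abs(prob.get(candidate[2], 0.5) - 0.5)

    candidates.sort(key=strength, reverse=True)

    covered = [False] * n
    chosen = []
    for start, width, feature in candidates:
        span = range(start, start + width)
        if any(covered[j] for j in span):
            continue  # overlaps a stronger feature already chosen
        for j in span:
            covered[j] = True
        chosen.append((start, feature))

    chosen.sort()  # restore message order
    return [feature for _, feature in chosen]
```

With tokens "buy cheap pills now" and a strong score on the bigram
"cheap pills", the bigram wins both of its positions and the remaining
positions fall back to unigrams, giving three features for four tokens.
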
> BTW, it should *not* be necessary to increase
> max_discriminators, and doing so can create subtle numeric
> problems in the inverse chi-squared function.
> Without this option, in an N-token message, N tokens were
> candidates for scoring; with this option, there are still
> exactly N candidates for scoring; with a true tiling
> implementation, there are no more than N
> candidates for scoring (and usually less than N).
So the comment in here:
<http://mail.python.org/pipermail/spambayes-dev/2003-September/001005.html>
is only referring to cases where both unigrams *and* bigrams are used,
rather than cases where the tiling (or a crude approximation of it) is used?
I did get improvements with a higher max_discriminators:
<http://mail.python.org/pipermail/spambayes-dev/2003-September/001018.html>
Is that likely to be just a side-effect of the crudeness of my
approximation?
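
To make the numeric concern concrete, here is a sketch of the usual
series form of the inverse chi-squared function for an even number of
degrees of freedom. It is written from the standard formula and is only
an illustration; it may not match the project's code line for line, and
the failure shown is just one way trouble can arise, not necessarily the
one Tim has in mind. The series starts from exp(-x2/2), so once x2 grows
large enough (as it can when many more clues are combined) that first
term underflows to zero and every later term is zero too.

```python
import math

def chi2Q(x2, v):
    """Return P(chisq >= x2) for v degrees of freedom; v must be even."""
    assert v % 2 == 0
    m = x2 / 2.0
    # First term of the series; this underflows to 0.0 for large m.
    total = term = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    # Rounding can push the sum slightly above 1.0.
    return min(total, 1.0)

# Small case: with 2 degrees of freedom the result is just exp(-x2/2).
small = chi2Q(2.0, 2)

# Large case: exp(-1000) underflows, so the whole sum collapses to 0.0,
# although the true probability here is close to one half.
large = chi2Q(2000.0, 2000)
```
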
=Tony Meyer