[Spambayes] [spambayes-dev] Updated test results
Matthew Dixon Cowles
matt at mondoinfo.com
Tue Aug 8 21:37:02 CEST 2006
> baseline vs. x-lookup_ip:
[. . .]
> false negative percentages
> 2.228 1.671 won -25.00%
> 3.343 3.064 won -8.35%
> 5.292 4.735 won -10.53%
> 4.735 4.457 won -5.87%
> 2.786 2.507 won -10.01%
>
> won 5 times
> tied 0 times
> lost 0 times
I'm glad to see that. That's the sort of improvement that I see with
that code, but I think it's the first time that anyone else has
reproduced it.
Still, as people have pointed out before, there's at least one
potential problem in the code: data from DNS isn't necessarily
stable. If someone needed to un-train their database on a message a
day or two later, the tokens generated might easily differ from the
ones generated when the message was first trained on. That could send
a token's count below zero.
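
To make the failure concrete, here is a minimal sketch in Python. It
is not the SpamBayes code: the tokenizer, the dictionaries standing in
for DNS answers, and the host name and addresses are all hypothetical,
and the token store is just a collections.Counter.

```python
from collections import Counter

def tokens_for(host, dns):
    """Hypothetical tokenizer: emit one token from a DNS-style lookup."""
    return ["x-lookup_ip:" + dns.get(host, "unknown")]

def train(counts, tokens):
    counts.update(tokens)      # add one to each token's count

def untrain(counts, tokens):
    counts.subtract(tokens)    # Counter.subtract allows negative counts

counts = Counter()

# Assumed DNS answers on two different days for the same host.
dns_at_training = {"host.example": "192.0.2.10"}
dns_at_untraining = {"host.example": "192.0.2.99"}

# Train on the message, then un-train it after the record has changed.
train(counts, tokens_for("host.example", dns_at_training))
untrain(counts, tokens_for("host.example", dns_at_untraining))

stale = counts["x-lookup_ip:192.0.2.10"]     # never removed
negative = counts["x-lookup_ip:192.0.2.99"]  # removed but never added
```

The old token keeps a count it should no longer have, and the new
token ends up below zero even though it was never trained on.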
That doesn't affect me in practice, but it would surely affect
someone if the code were used widely. Fixing it in general would
require some rather elaborate persistence mechanism, I think.
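
One possible shape for such a mechanism, sketched in Python under
heavy assumptions: remember the exact tokens each message produced at
training time, keyed by a message id, and replay them on un-training
instead of re-tokenizing. The class and method names here are
hypothetical, and the in-memory dict stands in for whatever durable
storage a real version would need.

```python
from collections import Counter

class TokenLog:
    """Toy token store that records what each message was trained with."""

    def __init__(self):
        self.counts = Counter()
        self._trained = {}  # message id -> tokens seen at training time

    def train(self, msg_id, tokens):
        if msg_id in self._trained:
            raise ValueError("message already trained: %r" % (msg_id,))
        tokens = list(tokens)
        self.counts.update(tokens)
        self._trained[msg_id] = tokens

    def untrain(self, msg_id):
        # Replay the stored tokens rather than generating them again,
        # so a changed DNS answer cannot affect the result.
        tokens = self._trained.pop(msg_id)
        self.counts.subtract(tokens)

log = TokenLog()
log.train("msg-1", ["x-lookup_ip:192.0.2.10", "subject:hello"])
log.train("msg-2", ["x-lookup_ip:192.0.2.10"])
log.untrain("msg-1")

remaining = log.counts["x-lookup_ip:192.0.2.10"]
lowest = min(log.counts.values())
```

The cost is storing a token list per trained message, which is part
of why a general fix looks elaborate.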
Regards,
Matt