[Spambayes] RE: Watch out for digests...

Robert K. Coe bob at 1776.com
Fri Dec 12 13:20:46 EST 2003


What's a "hapax"?

> -----Original Message-----
> From: Skip Montanaro [mailto:skip at pobox.com]
> Sent: Wednesday, December 10, 2003 9:23 PM
> To: Tony Meyer
> Cc: spambayes at python.org; spambayes-dev at python.org
> Subject: RE: [Spambayes] Watch out for digests...
> 
> 
>     >> Big mistake. Stuff started getting wacky real fast.... Guess what?
>     >> One of the messages in the digest was an obvious spam.
> 
>     Tony> This is perhaps a drawback of the minimalist database size
>     Tony> training strategy.  I'm guessing that if you had a larger
>     Tony> database, the effect wouldn't have been as pronounced?  
> 
> Maybe.  At the moment, I have 9768 tokens in my database and 7731 of them
> are hapaxes.  As you suggest, it would appear mistakes can throw things off
> more dramatically, but it is also easier to detect.
> 
> I'd be interested to see what others' hapax fractions are:
> 
> ...
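
For reference on the figures quoted above: a hapax (short for "hapax legomenon") here means a token that has been seen in only one trained message, so 7731 of 9768 tokens works out to a hapax fraction of roughly 79%. Below is a minimal Python sketch of that calculation. It assumes the database can be viewed as a plain dict mapping each token to the number of trained messages it appeared in. The hapax_fraction helper and the toy data are hypothetical; this is not the SpamBayes API, and it is not the snippet elided above.

```python
def hapax_fraction(token_counts):
    """Return the share of tokens that were seen exactly once.

    token_counts: dict mapping token -> number of trained messages
    containing it (an assumed, simplified view of the database).
    """
    if not token_counts:
        return 0.0
    hapaxes = sum(1 for count in token_counts.values() if count == 1)
    return hapaxes / len(token_counts)


# Toy database (made-up tokens and counts, for illustration only).
toy_counts = {"viagra": 1, "python": 4, "digest": 1, "meeting": 2}
print(hapax_fraction(toy_counts))  # 2 of 4 tokens seen once -> 0.5

# The figures quoted in the message: 7731 hapaxes out of 9768 tokens.
print(round(7731 / 9768, 3))  # 0.791
```

With so large a share of the tokens resting on a single message each, one mistrained message (such as a digest containing a spam) can shift many token probabilities at once, which fits the effect described above.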



