[Spambayes] RE: Watch out for digests...

Bill Yerazunis wsy at merl.com
Fri Dec 12 16:11:26 EST 2003


   From: "Seth Goodman" <nobody at spamcop.net>

   [Robert Coe]
   > What's a "hapax"?

   A token that only appears once in one database (ham or spam) and not at all
   in the other.  At least that's my understanding of the definition.


Yep.

The full term is "hapax legomenon", from the Greek for "said once",
and it means a word seen only once in a corpus of text.
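
For the spam-filter sense Seth describes above, here is a rough sketch in
Python. It is only an illustration: the token lists and the hapaxes()
helper are made up for this example and are not Spambayes code or its
database format.

```python
from collections import Counter

def hapaxes(ham_tokens, spam_tokens):
    """Return tokens seen exactly once in one collection and never in the other.

    Hypothetical helper for illustration; plain token lists stand in for
    the ham and spam databases.
    """
    ham = Counter(ham_tokens)
    spam = Counter(spam_tokens)
    # A ham hapax occurs once in ham and is absent from spam, and vice versa.
    ham_only = {t for t, n in ham.items() if n == 1 and t not in spam}
    spam_only = {t for t, n in spam.items() if n == 1 and t not in ham}
    return ham_only, spam_only

# Made-up sample data.
ham_tokens = "meeting at noon meeting notes attached".split()
spam_tokens = "free pills free offer noon".split()
ham_hapaxes, spam_hapaxes = hapaxes(ham_tokens, spam_tokens)
```

Here "noon" is not a hapax because it shows up on both sides, and "meeting"
and "free" are not because each occurs twice.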

When you're trying to decode a "lost language", hapaxes are your
worst nightmare: if a word occurs in only one place, you really
can't cross-check whether your proposed translation of it is right.

       -Bill Yerazunis


