[Spambayes] RE: Watch out for digests...
Bill Yerazunis
wsy at merl.com
Fri Dec 12 16:11:26 EST 2003
From: "Seth Goodman" <nobody at spamcop.net>
[Robert Coe]
> What's a "hapax"?
A token that only appears once in one database (ham or spam) and not at all
in the other. At least that's my understanding of the definition.
Yep.
The full term is "hapax legomenon", from the Greek for "said once",
and it means a word seen only once in a corpus of text.
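
For the ham/spam case described above, here is a minimal sketch of how
one might pick out hapaxes from two token lists. The function name
find_hapaxes and the sample tokens are made up for illustration; this
is not Spambayes code, and it counts raw token occurrences, which may
differ from how a real classifier database keeps its counts.

```python
from collections import Counter


def find_hapaxes(ham_tokens, spam_tokens):
    """Return tokens seen exactly once in one corpus and never in the other.

    Hypothetical helper for illustration only; not part of Spambayes.
    """
    ham_counts = Counter(ham_tokens)
    spam_counts = Counter(spam_tokens)
    hapaxes = set()
    for token in ham_counts.keys() | spam_counts.keys():
        # A Counter returns 0 for a missing token, so a combined count
        # of 1 means "once in one corpus, not at all in the other".
        if ham_counts[token] + spam_counts[token] == 1:
            hapaxes.add(token)
    return hapaxes


# Made-up sample data.
ham = ["meeting", "lunch", "meeting", "budget"]
spam = ["viagra", "free", "free", "budget"]
print(sorted(find_hapaxes(ham, spam)))
```

Note that "budget" is not a hapax here: it appears once in each list,
so it fails the "not at all in the other" part of the definition.
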
When you're trying to decode a "lost language", hapaxes are your
worst nightmare, as you really can't cross-check whether your
proposed translation of a word is right if the word occurs in
only one place.
-Bill Yerazunis