[spambayes-dev] Is max token length really 12

Tim Peters tim.one at comcast.net
Mon Sep 22 15:52:48 EDT 2003


[Tolkin, Steve]
> But when I look at the "Show spam clues for the
> current message" I did not see a token of length 12 "VIRMMK100NTS."

Everything in the quotes is part of the token, and there are 13 characters
there.

> (The trailing period is there because this is at the end of a
> sentence.)  Is this token missed perhaps because it was all capital
> letters, or a mixture of letters and digits, or because it is
> immediately followed by a period?

The last.  spambayes doesn't distinguish between kinds of characters, except
to distinguish between whitespace (blank, tab, newline, carriage return) and
non-whitespace (everything else).  "VIRMMK100NTS." isn't ignored, but it
generates a synthesized

    skip:v 10

summary token instead.
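
To make that concrete, here is a rough sketch of the behavior described above. It is not the actual spambayes tokenizer code. The name sketch_tokenize and the constant MAX_TOKEN_LEN are made up for illustration; the limit of 12 comes from the subject of this thread; the lowercasing and the rounding of the length down to a multiple of 10 are assumptions inferred from the "skip:v 10" token shown above.

```python
MAX_TOKEN_LEN = 12  # assumed limit, taken from the subject of this thread


def sketch_tokenize(text):
    """Yield tokens from text, splitting on whitespace only.

    Hypothetical illustration, not the real spambayes implementation.
    """
    # Assumption: text is lowercased first (the example token has a
    # lowercase "v" although the original word was all capitals).
    for word in text.lower().split():
        n = len(word)
        if n <= MAX_TOKEN_LEN:
            # Punctuation is not stripped: a trailing period counts.
            yield word
        else:
            # Too long: emit a summary token made of the first character
            # and the length rounded down to a multiple of 10 (assumed).
            yield "skip:%c %d" % (word[0], n // 10 * 10)


print(list(sketch_tokenize("Buy VIRMMK100NTS. today")))
```

Under those assumptions the 13-character "VIRMMK100NTS." comes out as the summary token, while the same word without the trailing period is exactly 12 characters and would be kept as an ordinary token.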

> This token would be the strongest indication that email was spam.

So instead some other token was <wink>.



