[spambayes-dev] spammy subject lines

Tim Peters tim.one at comcast.net
Mon Oct 13 11:13:48 EDT 2003


[Paul Sorenson]
>>> and my "cleaned up" subject token doesn't appear to get through -
>>> Ie I don't think it is working as I expect.

>> Why do you think that?  We can't see what you did, and you didn't
>> spell out your evidence.

> Well I sent myself a couple of emails with subject lines differing
> only by punctuation and checked the clues in the web interface.
> Words stripped of punctuation from the subject line didn't seem to
> appear in the list.

Was that a list of *all* tokens, or just a list of "significant" tokens?  I
don't use the web interface, so I'm not familiar with it.  By default,
spambayes only pays attention to tokens with a spamprob less than 0.4 or
greater than 0.6, and most "display the tokens" gimmicks have defaulted to
showing only the tokens that went into the score computation (== only those
with "strong enough" spamprobs).  If you made up some obfuscated Subject
lines, it's quite possible that the unobfuscated words simply didn't show up
in your training data often enough to get a spamprob significant enough to
show.  I judge stuff like this by looking at the Outlook addin's spam-clue
report's "All Message Tokens" section.
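
To make the "significant tokens" point concrete, here is a minimal sketch of
that kind of filtering.  It is not the actual spambayes code: the function
name, the dict-of-spamprobs input, and the sample probabilities are all made
up for illustration, and only the 0.4/0.6 cutoffs come from the defaults
described above.

```python
def significant_tokens(spamprobs, low=0.4, high=0.6):
    """Return (token, spamprob) pairs that would count toward the score.

    spamprobs is a hypothetical {token: spamprob} mapping.  A token is
    kept only if its spamprob is below `low` or above `high`; the result
    is ordered from strongest clue to weakest.
    """
    kept = [(tok, p) for tok, p in spamprobs.items() if p < low or p > high]
    # Strongest clues are the ones farthest from the neutral 0.5.
    kept.sort(key=lambda pair: abs(pair[1] - 0.5), reverse=True)
    return kept


# Made-up spamprobs for the tokens of one message.
all_tokens = {
    "subject:viagra": 0.99,
    "subject:v1agra": 0.97,
    "meeting": 0.05,
    "subject:cheap": 0.55,   # rarely seen in training, so close to neutral
    "the": 0.5,
}

for tok, prob in significant_tokens(all_tokens):
    print(tok, prob)
```

In this made-up example "subject:cheap" is in the message, but a display
that shows only the tokens used in the score would never list it.  A
cleaned-up Subject word that hasn't been trained on much would drop out of
such a display the same way.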

