[spambayes-dev] spammy subject lines

Paul Sorenson sourceforge at metrak.com
Tue Oct 14 04:58:31 EDT 2003


----- Original Message -----
From: "Tim Peters" <tim.one at comcast.net>
To: "Paul Sorenson" <sourceforge at metrak.com>
Cc: "spambayes-dev" <spambayes-dev at python.org>
Sent: Tuesday, October 14, 2003 1:13 AM
Subject: RE: [spambayes-dev] spammy subject lines


> [Paul]
> > Well I sent myself a couple of emails with subject lines differing
> > only by punctuation and checked the clues in the web interface.
> > Words stripped of punctuation from the subject line didn't seem to
> > appear in the list.
>
> Was that a list of *all* tokens, or just a list of "significant" tokens?  I
> don't use the web interface, so I'm not familiar with it.  By default,
> spambayes only pays attention to tokens with a spamprob less than 0.4 or
> greater than 0.6, and most "display the tokens" gimmicks have defaulted to
> showing only the tokens that went into the score computation (== only those
> with "strong enough" spamprobs).  If you made up some obfuscated Subject
> lines, it's quite possible that the unobfuscated words simply didn't show up
> in your training data often enough to get a spamprob significant enough to
> show.  I judge stuff like this by looking at the Outlook addin's spam-clue
> report's "All Message Tokens" section.

Probably some subset.  However, I had a "control" email with the same
subject minus the punctuation, and its subject words did show up in the list.
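
To make the cutoff Tim describes concrete, here is a minimal Python sketch of
a "significant tokens only" filter.  It is not the spambayes code: the
function name, the token strings and the spamprob values are all made up for
illustration, and only the 0.4 / 0.6 bounds come from the message above.

```python
def significant_clues(spamprobs, low=0.4, high=0.6):
    """Return (token, spamprob) pairs outside the [low, high] band.

    Hypothetical helper, not part of spambayes.  Tokens whose spamprob
    lies between low and high are treated as too bland to display.
    """
    clues = [(tok, p) for tok, p in spamprobs.items() if p < low or p > high]
    # Strongest evidence first: sort by distance from the neutral 0.5.
    clues.sort(key=lambda pair: abs(pair[1] - 0.5), reverse=True)
    return clues


# Made-up spamprobs for the tokens of one message.
example = {
    "mortgage": 0.97,         # seen often in trained spam
    "m.o.r.t.g.a.g.e": 0.50,  # never seen in training, so neutral
    "meeting": 0.12,          # seen often in trained ham
    "the": 0.55,              # common everywhere, too bland
}

for token, prob in significant_clues(example):
    print(token, prob)
```

With these invented numbers only "mortgage" and "meeting" would be listed.
A token the classifier has barely seen sits near 0.5 and is hidden from a
display like this, even though the tokenizer did produce it.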

Someone earlier mentioned spam fads.  I would be interested to see how this
one (punctuated words) pans out: fad or trend.  One possibility is that
spammers are doing this to trick spam filters such as SpamAssassin, which
are being deployed with greater frequency at the ISP level.



