[Spambayes] Two amusing spam clues

Rob W.W. Hooft rob at hooft.net
Tue Jan 7 09:18:44 EST 2003

Neil Schemenauer wrote:
> For an email I just received:
>   'header:Reply-To:1' 0.815893146633
>   'message-id:@murphy.debian.org' 0.997094899935
> The first one surprised me.  It looks like most spam provides a reply-to
> header that is the same as the from header.  I have no idea why they do
> that.  The second one is spammy because I get a fair amount of spam
> through my debian.org address.  I guess a lot of spam doesn't have a
> message ID so the Debian mail server adds one.
> The moral of the story is that statistical filters are good at picking
> up on clues that humans might miss.  I don't know about other people on this
> list but my spambayes filter is kicking spammer ass.  I very rarely see
> a FN and even more rarely see a FP.  Props to everyone who helped out
> with development.

I just retrained on my latest batch: 448 messages were classified as 
ham, and 3 of these were FN. 122 messages were classified as spam, with 
no FP. 18 messages were classified as unsure, and all of these were spam. 
In retrospect, using the default spam cutoff of 0.90 would have reduced 
the number of unsures to 11.

Lowest scoring spam: 0.11, highest scoring ham: 0.01
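
For anyone who wants to turn counts like these into rates, here is a 
minimal Python sketch. The names (bucket, fn_rate, and so on) are made 
up for illustration and are not part of the spambayes code. The spam 
cutoff of 0.90 is the default mentioned above; the ham cutoff of 0.20 
is only an assumption for the example.

```python
# Illustrative sketch only; none of these names come from spambayes.

def bucket(score, ham_cutoff=0.20, spam_cutoff=0.90):
    """Map a score in [0, 1] to 'ham', 'unsure' or 'spam'.

    ham_cutoff=0.20 is an assumed value; spam_cutoff=0.90 is the
    default quoted in the message.
    """
    if score < ham_cutoff:
        return "ham"
    if score >= spam_cutoff:
        return "spam"
    return "unsure"

# Counts reported for the batch.
classified_ham = 448
false_negatives = 3      # spam that landed in the ham folder
classified_spam = 122
false_positives = 0      # ham that landed in the spam folder
unsure = 18              # all of these turned out to be spam

total = classified_ham + classified_spam + unsure
actual_spam = false_negatives + (classified_spam - false_positives) + unsure
actual_ham = (classified_ham - false_negatives) + false_positives

fn_rate = false_negatives / actual_spam   # share of real spam scored as ham
fp_rate = false_positives / actual_ham    # share of real ham scored as spam
unsure_rate = unsure / total              # share of all mail left undecided

print(f"total={total} spam={actual_spam} ham={actual_ham}")
print(f"fn={fn_rate:.2%} fp={fp_rate:.2%} unsure={unsure_rate:.2%}")
```

Under the assumed ham cutoff, the lowest scoring spam (0.11) falls in 
the ham bucket, which matches it being counted as an FN.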

spambayes really is very good now. If someone could find the time, we 
should make a release! It is awfully quiet on this list lately...


