[spambayes-dev] Who wants to pretend to be a spammer?

Tue Dec 16 15:46:08 EST 2003

In message:  <opr0aiplr5it6vze at mail.fourstonesExpressions.com>
             Tim Stone <tim at fourstonesExpressions.com> writes:
>On Tue, 16 Dec 2003 15:13:11 -0500, Kenny Pitt <kennypitt at hotmail.com>
>wrote:
>
>> Interesting idea, but wouldn't it be tricky to make your pseudo-spams
>> representative of real-world spam patterns?  For example, it seems like
>> whatever e-mail address and/or SMTP server you use to send the messages
>> would quickly become a significant spam clue.
>
>Yeah, those could be some challenges.  I'm not convinced of the
>usefulness of the idea, but it *could* give us a leg up on spam as it
>evolves.  I dunno, maybe it can't evolve fast enough to fool us for long,
>but...

Those would be the same challenges that the initial testing had
with the multi-source corpora (where significant spam all came
from one source and significant ham all came from a different place)...
which is why headers were almost completely ignored for the first
six months or so of development.

A good first approximation of returning to that would be to turn
off all the from/to/received/msgid header parsing.
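
As a rough sketch of that idea (not a SpamBayes option; it uses only
Python's standard email package, and the names SOURCE_HEADERS and
strip_source_headers are made up for illustration), one could strip
the source-identifying headers before a message ever reaches the
tokenizer:

```python
from email import message_from_string

# Headers that reveal where a message came from (illustrative list,
# matching the from/to/received/msgid headers mentioned above).
SOURCE_HEADERS = ("From", "To", "Received", "Message-ID")

def strip_source_headers(raw_message):
    """Return the message text with source-identifying headers removed."""
    msg = message_from_string(raw_message)
    for name in SOURCE_HEADERS:
        # del removes every occurrence of the header and does nothing
        # if the header is absent; names match case-insensitively.
        del msg[name]
    return msg.as_string()
```

Other headers such as Subject and the body are left alone, so the
classifier would still see the content clues.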

Responding to the idea (someone emulating a spammer): wouldn't it
be easier to just distribute a corpus of spam, and have people
grab it and test it against their databases?

- Alex