[spambayes-dev] Results for DNS lookup in tokenizer

Sun Apr 11 12:28:49 EDT 2004

Dear Skip,

> Unless those messages are extremely short, I doubt it would matter
> much. It's going to be just one clue among many.  I have no trouble
> getting the occasional good mail from the pychecker mailing list,
> which gets almost nothing but spam these days.

Thanks for the clue. I'll give it a try.

>> It seems that it's easier for a spammer to find a compromised PC
>> to relay through than it is for them to find someone willing to
>> host their site.

> In which case I doubt either of these network IP classification
> schemes will have much effect.

Sorry for not being clear. What I should have said earlier is that,
as far as I can tell, the networks that host spammers' websites don't
send an unusual amount of spam themselves. So I don't think that
mine_received_headers and the scheme I'm testing will generate much
of the same data.

In the last 24 hours, I've had 29 spams for which the SpamBayes
classifier used as evidence the IPs of URLs in 202/8, 218/8, 219/8,
and 221/8. On the ham side, the IP for mail.python.org has figured in
the evidence for 15 hams.
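
For anyone following along, here is a minimal sketch of how clues like
202/8 could be derived from a URL. It is an illustration only, not the
code under test: the function names, the "url-ip" label, and the
"lookup error" token are assumptions made for the example. Each
resolved address contributes one token per /8, /16, /24, and /32
prefix, so the classifier can score a whole network as well as a
single host.

```python
import socket
from urllib.parse import urlsplit


def prefix_tokens(ip, label="url-ip"):
    """Return one token per /8, /16, /24 and /32 prefix of an IPv4 address.

    The "url-ip" label is a hypothetical naming choice for this sketch.
    """
    octets = ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError("not a dotted-quad IPv4 address: %r" % ip)
    # Join the first n octets and append the matching prefix length.
    return ["%s:%s/%d" % (label, ".".join(octets[:n]), 8 * n) for n in range(1, 5)]


def url_tokens(url):
    """Resolve the host of a URL and return prefix tokens for every address."""
    host = urlsplit(url).hostname
    if not host:
        return []
    try:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist).
        _, _, addresses = socket.gethostbyname_ex(host)
    except OSError:
        # A failed lookup is recorded as its own (hypothetical) token.
        return ["url-ip:lookup error"]
    tokens = []
    for ip in addresses:
        tokens.extend(prefix_tokens(ip))
    return tokens
```

Calling prefix_tokens("202.96.1.7") gives "url-ip:202/8",
"url-ip:202.96/16", "url-ip:202.96.1/24", and "url-ip:202.96.1.7/32".
A real tokenizer would want to cache the lookups, since DNS queries
are slow compared with the rest of tokenizing.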

Spammers seem to be limited in their choice of networks for hosting,
but they can't know what networks the URLs that you or I get in ham
messages will resolve to. In that respect, those IPs fit well with
what SpamBayes does: spammers have a constrained spam "vocabulary"
and can't know a random individual's limited ham "vocabulary".

Regards,
Matt