[Spambayes] RE: Yahoo's "domain keys" and spam

Fri Dec 12 16:12:51 EST 2003

> From: Robert K. Coe

> Where is it written that SpamBayes would "add a token 
> verifying the presence of a domain key entry in the header"? 

It's not written anywhere. However, the current SpamBayes header
tokenizer would already pick up this header and generate the same token
whether or not the signature is valid. It would look something like
"X-Yahoo-Domain-Sig: skip:20".
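
As a rough illustration of why validity makes no difference today, here
is a simplified Python sketch. It is not the actual SpamBayes tokenizer;
the function name, the 12-character cutoff, and the exact token format
are my assumptions, modeled on the example token above.

```python
def tokenize_header(name, value, max_word_len=12):
    """Yield one token per whitespace-separated word in a header value.

    Hypothetical helper, not SpamBayes code. Words longer than
    max_word_len (an assumed cutoff) collapse to a length bucket,
    so their actual characters never reach the classifier.
    """
    for word in value.split():
        if len(word) > max_word_len:
            # Round the length down to the nearest multiple of 10.
            yield "%s:skip:%d" % (name, len(word) // 10 * 10)
        else:
            yield "%s:%s" % (name, word.lower())


# Two made-up 20-character "signatures": one plausible, one obviously junk.
plausible_sig = "ab" * 10
junk_sig = "0" * 20

same = (list(tokenize_header("X-Yahoo-Domain-Sig", plausible_sig))
        == list(tokenize_header("X-Yahoo-Domain-Sig", junk_sig)))
```

Both signatures land in the same length bucket, so `same` is true: the
classifier only learns that a signature-shaped string was present.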

> Doing so would seem to be self-defeating if credible 
> forgeries of domain keys become widespread 

*Credible* forgeries will not be widespread, since you need the
organization's private key to generate the domain signature. However,
unless the logic to actually perform the crypto and verify the signature
is added to SpamBayes, all SB will see is a header token with a bunch of
random hex characters. That is, only the *presence* of a domain
signature would be detected, not whether or not it is actually valid.

> marginally helpful otherwise (since the "I know it when I see 
> it" model of spam detection works very well for humans and 
> fairly well for Bayesian filters without this additional 
> complication).

A validated domain key signature would tell you with certainty whether
or not a mail message that claims to be from a domain is actually from
an SMTP server controlled by the same people who control the domain's
DNS. My intuition tells me this is a very strong bit of evidence that
would be very useful to the classifier. Of course, that would have to be
tested before the feature could be added to the production code.
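
To make the presence-versus-validity distinction concrete, here is a
rough Python sketch of the kind of tokens I have in mind. Everything in
it is an assumption on my part: the header name, the token strings, and
especially `verify`, which is a placeholder for crypto code that does
not exist yet.

```python
from email import message_from_string
from email.utils import parseaddr

SIG_HEADER = "X-Yahoo-Domain-Sig"  # hypothetical header name


def domain_sig_tokens(raw_message, verify):
    """Return a single token describing the domain signature's status.

    `verify(domain, signature, raw_message)` is a caller-supplied
    placeholder for the real check (DNS key lookup plus signature
    verification); it must return True or False.
    """
    msg = message_from_string(raw_message)
    sig = msg.get(SIG_HEADER)
    if sig is None:
        return ["domain-sig:absent"]
    # Pull the domain out of the From: address the message claims.
    _, addr = parseaddr(msg.get("From", ""))
    domain = addr.rpartition("@")[2].lower()
    if verify(domain, sig.strip(), raw_message):
        return ["domain-sig:valid"]
    return ["domain-sig:invalid"]


raw = ("From: Alice <alice@example.com>\n"
       f"{SIG_HEADER}: abc123\n"
       "Subject: hi\n"
       "\n"
       "body\n")

# Stub verifiers, only to exercise the three outcomes.
accepted = domain_sig_tokens(raw, lambda d, s, m: s == "abc123")
rejected = domain_sig_tokens(raw, lambda d, s, m: False)
```

With three distinct tokens, the classifier could learn on its own how
much weight a valid, invalid, or missing signature deserves, instead of
having that decision hard-coded.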

> For that matter, where is it written that the use of domain 
> keys will become widespread? (But I guess that's a topic for 
> another discussion.)

Yahoo is one of the biggest mail hosts on the Internet, and they're
donating the code to the most popular open-source SMTP servers in use on
the Internet. It's supposed to be a cheap and simple addition to the DNS
and e-mail infrastructure, and Yahoo is evangelizing it to other major
ISPs as a way to cut down on forged spam message headers. I assume a lot
of smaller ISPs and corporations will get on board as a result, but of
course it may just be ignored. There will also be a significant ramp-up
time.

I'm merely suggesting that the SB project get out in front of the issue.
I will try to adapt the necessary crypto code into Python myself as soon
as it is available in other open-source projects. (I'm guessing the code
will be in C and will go to the qmail, sendmail, etc. projects.)

Regards,
	-Ryan-