[Spambayes] Beyond Spambayes
wsy at merl.com
Wed Feb 22 19:20:48 CET 2006
From: "Seth Goodman" <sethg at GoodmanAssociates.com>
By employing a variety of rejection tools (i.e. DNSBLs for the
connecting IP plus HELO name and rDNS heuristics), most of the load can
be rejected during the envelope phase of SMTP. For the ones that make
it past the envelope, it is still possible to do the remaining content
checks during the DATA phase and make the sender wait before confirming
acceptance with a 250 code. Many people argue that spammers often abuse
pipelining and dump the whole message after the DATA command then
disconnect, not waiting around for the acceptance. Any MTA behaving
that way can be added to a local DNSBL so you don't talk to them next time.
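For anyone unfamiliar with how the DNSBL check on the connecting IP works, here is a rough sketch in Python. The IPv4 octets are reversed and prepended to the list's zone, and the resulting name is looked up; a successful answer means "listed". The zone name dnsbl.example.org and the helper names are made up for illustration, and the resolver is passed in so the logic can be tried without touching the network.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolver=socket.gethostbyname):
    """Return True if the lookup succeeds, i.e. the IP is on the list.

    Any resolution failure (including a timeout) is treated as
    "not listed" here, which is a simplification.
    """
    try:
        resolver(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


if __name__ == "__main__":
    # Hypothetical zone; substitute the list you actually use.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

An MTA would run something like is_listed() right after the connection is accepted and refuse the session before MAIL FROM if it returns True.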
A problem is that with the rise of botnet armies, we're seeing the majority
of spam actually coming from bots, not "bulletproof" servers or open
relays. That is, a majority of spam is identical spam (indicating it
was sent at the behest of one individual), but was sent from a large
number of different sources via different paths. In short, a
"perfect" RBL (one that had 100% perfect input and propagated it at
superluminal velocity) would still only get about 40% of the spammers.
Similarly, there are a number of heuristics that can catch this
type of spammer early: put in a delay after the connection request
before you send the banner. Anyone who doesn't wait for the end of the
banner can be safely disconnected and blacklisted for the future. If
you want to perform a public service, tarpit them instead of merely
rejecting and blacklisting.
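The banner-delay heuristic described above can be sketched roughly as follows. This is only an illustration, not how any particular MTA does it: the function names, the 5 second default, and the hostname mx.example.org are all assumptions. The idea is to hold back the 220 greeting for a short time and see whether the client has already started talking.

```python
import select
import socket


def is_early_talker(conn, delay_seconds):
    """Return True if the client sent data (or hung up) before the banner.

    select() reports the socket as readable both when bytes have arrived
    and when the peer has closed the connection; either way the client
    did not wait for the greeting.
    """
    readable, _, _ = select.select([conn], [], [], delay_seconds)
    return bool(readable)


def greet(conn, delay_seconds=5.0, hostname="mx.example.org"):
    """Delay the banner; reject clients that speak first.

    Returns True if the normal 220 greeting was sent.
    """
    if is_early_talker(conn, delay_seconds):
        conn.sendall(b"554 protocol violation: spoke before banner\r\n")
        return False
    conn.sendall(f"220 {hostname} ESMTP\r\n".encode("ascii"))
    return True
```

A caller that gets False back could then add the peer address to a local blacklist, or keep the connection open and stall instead of closing it if tarpitting is the goal.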
I was under the impression that a pipelining MTA doesn't care what happens
after the port opens successfully. In that case, tarpitting won't
matter; they're not waiting for the ACK packets.
It's all one big mess, if you ask me. :(
Adding an answerback at the end of DATA (like three-phase commit) would
have been a nice thing, but it's a little late for that.