[Spambayes] Latest spammer trick stymied - QUESTION

Richard Jowsey richard at jowsey.com
Fri Apr 4 12:48:26 EST 2003

> I'm sure that any ideas can be posted.  And if someone isn't able to
> code something to do some testing, then they really should post a
> feature request.  

The url slurper *has* been coded, but in Java, or I'd have definitely 
donated source to the Pythonic SpamBayes project for further 
evaluation. I'd emphasize that this concept has been "real-world 
tested" by a number of my beta users, who're apparently quite 
delighted that these annoying micro-spams are now being accurately 
classified. As am I. "Unsures" have completely disappeared in cases 
where the message sender is unknown, the email has very few useful 
clues, and it contains only a single URL. Those are the only test 
results that matter to me... <wink>
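
For anyone wanting to try this on the Python side, the trigger 
described above might be sketched roughly as follows. This is not the 
Java implementation, just an illustration under stated assumptions: 
the function names, the "slurp:" token prefix, the min_clues threshold 
and the cutoff values are all hypothetical, and the fetch step is 
passed in as a parameter so it can be sandboxed or stubbed out.

```python
import re
from urllib.request import urlopen

URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)
WORD_RE = re.compile(r"[a-z0-9$'-]{3,}")


def extract_urls(body):
    """Return the distinct URLs in a message body, in order of appearance."""
    seen = []
    for url in URL_RE.findall(body):
        url = url.rstrip(".,;:!?)")  # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen


def should_slurp(score, clue_count, urls, sender_known,
                 ham_cutoff=0.20, spam_cutoff=0.90, min_clues=5):
    """True only for the narrow case: an unsure score, an unknown sender,
    very few clues, and exactly one URL. All thresholds are assumptions."""
    unsure = ham_cutoff < score < spam_cutoff
    return (unsure and not sender_known
            and clue_count < min_clues and len(urls) == 1)


def fetch_text(url, timeout=5, max_bytes=20000):
    """Fetch at most max_bytes of the page; never render or execute it."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read(max_bytes).decode("latin-1")


def slurp_tokens(url, fetch=fetch_text):
    """Turn the fetched page into extra tokens for the classifier.
    Returns an empty list if the fetch fails for any reason."""
    try:
        page = fetch(url)
    except (OSError, ValueError):
        return []
    text = re.sub(r"<[^>]*>", " ", page).lower()  # crude tag stripping
    return ["slurp:" + word for word in WORD_RE.findall(text)]
```

The extra tokens would be appended to the message's own tokens and the 
message rescored; anything that doesn't meet the should_slurp() test 
is classified exactly as before, with no network access at all.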

> Although the consensus has weighed in against the URL-following
> technique (in general, at least), I'm sure it hasn't weighed in
> against the discussion itself.

The discussion has been quite healthy, IMO, but I can't quite agree 
with your "consensus" conclusion. Several people did express 
(constructive) concerns about the security/safety of such a 
capability, and other contributors have, I think, allayed or 
circumscribed those fears. A couple of people also questioned the 
usefulness of URL-retrieval, given there aren't too many of these 
spams, and they'll probably disappear soon enough. A fair comment, I 
thought. As for consensus, I'm biased, but I don't really think there 
is one.

Bear in mind, we're talking about a real "outer limits" tweak here, 
specifically for those quite rare cases where the classifier just 
doesn't have enough data to make up its mind. Understandably, many 
people aren't at all concerned about squeezing yet another 0.01% of 
accuracy out of the beastie; it's already good enough. In fact, most 
people probably couldn't care less! I only mentioned the idea for those 
of us who demand absolute 100% perfection... <grin>

Apologies if I've inadvertently breached etiquette by not posting code 
and/or a Feature Request. I'm a bad dog, and I promise to behave 
better in future!  ;-)

