[Spambayes] full o' spaces

Sat Mar 8 19:45:00 EST 2003

>>> Tim Stone replying to Neil Schemenauer
> >  Adding more code every time a spammer
> >comes up with a new trick is completely reactionary and will eventually
> >destroy the code base.  I'm mystified as to how you can call such an
> >approach proactive.
> 
> Again, I was suggesting that we find the holes before they do. I think
> that we should begin to think like spammers, not like people trying to
> defeat spammers. If we were on the other side, what would we do? Gosh,
> I can think of things, simple things. And if I can find something
> that actually crashes the tokenizer, all the better. I'll look at the
> code, more closely than most on this team ever will. I'll find the
> holes, and blast away. My goal? Not to get spam into mailboxes, but to
> destroy the anti-spam community. Make people give up hope that this
> problem really is/can be solved. That's the way to make you and me go
> away. Simply make it so people don't believe in us.

We're not talking about something that crashes the tokenizer. We're 
talking about a new spam technique that's been seen in a very small 
number of live spams. I've not yet seen one of these, and I get an
absolute shiteload of spam every day. Note also that a lot of people
run spamassassin, and it's absolute death on this technique (called
"gappy text", from memory). The chances of this technique surviving
very long are very small.
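
For readers who have not seen the trick: assuming "gappy text" means words
spelled out with a space between each letter (as the subject line suggests),
a rough Python sketch of collapsing such runs might look like the following.
The names GAPPY_RUN and collapse_gappy are hypothetical; this is not
SpamAssassin's rule and not anything in the spambayes tokenizer.

```python
import re

# Three or more single word-characters separated by single spaces,
# e.g. "V I A G R A". Illustrative pattern only; real spam varies.
GAPPY_RUN = re.compile(r"\b(?:\w ){2,}\w\b")

def collapse_gappy(text):
    """Join the letters of each gappy run, leaving other text untouched."""
    return GAPPY_RUN.sub(lambda m: m.group(0).replace(" ", ""), text)
```

Whether a step like this helps or hurts a classifier can only be settled by
measuring it against real mail.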

We can sit here for days, weeks and months and think of ways to defeat
the existing classifier. We have done that, in the past. But a change that
has not been tested and shown to improve existing results does _not_ belong
in the code base. It goes against _everything_ that has made this project 
successful. 
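
As a minimal sketch of what "tested and shown to improve existing results"
could mean in practice, assuming a labelled corpus of (text, is_spam) pairs
and two candidate classifiers passed in as plain functions: error_rates and
is_improvement are hypothetical names for illustration, not the project's
actual test harness.

```python
def error_rates(classify, messages):
    """Return (false_positive_rate, false_negative_rate).

    messages: iterable of (text, is_spam) pairs.
    classify: function mapping text -> True if judged spam.
    """
    fp = fn = ham = spam = 0
    for text, is_spam in messages:
        predicted = bool(classify(text))
        if is_spam:
            spam += 1
            fn += not predicted   # spam that slipped through
        else:
            ham += 1
            fp += predicted       # ham wrongly flagged
    return (fp / ham if ham else 0.0, fn / spam if spam else 0.0)

def is_improvement(old, new, messages):
    """True only if neither rate gets worse and at least one gets better."""
    old_fp, old_fn = error_rates(old, messages)
    new_fp, new_fn = error_rates(new, messages)
    return (new_fp <= old_fp and new_fn <= old_fn
            and (new_fp, new_fn) != (old_fp, old_fn))
```

A change that fails a check of this kind on real data is exactly the sort
that should stay out.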

Sure - if you find a way to actually crash the tokeniser, then the fix
should go in. But "what if"ing serves no purpose, and may make things worse.

Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>   
It's never too late to have a happy childhood.