[Spambayes] Just for fun

Mon Nov 18 10:40:25 2002

Derek Simkowiak said the following on 15/11/02 17:35:
>>Remember that this project /is/ the first instance of a decent spam filter :),
>>so we can hardly blame the spammers for being a little behind.
> 
> 	Let's not forget that SpamBayes only works for individuals or
> workgroups who have the same definition of "ham".  It doesn't help much
> in enterprise-level settings with tens of thousands of users, since the
> ham of such a large and varied group of people would dilute the definition
> of spam too much to be useful.

I think you're exaggerating. It most certainly *does* help, and it
helps a lot. For a large, diverse group of users, statistical analysis is
still about 90% correct. It's not the 99.9% correct you get for an
individual's mailbox, but as part of the bigger picture (involving
statistics, rules, DNSBLs, etc.) it's a huge help.

> 	I bet that playing the numbers game one could "show" that the
> helpdesk and maintenance costs of supporting a Python installation plus a
> per-person ham training procedure would be more expensive (for a Uni or
> Mega-Corp.) than just living with spam.  (Pure conjecture on my part, but
> it is easily imagined.)

Depends how you calculate the cost of spam. For me it's an interruption,
and for my work (which involves intense periods of coding, maths, and
reading) an interruption means I have to start again a lot of the time.
For a highly paid programmer that cost could be about £20. Per spam.

And ignoring my email is often not an option: I have to support a spam 
solution for over a million users.

> 	There's another Python-based spam filter that might work better
> for SMTP server-wide deployment, called "Active Spam Killer", or ASK.
> 
> http://www.paganini.net/ask/
> 
> 	Its shtick is that it maintains a whitelist of people who may
> email you.  When an email from a new sender comes in, it holds the email
> for you, sends the person a simple confirmation message (to which they
> simply hit Reply;Send), and then that person is added to your whitelist
> and their original message is sent to you (and they are never ASKed
> again).  There's also some very practical regex stuff, some migration
> tools, and an ignorelist and blacklist (for situations like
> http://www.psychoexgirlfriend.com/).

This is the same as TMDA. I have evidence for you that it doesn't work.
Case in point, from my own experience: someone mailed me asking a technical
question about one of my Perl modules. I mailed him back a response, in
my own free time. I got a TMDA bounce saying I had to confirm that I was
a real person. Well frankly, sod that. I never replied. I never used his 
web page to confirm. I just ignored it and I'm sure he never got the 
reply to his question.

Now imagine extending that to corporations, where people would be even 
less inclined to add themselves to somebody's whitelist. TMDA doesn't scale.
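
To make the failure mode concrete, here is a rough sketch of the
hold-and-confirm flow described above. The class and method names are
made up for illustration and are not taken from ASK or TMDA; real
implementations also handle message delivery, expiry, and so on.

```python
# Hypothetical sketch of a challenge-response whitelist.
# Not ASK's or TMDA's actual code; all names are illustrative.
import secrets


class ChallengeResponseFilter:
    def __init__(self):
        self.whitelist = set()  # senders who have confirmed
        self.pending = {}       # token -> (sender, list of held messages)
        self.token_for = {}     # sender -> outstanding token

    def receive(self, sender, message):
        """Return ("deliver", message), or ("challenge", token) if held."""
        if sender in self.whitelist:
            return ("deliver", message)
        token = self.token_for.get(sender)
        if token is None:
            # First mail from this sender: issue a confirmation token.
            token = secrets.token_hex(8)
            self.token_for[sender] = token
            self.pending[token] = (sender, [])
        self.pending[token][1].append(message)
        return ("challenge", token)

    def confirm(self, token):
        """Sender answered the challenge: whitelist them, release held mail."""
        if token not in self.pending:
            return []
        sender, held = self.pending.pop(token)
        del self.token_for[sender]
        self.whitelist.add(sender)
        return held
```

Note that nothing is ever released unless confirm() is called. If the
sender ignores the challenge, as I did, the message just sits in
pending and the recipient never sees it.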

Matt.