[Spambayes] SpamBayes for Olde Worlde environments

Stuart Moors moors at helanta.sh
Thu Dec 14 19:01:10 CET 2006


Much as I hate to admit it, where I live at the moment is,
technologically speaking, several tens of eons behind the accepted
world-wide median.
There are good reasons for this, but they are irrelevant.

Suffice it to say that I still depend on dial-up to get my email.

So, while Bayesian spam techniques are OK/Good/Fine/Brilliant (or, at
least, impressive), they are of little use if I still have to pay for
the download of spam before I decide its classification.

That is:
SPAM can be classified as such and discarded for two reasons:
A. to avoid having your Inbox cluttered
B. to avoid having to pay for the download of the crap

It seems to me that the major thrust of the SpamBayes (and similar)
initiatives is Reason A. Which is a pity, because I am at least as
interested in Reason B.

If spam can be so classified by my email service provider, BEFORE I ever
get to see it, it would be far more useful...

...unless, of course, the algorithm has been too aggressive and discarded
an email that was bona fide - a heinous crime in my book.

So, how about a facility, for the email service providers, that:

A. filters according to Bayesian principles,
B. retains spam at the server, either indefinitely until the space runs
out or until a per-message expiry date occurs,
C. alerts the user with a SINGLE message (say, once per week) giving
stats and summaries of what was filtered out
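
As a rough illustration of points A to C, here is a minimal Python
sketch. It assumes the spam probability has already been computed by
some Bayesian classifier and is simply passed in as `score`; the names
`HeldMessage`, `Quarantine`, `spam_cutoff` and `retention` are
hypothetical and are not part of SpamBayes. Only the expiry-date variant
of point B is shown, not the "until the space runs out" one.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class HeldMessage:
    sender: str
    subject: str
    score: float        # spam probability in [0, 1], from any Bayesian classifier
    received: datetime


class Quarantine:
    """Holds suspected spam on the server instead of delivering it."""

    def __init__(self, spam_cutoff=0.9, retention=timedelta(days=30)):
        self.spam_cutoff = spam_cutoff   # hypothetical threshold (assumption)
        self.retention = retention       # hypothetical per-message lifetime
        self.held = []

    def file(self, msg):
        """Return True if msg should be delivered, False if it was held."""
        if msg.score >= self.spam_cutoff:
            self.held.append(msg)
            return False
        return True

    def purge(self, now):
        """Drop held messages whose expiry date has passed."""
        self.held = [m for m in self.held if now - m.received < self.retention]

    def digest(self, now, period=timedelta(days=7)):
        """Build the text of ONE summary message covering the last period."""
        recent = [m for m in self.held if now - m.received <= period]
        by_sender = Counter(m.sender for m in recent)
        lines = [f"{len(recent)} message(s) held since {(now - period).date()}"]
        for sender, count in by_sender.most_common():
            lines.append(f"  {count:3d}  {sender}")
        return "\n".join(lines)
```

The dial-up user would then download only the text returned by
`digest()` once a week, rather than every held message.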

The stats could also be analysed. A series of messages from the same
source may very well be ham, trying desperately to be accepted (despite
the fact that the sender's first name is Nigeria!). This way, I can
tweak a white-list to allow certain sources immunity from filtering,
even if standard analysis would be definite about spamminess.
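
The white-list idea above might look something like the following
sketch. Both function names and the `min_count` and `spam_cutoff`
defaults are hypothetical, chosen only for illustration; the spam score
is again assumed to come from whatever classifier the provider runs.

```python
from collections import Counter


def repeat_senders(held_senders, min_count=3):
    """List senders with at least min_count held messages, most frequent first.

    These are only candidates for a human to review, not proven ham.
    """
    counts = Counter(s.lower() for s in held_senders)
    return [s for s, n in counts.most_common() if n >= min_count]


def should_filter(sender, score, whitelist, spam_cutoff=0.9):
    """Decide whether to hold a message; white-listed senders are immune.

    whitelist is assumed to be a set of lower-cased addresses.
    """
    if sender.lower() in whitelist:
        return False          # immunity, however spammy the score looks
    return score >= spam_cutoff
```

A sender reported by `repeat_senders()` would be added to the white-list
by hand once the user has decided the mail really is ham.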

The retention at the server would mean that I could log on to web-mail
access to retrieve messages that I discovered to be ham.



On signing off, I'd like to say that I am very appreciative of the
efforts being brought to bear on this scourge of the digital age. Yet, I
sometimes wonder if there's not some better way (some authentication
protocol mechanism, perhaps) to identify spam, other than analysing even
personally-trained word frequencies in arriving mail. In 99% of the
cases, mail arriving at my inbox is either from a sender I have in my
address book, or is in response to an email I have sent myself. This
must cover the majority of cases for others too. The remaining 1% must
be unwanted or unsolicited and could be handled in more human-oriented
ways (ones that cannot easily be automated) so that I can decide whether
they are bona fide or not: e.g. a message automatically returned to the
sender saying that unsolicited mail has been detected and will only be
delivered if a question is answered correctly (the answer being the
character string shown in an obfuscated image). This method requires no
knowledge of my interests (which may include stock investments, sexing
of chickens, penis dimensions, drug therapies or whatever) or of the
style or language of my bona fide correspondents, solicited or
otherwise.
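
The triage described above can be sketched as follows, using only the
Python standard library. The function name `triage` and its arguments
are hypothetical: `address_book` is assumed to be a set of lower-cased
addresses and `sent_message_ids` a set of Message-ID values from mail
the user has sent. The challenge itself (the obfuscated image) is not
shown, and note that the From header can be forged, which is where some
authentication mechanism would still be needed.

```python
from email import message_from_string
from email.utils import parseaddr


def triage(raw_message, address_book, sent_message_ids):
    """Return "deliver" for known correspondents, otherwise "challenge"."""
    msg = message_from_string(raw_message)

    # Case 1: the sender is already in the address book.
    _, sender = parseaddr(msg.get("From", ""))
    if sender.lower() in address_book:
        return "deliver"

    # Case 2: the message is a reply to something the user sent.
    if msg.get("In-Reply-To", "").strip() in sent_message_ids:
        return "deliver"

    # Everything else gets the human-oriented challenge.
    return "challenge"
```

Nothing here looks at the message body, so no knowledge of the
recipient's interests or correspondents' language is required.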

I understand that the foregoing is a very "personal email" view of the
world. If I operated an e-commerce site, I might have a different view,
but even then, email could be discarded unless human input had been seen
to have been made (a product number, a phrase indicating a query or even
the same technique noted above), so automated spam can be discarded.

Hoping that the word frequency in this email is such that you will read
it…

-- 
-------------------------------------
Stuart Moors
Alarm Forest, St.Helena
Tel:  (00290) 3255  Email: moors at helanta.sh
-------------------------------------


-- 
No virus found in this outgoing message.
Checked by AVG Free Edition.

 


This e-mail has been scanned for viruses by the Cable & Wireless St. Helena e-mail security system - powered by McAfee.

