[Spambayes] is the database empty

Amedee Van Gasse amedee at amedee.be
Sat Jun 10 15:16:55 CEST 2006


On Sat, June 10, 2006 14:18, yahoo.de said:
>
>>The most efficient way to fill your database is to let spambayes train
>>on unsures and on mistakes.
>
> What do you mean by "train on mistakes"?
> How could I train SB to recognize emails with advertisement images for
> some product and so on?
> Let's say the email has no text, but only an image in the body!
> (I know there is image scanner software for this purpose, but what could
> be done in such cases?)

Image spam is indeed a problem.
OTOH, in my personal experience it's only a problem in theory.
In practice there are enough other spammy characteristics in such emails.

I don't know about image scanners specifically for spam detection, but I
think it's possible to feed emails through such image scanners before they
are fed to spambayes.

I already send my emails through some conversion filters before they are
spambayesed. For example, I use mimencode to preconvert some special MIME
encodings (such as those used for Asian languages) into 8-bit format.
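To give an idea of what such a conversion filter does, here is a minimal
sketch in Python. It is not my actual mimencode setup, only an illustration
using the standard email package; the function name to_8bit is made up for
this example. It rewrites base64 and quoted-printable text parts as 8bit and
leaves everything else alone.

```python
import email


def to_8bit(raw_bytes):
    """Return the message with base64/quoted-printable text parts as 8bit.

    Hypothetical helper for illustration; non-text parts are left untouched.
    """
    msg = email.message_from_bytes(raw_bytes)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        cte = str(part.get("Content-Transfer-Encoding", "")).lower()
        if cte not in ("base64", "quoted-printable"):
            continue
        # Decode the transfer encoding; the result is the raw 8-bit body.
        body = part.get_payload(decode=True)
        del part["Content-Transfer-Encoding"]
        part["Content-Transfer-Encoding"] = "8bit"
        part.set_payload(body)
    return msg.as_bytes()
```

The output could then be piped to the classifier in place of the original
message.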
I can imagine one could make an OCR program that converts images to text
(if possible) and attaches the text to the email, which is subsequently
fed to spambayes. That way, spambayes virtually "reads" the image just
like a human does.
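A rough sketch of that idea, untested against real spam. The names
attach_ocr_text and ocr are hypothetical: ocr stands for any callable that
takes image bytes and returns the recognized text, whichever OCR engine is
behind it. The rest uses only the standard email package.

```python
import email
from email.message import Message
from email.mime.text import MIMEText


def attach_ocr_text(raw_bytes, ocr):
    """Attach OCR output for every image part as an extra text/plain part.

    `ocr` is a hypothetical callable: image bytes in, recognized text out.
    """
    msg = email.message_from_bytes(raw_bytes)
    texts = []
    for part in msg.walk():
        if part.get_content_maintype() != "image":
            continue
        data = part.get_payload(decode=True)
        text = ocr(data) if data else ""
        if text.strip():
            texts.append(text.strip())
    if not texts:
        return raw_bytes  # nothing recognized, pass the message through

    if not msg.is_multipart():
        # Single-part message (image only): wrap it in multipart/mixed
        # so that a text part can be attached next to it.
        inner = Message()
        for name in ("Content-Type", "Content-Transfer-Encoding",
                     "Content-Disposition"):
            if name in msg:
                inner[name] = msg[name]
                del msg[name]
        inner.set_payload(msg.get_payload())
        msg["Content-Type"] = "multipart/mixed"
        msg.set_payload([inner])

    msg.attach(MIMEText("\n".join(texts), "plain", "utf-8"))
    return msg.as_bytes()
```

Spambayes would then tokenize the attached text along with the rest of the
message, so training on such mails works the same way as for ordinary text
spam.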

Actually, you suggest something interesting. I'm going to try a few things
and iff they work, I'll post them to the list.

Amedee

PS: please don't do "reply all", reply to the list.
