[Spambayes] SpamBayes to Handle Embedded Images

FreeMJ@HotPop.com FreeMJ at HotPop.com
Mon Oct 3 01:43:36 CEST 2005

OCR is probably the only sure-fire way to nail this scourge.  As for it
being resource intensive: like most other people with always-on broadband
access now, my e-mail just trickles in a little at a time.  And most PCs
are powerful enough to stream video nowadays; they really shouldn't have a
problem with OCR being added as a feature.  It's a lot more disruptive to
manage these by hand, if you ask me.  And an OCR feature could allow
itself to be disabled, if it ended up being a performance problem for
It's gotta be done.  Now that these spammers have found an easy way to trick
these engines into digging through meaningless text, there'll be no slowing
them without OCR.  I'm getting more and more of this style of Spam.  Easy to
install/use programs like SpamBayes have to keep up with the times, or
they'll die on the vine.  Years ago, when we mostly exchanged text-based
e-mail, it wasn't an issue.  But now, nearly all of the e-mail I receive is
HTML; and lots of it has images.
I'm ONLY using SpamBayes with Outlook 2003 (at home, where I'm having all
the trouble).  I love the easy button-based re-training!  And I don't really
care for the idea of having to add, train, and administer another layer.
Other than a miraculous OCR feature showing up in SpamBayes soon, I'm out of
ideas for a simple way of managing this type of mail on my home PC.  (Very


From: spambayes-bounces at python.org [mailto:spambayes-bounces at python.org] On
Behalf Of Herb Martin
Sent: Sunday, October 02, 2005 12:43 PM
To: spambayes at python.org
Subject: Re: [Spambayes] SpamBayes to Handle Embedded Images

Back in April, Tony Meyer posted that he was receiving a lot of image-based
I too am having nothing but trouble with embedded images:
- Daily ads for fake Rolex watches
- Daily stock tips
- TONS of drugs for sale.
This style of Spam contains an image at the top, followed by a bunch of
totally unrelated text that has been copied from some kind of random
composition.  I have very large Spam & Ham folders that I've successfully
trained SpamBayes with.  It's only these image-based adverts that sneak by

Mostly my SpamBayes catches ALL of these when anything gets this far...
 Something really needs to be done about this type of Spam within SpamBayes.
Are any other Spam engines able to handle this stuff, by scanning the image
for text, or something?

Sure, there are others (as well as SpamBayes, if you just keep training on
EVERY ONE of them), but most of the others are either commercial (i.e., cost
money) OR they run on the server (SpamAssassin, greylistd, and other filters).

There has been talk about filters which would explicitly do OCR or some
other type of image content detection but I don't (personally) know of any
that are working/available/effective right now.

Such would also likely be "resource (CPU) intensive".
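
For what it's worth, here is a rough sketch of what the first step of such
a filter might look like: pulling the image parts out of a message so that
they could be handed to an OCR engine and turned into extra tokens.  It uses
only the Python standard library "email" package.  The names image_parts,
image_tokens, build_sample and the "ocr" callable are made up for this
illustration; none of them are part of SpamBayes, and no real OCR engine is
assumed here.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def image_parts(raw_message: bytes):
    """Return a list of (content_type, decoded_bytes) for every image part."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    found = []
    for part in msg.walk():
        if part.get_content_maintype() == "image":
            # decode=True undoes the base64 transfer encoding.
            payload = part.get_payload(decode=True) or b""
            found.append((part.get_content_type(), payload))
    return found


def image_tokens(raw_message: bytes, ocr=None):
    """Build extra tokens from the image parts of a message.

    `ocr` is a hypothetical callable taking image bytes and returning the
    recognised text.  When it is None, only cheap structural tokens are
    produced, which is how the feature could be switched off.
    """
    tokens = []
    for content_type, payload in image_parts(raw_message):
        tokens.append("image-type:" + content_type)
        if ocr is not None:
            tokens.extend("ocr:" + word.lower() for word in ocr(payload).split())
    return tokens


def build_sample() -> bytes:
    """Hypothetical test message: unrelated text plus one embedded image."""
    msg = EmailMessage()
    msg["Subject"] = "hello"
    msg.set_content("Some totally unrelated filler text.")
    msg.add_attachment(b"GIF89a-not-a-real-image",
                       maintype="image", subtype="gif", filename="ad.gif")
    return msg.as_bytes()
```

Calling image_tokens(build_sample()) yields only the structural token for
the GIF part; passing a real OCR function as "ocr" is where the CPU cost
mentioned above would come in.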

FWIW, greylisting on the server knocks down practically all of this junk and
SpamAssassin catches the rest.

The VERY occasional item that slips through our server is caught by
SpamBayes.  (Defense in depth is our key to ZERO spam -- with practically
everything REJECTED, not bounced, at the server during SMTP connect time.)
And some of us DO WISH to get graphical email -- pictures of my grandkid(s)
frequently arrive this way.

Herb Martin



From: spambayes-bounces at python.org [mailto:spambayes-bounces at python.org] On
Behalf Of FreeMJ at hotpop.com
Sent: Sunday, October 02, 2005 1:53 PM
To: spambayes at python.org
Subject: [Spambayes] SpamBayes to Handle Embedded Images

