[Spambayes] how spambayes handles image-only spams

Bill Yerazunis wsy at merl.com
Sun Sep 7 21:12:55 EDT 2003


   From: "Robert K. Coe" <bob at 1776.com>

   I wonder if you may be overlooking something that could skew your
   statistics. My experience has been that when I create an HTML
   message, Outlook actually sends it as a multi-part MIME construct
   incorporating both HTML and plain-text forms of the message. If the
   recipient reads the message with an HTML-capable email reader,
   he'll see the HTML form of the message; otherwise he'll see the
   plain-text form. If you're collecting your statistics with a
   plain-text mail reader, or if you're looking only at the plain-text
   version in a multi-part message, you may be understating the actual
   use of HTML in messages sent to you.

Actually, no.

I read email with Emacs, and I get the _WHOLE_ text (headers and
everything, multipart mimes, all that) just as it was received on the
SMTP port.  Additionally, I get a postprocessed section (not an
attachment) with all of the base64's expanded, <!--interruptus-->
comments removed, etc.
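
For anyone who wants to try the same idea, here is a minimal sketch of
that kind of postprocessing in Python, using only the standard library
email package.  It is not the code my filter actually runs; the names
postprocess() and learning_input() are hypothetical, and it assumes the
raw message is already available as a string.

```python
import base64
import re
from email import message_from_string

# Matches HTML comments, including ones wedged into the middle of a word.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def postprocess(raw_message):
    """Return the text parts with transfer encodings undone and HTML comments removed."""
    msg = message_from_string(raw_message)
    chunks = []
    for part in msg.walk():
        # Skip multipart containers and non-text leaves (images and so on).
        if part.is_multipart() or part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64 / quoted-printable and returns bytes.
        payload = part.get_payload(decode=True)
        charset = part.get_content_charset() or "latin-1"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset label: fall back rather than drop the text.
            text = payload.decode("latin-1", errors="replace")
        chunks.append(HTML_COMMENT.sub("", text))
    return "\n".join(chunks)


def learning_input(raw_message):
    """Whole message as received, followed by the postprocessed section."""
    return raw_message + "\n" + postprocess(raw_message)


# Small made-up multipart message with a base64-encoded HTML part.
sample_html = "<b>via<!-- interruptus -->gra</b>"
encoded = base64.b64encode(sample_html.encode("ascii")).decode("ascii")
raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XX"\n'
    "\n"
    "--XX\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "\n"
    "plain body\n"
    "--XX\n"
    "Content-Type: text/html; charset=us-ascii\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + encoded + "\n"
    "--XX--\n"
)

print(learning_input(raw))
```

The point of appending rather than replacing is that the learner still
sees the original headers and MIME structure, and in addition sees the
words the spammer tried to hide.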

That's what I feed back into the learning cycle; so I even get
things you probably don't get, like KOI-8 Russian text and text
that had been so sliced up by spammus interruptus that you can't
read it.

Do you do any reassembly, or is there any chance that you are 
not getting the ASCII text if there is any?

   In fact, if someone knows how to get Outlook to stop sending a
   plain-text version of HTML messages, I'd like to hear about it. Now
   that almost everybody can read HTML messages, I think the
   plain-text version is superfluous.

No, you have it the other way 'round.

The HTML version is superfluous; the plain text is all you need.  :-)

    -Bill Yerazunis


