[Spambayes] filter misclassification

K. H. Gowranga gowranga at serc.iisc.ernet.in
Mon May 9 12:29:22 CEST 2005


Hello,

On Mon, 9 May 2005, Tony Meyer wrote:

> > **************************************************************
> > Combined Score: 6% (0.0565183)
> > **************************************************************
> > Internal ham score (*H*): 0.997835
> > Internal spam score (*S*): 0.110872
>
> I'm not sure what the problem is here.  Your original mail said that you
> needed the message to be scored as ham, and that's what's happening here
> (0.06 is pretty close to 0).  Is this message actually spam?
>
> > # ham trained on: 771
> > # spam trained on: 56
>
> Note that this is a substantial imbalance, and we generally recommend that
> the database is kept approximately balanced.  See
> <http://entrian.com/sbwiki/TrainingIdeas> for more information.
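
As a sanity check on the scores quoted above, here is a minimal sketch that
recomputes the combined score from the internal ham and spam scores. It
assumes the combined score is derived as (S - H + 1) / 2, which is my
understanding of SpamBayes' chi-squared combining; the function name is only
illustrative and is not part of SpamBayes.

```python
# Internal scores copied from the quoted clues output.
ham_score = 0.997835   # *H*
spam_score = 0.110872  # *S*


def combined_score(h, s):
    """Assumed combining rule: map (S - H) from [-1, 1] onto [0, 1]."""
    return (s - h + 1.0) / 2.0


combined = combined_score(ham_score, spam_score)
# Should land very close to the reported 0.0565183 (about 6%).
print(f"{combined:.4f}")
```

A value this close to 0 means the classifier itself considers the message
ham.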

Indeed, I need the message to be scored as "ham". But even after retraining
with just 15 messages each of "spam" and "ham" (to keep the database
balanced), the following message was again delivered to my "spam" folder.
Interestingly, the message has no "X-Spambayes-Classification" header;
instead it has "X-Folder: Bulk".
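
To make the header observation concrete, here is a small sketch using
Python's standard email module to report which filing-related headers a
message carries. The sample text is a cut-down stand-in for the message
below, and the function name is hypothetical. A missing
"X-Spambayes-Classification" header would suggest, though not prove, that
SpamBayes never processed this copy.

```python
from email import message_from_string

# Cut-down stand-in for the delivered message (headers taken from the
# message below; the body is a placeholder).
SAMPLE = """\
Subject: Mysterious Cancer
X-Folder: Bulk

(body omitted)
"""


def filing_clues(raw_message):
    """Return the headers that hint at which filter filed the message."""
    msg = message_from_string(raw_message)
    return {
        # Header SpamBayes adds when it classifies a message.
        "spambayes": msg.get("X-Spambayes-Classification"),
        # Header found on the misfiled message; its origin is unknown here.
        "x_folder": msg.get("X-Folder"),
    }


clues = filing_clues(SAMPLE)
print(clues)
```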

Message

Return-Path: <bounce-21864-111417 at lyris.msfc.nasa.gov>
X-Original-To: gowranga at serc.iisc.ernet.in
Delivered-To: gowranga at serc.iisc.ernet.in
Received: from lyris.msfc.nasa.gov (www.spaceweather2.com [72.3.135.213])
        by serc.iisc.ernet.in (Postfix) with SMTP id 5320B18A5
        for <gowranga at serc.iisc.ernet.in>; Mon,  9 May 2005 11:43:14 +0530 (IST)
From: NASA Science News <snglist at snglist.msfc.nasa.gov>
To: NASA Science News <snglist at snglist.msfc.nasa.gov>
Subject: Mysterious Cancer
Date: Sun, 08 May 2005 23:27:56 -0500
MIME-Version: 1.0
Content-Type: multipart/alternative;
    boundary="MIMEBoundary3b7bc6f40fd9de9271dd11b6180be282"
List-Unsubscribe: <mailto:leave-snglist-111417C at lyris.msfc.nasa.gov>
Message-Id: <LYRIS-111417-21864-2005.05.08-23.27.57--gowranga#serc.iisc.ernet.in at lyris.msfc.nasa.gov>
X-Folder: Bulk
Parts/Attachments:
   1   OK     17 lines  Text (charset: ISO-8859-1)
   2 Shown    12 lines  Text (charset: ISO-8859-1)
----------------------------------------

<HTML><BODY>
NASA Science News for May 9, 2005<p>
Researchers agree that space radiation can cause cancer. They're just not
sure how.<p>
FULL STORY at<p>
<a href="http://science.nasa.gov/headlines/y2005/09may_mysteriouscancer.htm?list111417">http://science.nasa.gov/headlines/y2005/09may_mysteriouscancer.htm?list111417</a><p>
The Science at NASA Podcast feed is available at <a
href="http://science.nasa.gov/podcast.xml.">http://science.nasa.gov/podcast.xml.

...

Kindly suggest what could have gone wrong. I have subscribed to NASA Science
News and receive these messages regularly.
Thanks.

-gowranga


