[Spambayes] trained as ham, classified as spam

Gerrit Holl gerrit at nl.linux.org
Thu Oct 2 04:08:30 EDT 2003


Meyer, Tony wrote:
> Subject: RE: [Spambayes] trained as ham, classified as spam
> Date: Thu,  2 Oct 2003 01:40:48 +0200

> > I have trained the forwarded message as ham, and fed it to
> > sb_filter.py afterwards. It is still classified as spam! Does
> > this mean that it looks like ham, but looks much more like spam?
> > Is there a way to have this false positive classified as ham?
> 
> More useful than the message itself would be the clues that it generates
> with your database.  You can get these via the web interface (the
> classify message box).

Ah, this is indeed useful. I did not know this was possible; I thought the
web interface was only for POP. I have been able to answer the question
for myself now: it was a message that my ISP had wrongly flagged as
containing a virus, and all messages that were rightly flagged as such
have been trained as spam. Actually quite logical, now that I think about it.
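
As a rough illustration of why training the message as ham once did not
flip the result, here is a toy sketch of a Robinson-style per-token spam
probability. It is not SpamBayes' actual code. The function name, the
message counts, and the default values of s and x are assumptions chosen
for the example; the point is only that a token seen in many spam stays
strongly spammy after a single ham sighting.

```python
# Toy illustration only; this is NOT the SpamBayes implementation.
# All names and numbers here are hypothetical.

def token_spamprob(spam_count, ham_count, nspam, nham, s=0.45, x=0.5):
    """Adjusted spam probability for one token.

    spam_count / ham_count: trained spam / ham messages containing the token
    nspam / nham:           total trained spam / ham messages
    s, x:                   strength and value of the prior for rare tokens
                            (assumed values for this sketch)
    """
    n = spam_count + ham_count
    if n == 0:
        return x  # token never seen: fall back to the prior
    spam_ratio = spam_count / nspam
    ham_ratio = ham_count / nham
    p = spam_ratio / (spam_ratio + ham_ratio)
    # Pull the raw estimate towards the prior x; the pull weakens as n grows.
    return (s * x + n * p) / (s + n)

# Hypothetical database: 600 spam and 400 ham trained (about 60% spam),
# and a token from the ISP's virus notice that appeared in 50 trained spam.
before = token_spamprob(50, 0, nspam=600, nham=400)

# The same token after the false positive is trained as ham once.
after = token_spamprob(50, 1, nspam=600, nham=401)

print(f"before: {before:.3f}  after: {after:.3f}")
```

With numbers like these the score drops only slightly, so the message
would need either more ham containing the same tokens or fewer
near-identical spam in the training set before the balance shifts.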

> You should be able to see why it's being
> classified as spam (i.e. which tokens are making the difference), but
> you can post them here, too, and someone is bound to comment.  It would
> also help to know how much data you have fed to spambayes - note that it
> works best with roughly equal numbers of ham and spam, so if this isn't
> the case, then that might be the problem.

Well, it is roughly equal, although I don't know how rough "roughly" is.
I trained with a little more spam than ham: probably about 60% spam.

Maybe it would be best to restart training, because although the amounts of
ham and spam were roughly balanced, a lot of the trained spam was similar.
I guess I have to learn to work with spambayes better.

Thanks for your answer!

Gerrit.


-- 
204. If a freed man strike the body of another freed man, he shall pay
ten shekels in money.
        -- 1780 BC, Hammurabi, Code of Law
--
Asperger Syndrome - a personal approach:
	http://people.nl.linux.org/~gerrit/
These are times to get involved in politics yourself:
	http://www.sp.nl/
