[spambayes-dev] Dibbler.py error in training
sean darcy
seandarcy at hotmail.com
Tue Apr 6 14:22:33 EDT 2004
----Original Message Follows----
From: "Kenny Pitt" <kennypitt at hotmail.com>
To: "'sean darcy'"
<seandarcy at hotmail.com>,<skip at pobox.com>
CC: <spambayes-dev at python.org>
Subject: RE: [spambayes-dev] Dibbler.py error in training
Date: Tue, 6 Apr 2004 09:13:39 -0400
...................................
>Oops, looks like I misread the original error message. The fix I put in
>is probably a useful safeguard, but not the one that was causing the
>problem.
>
>In looking more closely, though, something seems a little odd here. The
>offending object that is coming back None appears to be the msg[header]
>reference. If I'm not mistaken, that means that either the Subject: or
>To: header is missing entirely from the message, which is very unusual.
It's not that unusual for the Subject header to be missing. Looking over
past emails, I've found some "ham" posts that had no subject. In any event,
some of the posts to be trained do have no Subject - all spam.
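To illustrate the point about msg[header] coming back None: Python's standard email package returns None for an absent header instead of raising an error. The following is a minimal sketch, assuming a current Python 3 interpreter rather than the one in use in 2004. The helper name header_or_default is hypothetical and is not part of SpamBayes.

```python
from email import message_from_string

# A message with no Subject: and no To: header, similar in shape to the
# spam quoted in this post (the address is a placeholder).
raw = (
    "From: someone@example.com\n"
    "Date: Mon, 29 Mar 2004 08:34:03 -0500\n"
    "\n"
    "body text\n"
)
msg = message_from_string(raw)


def header_or_default(message, name, default=""):
    """Return the header value, or `default` when the header is absent."""
    value = message[name]  # Message.__getitem__ gives None if missing
    return default if value is None else value


subject = msg["Subject"]                                  # header is absent
safe_subject = header_or_default(msg, "Subject", "(none)")
```

Any code that calls string methods directly on msg["Subject"] would fail on such a message, which is why a guard of this kind is useful.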
Here's an example from "tokens" on the untrained message page:
Tokens for: (none) (15)
Word                                 Probability  Times in ham  Times in spam
content-type:text/plain              0.288326     1576          556
from:addr:qziwpklwit                 -            0             0
from:addr:musician.org               0.844828     0             1
from:no real name:2**0               0.186886     825           165
to:none                              0.878691     2             14
cc:none                              0.351951     979           463
sender:none                          0.410456     978           593
reply-to:none                        0.271479     746           242
x-mailer:none                        0.417812     832           520
message-id:@mta13.srv.hcvlny.cv.net  0.844828     0             1
header:Date:1                        0.500287     1742          1519
header:Received:3                    0.77726      215           654
header:Message-id:1                  0.907877     144           1238
header:From:1                        0.500718     1739          1519
header:Return-path:1                 0.940104     95            1302
Here's the message source:
Return-path: <qziwpklwit at musician.org>
Received: from mta13.srv.hcvlny.cv.net (mta13.srv.hcvlny.cv.net
[167.206.5.82])
by mstr9.srv.hcvlny.cv.net
(iPlanet Messaging Server 5.2 HotFix 1.16 (built May 14 2003))
with ESMTP id <0HVC00G0PB4QME at mstr9.srv.hcvlny.cv.net>; Mon,
29 Mar 2004 08:36:26 -0500 (EST)
Received: from f94006.upc-f.chello.nl (f94006.upc-f.chello.nl [80.56.94.6])
by mta13.srv.hcvlny.cv.net
(iPlanet Messaging Server 5.2 HotFix 1.16 (built May 14 2003))
with SMTP id <0HVC00ISEAU5TL at mta13.srv.hcvlny.cv.net>; Mon,
29 Mar 2004 08:34:03 -0500 (EST)
Received: from 123.224.24.65 by 80.56.94.6 with qdtrhun [1
Date: Mon, 29 Mar 2004 08:34:03 -0500 (EST)
Date-warning: Date header was inserted by mta13.srv.hcvlny.cv.net
From: qziwpklwit at musician.org
Message-id: <0HVC00IM1B0CTL at mta13.srv.hcvlny.cv.net>
Content-transfer-encoding: 7BIT
X-Spambayes-Classification: unsure
X-Spambayes-Spam-Probability: 0.84
X-Spambayes-Level: ********
X-Spambayes-MailId: 1080858684-6
>Could you, by chance, attach a copy of the message that is causing the
>error?
The untrained message page has about 60 messages. How do I know which one is
the problem?
>A copy of it should appear as a file in one of the cache
>directories below the directory containing your training database, or
>you could just view the message source from Review Messages and
>copy-and-paste it.
You've lost me. Here's my spambayes data directory:
ls
bayescustomize.ini _pop3proxy.log pop3proxy-spam-cache
bayescustomize.ini~ pop3proxy.log-1 pop3proxy-unknown-cache
bayescustomize.ini.bak pop3proxy.log-evolution spambayes.messageinfo.db
hammie.db pop3proxy.log-evolution~ start.info
pop3proxy-ham-cache pop3proxy.log-mozilla
When I grep for the odd "From" name I get nothing:
grep -R qziwpklwit *
I'm looking for spam in all the wrong places.
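One way to narrow down which cached message lacks a Subject: or To: header is to parse every file under a cache directory and report the ones where those headers are missing. The sketch below assumes a current Python 3 interpreter and assumes each cache file holds one plain-text message; if the files are stored in some other form (compressed, for instance), they would need decoding first. The function name find_messages_missing_headers is hypothetical and not part of SpamBayes.

```python
import os
from email import message_from_binary_file


def find_messages_missing_headers(root, required=("Subject", "To")):
    """Walk `root` and list (path, missing_headers) for each file whose
    parsed message lacks at least one of the `required` headers."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            with open(path, "rb") as handle:
                msg = message_from_binary_file(handle)
            # msg[name] is None when the header is absent.
            missing = [name for name in required if msg[name] is None]
            if missing:
                hits.append((path, missing))
    return sorted(hits)
```

Calling find_messages_missing_headers("pop3proxy-unknown-cache") from the data directory would list the candidates. Point it only at the cache directories, since a non-message file such as a database would also be reported as having no headers.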
>--
>Kenny Pitt
sean