[Spambayes] Still getting horrible results from SpamBayes - Any advice?

Adam Lasnik alasnik at smilezone.com
Sun Sep 14 17:59:19 EDT 2003


Thanks for all your prompt help!

Here are some clarifications.

- There are about 1,000 each of spam and ham in my SpamBayes database

- By "false positives," I mean that SpamBayes classifies these spams as
ham rather than unsure (strictly speaking, these are false negatives).

- I realize the Nigerian scams use different language, but with these
and many of the other spams SpamBayes is missing, there is still quite a
bit of consistent language.  And, for instance, the fake "I saw your
profile!" spams use almost the exact same language each time, and 99% of
the time, the listed e-mail address is from Hotmail (when barely any of
my friends use Hotmail).  These should register as something other than
0% or 4%.

- Changing my cutoffs wouldn't help.  This isn't a matter of SpamBayes
ALMOST getting it right (e.g., 78% 'spam'), but rather, not even coming
close.

- I've included a typical "I saw your profile" spam below.  Note that it
now shows a spam score of 85%, but this is AFTER I reclassified it.  Would
it be more helpful for me to submit the Clues to this list from an e-mail
before I reclassify it?
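
For reference, the ham / unsure / spam decision discussed in the points above can be sketched as follows. This is only an illustration: the cutoffs 0.20 and 0.90 are assumed here to be the SpamBayes defaults (a given installation may differ), and `classify` is a hypothetical helper, not a SpamBayes function.

```python
# Assumed default cutoffs; the real values depend on the user's configuration.
HAM_CUTOFF = 0.20
SPAM_CUTOFF = 0.90

def classify(score, ham_cutoff=HAM_CUTOFF, spam_cutoff=SPAM_CUTOFF):
    """Map a combined spam score in [0, 1] to 'ham', 'unsure', or 'spam'."""
    if score < ham_cutoff:
        return "ham"
    if score < spam_cutoff:
        return "unsure"
    return "spam"

# Scores mentioned in this message: 0%, 4%, 78%, and the 85% shown below.
for score in (0.00, 0.04, 0.78, 0.8516):
    print(f"{score:.4f} -> {classify(score)}")
```

Under these assumed cutoffs, scores of 0% or 4% sit far inside the ham range, so only an impractically low ham cutoff would move them to unsure.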

Thanks again for your help!

Regards,
Adam
http://smilezone.com/ -- Everything to make you smile :)  

-----

Spam Score: 85% (0.8516)


word                                spamprob         #ham  #spam
'*H*'                               0.134717            -      -
'*S*'                               0.837917            -      -
"i'm"                               0.0796865         330     33
'old'                               0.197553           81     23
'found'                             0.204999          164     49
'years'                             0.228916          154     53
'saw'                               0.230668           55     19
'pictures'                          0.232825           60     21
'back'                              0.264985          215     90
'name'                              0.299411          123     61
'email'                             0.304498          446    227
'from:addr:yahoo.com'               0.313371          134     71
'skip:i 10'                         0.314242          216    115
'want'                              0.321543          256    141
'and'                               0.366785          947    638
'female'                            0.382766           14     10
'your'                              0.399498          809    626
'header:Received:2'                 0.694116          198    523
'skip:[ 40'                         0.707537            2      6
'byee...'                           0.828196            0      1
'email name:willie_342_morning'     0.828196            0      1
'from:addr:girlacm'                 0.828196            0      1
'subject:...#!!'                    0.828196            0      1
'to:addr:imo'                       0.828196            0      1
'anna...^'                          0.896278            0      2
'subject:hello...'                  0.896278            0      2
'message-id:@rogue.bloghosts.com'   0.922656            7    100
'^^^^'                              0.942138            0      4
'whatever..'                        0.969292            0      8
'x-mailer:microsoft outlook express 6.00.2462.0000' 0.972516     0      9

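The headline score can be recovered from the '*H*' and '*S*' rows at the top of the clue table. A minimal sketch, assuming the final step of chi-squared combining is score = (S - H + 1) / 2; `combine` is an illustrative name, not part of SpamBayes.

```python
def combine(h, s):
    """Fold the ham evidence H and spam evidence S into one score in [0, 1].

    Assumed final step of chi-squared combining: (S - H + 1) / 2.
    """
    return (s - h + 1.0) / 2.0

# Values taken from the '*H*' and '*S*' rows of the clue table.
score = combine(h=0.134717, s=0.837917)
print(f"{score:.4f}")  # 0.8516
```

Plugging in the two rows gives 0.8516, which matches the "Spam Score: 85% (0.8516)" line.
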
Message Stream:


Status: U
Return-Path: <girlaCm at yahoo.com>
Received: from rogue.bloghosts.com ([64.246.52.17])
	by albert.mail.atl.earthlink.net (Earthlink Mail Service) with ESMTP id 19YAl84jH3Nl3qU0
	for <alasnik at mindspring.com>; Sun, 14 Sep 2003 13:07:10 -0400 (EDT)
Received: from [217.84.167.152] (helo=sender1003)
	by rogue.bloghosts.com with esmtp (Exim 4.20) id 19yaL5-0004VC-IS
	for thatadamguy at smilezone.com; Sun, 14 Sep 2003 13:07:08 -0400
From: girlaCm at yahoo.com
Subject: hello...#!!
Content-Type: text/plain
Content-Transfer-Encoding: text/plain
Date: Sun, 15 Sep 2002 01:03:48 -0700
X-Priority: 3
X-Library: Indy 10.00.14-B
To: imO
X-Mailer: Microsoft Outlook Express 6.00.2462.0000
Message-Id: <E19yaL5-0004VC-IS at rogue.bloghosts.com>
X-AntiAbuse: This header was added to track abuse,
	please include it with any abuse report
X-AntiAbuse: Primary Hostname - rogue.bloghosts.com
X-AntiAbuse: Original Domain - smilezone.com
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - yahoo.com




Hello,  I'm 22 years old female and my name is Anna...^  I saw your
profile on the net and found to be interesting.. email me back at
Willie_342_morning at hotmail.com if you want to exchange pictures or
whatever.. 
^^^^
Hugs, byee...








[U/.QWL,b61NpB)t.2qH=?Buj\T?n.R;gBb:ce6\s0]

Message Tokens:

49 unique tokens

'^^^^'
'and'
'anna...^'
'back'
'byee...'
'cc:none'
'content-type:text/plain'
'email'
'email addr:hotmail.com'
'email name:willie_342_morning'
'exchange'
'female'
'found'
'from:addr:girlacm'
'from:addr:yahoo.com'
'from:no real name:2**0'
'header:Date:1'
'header:From:1'
'header:Message-Id:1'
'header:Received:2'
'header:Return-Path:1'
'header:Subject:1'
'header:To:1'
'hello,'
'hugs,'
"i'm"
'message-id:@rogue.bloghosts.com'
'name'
'net'
'old'
'pictures'
'profile'
'reply-to:none'
'saw'
'sender:none'
'skip:[ 40'
'skip:i 10'
'subject:...#!!'
'subject:hello...'
'the'
'to:2**0'
'to:addr:imo'
'to:no real name:2**0'
'want'
'whatever..'
'x-mailer:microsoft outlook express 6.00.2462.0000'
'years'
'you'
'your'
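
The body tokens in the list above follow a simple pattern, which explains entries such as 'skip:i 10' and 'skip:[ 40'. The sketch below is a simplified approximation of that pattern, not the SpamBayes tokenizer itself: it lowercases the text, splits on whitespace, keeps words of 3 to 12 characters, splits longer words containing a single '@' into name and address tokens, and summarizes any other long word by its first character and its length rounded down to a multiple of 10. Header-derived tokens are left out, `tokenize_body` is a hypothetical name, and the archive's " at " in the Hotmail address is assumed to stand for "@".

```python
def tokenize_body(text, min_len=3, max_len=12):
    """Simplified approximation of the body-token pattern seen above."""
    tokens = set()
    for word in text.lower().split():
        n = len(word)
        if n < min_len:
            continue  # very short words are dropped
        if n <= max_len:
            tokens.add(word)
        elif word.count("@") == 1:
            # Long word that looks like an e-mail address.
            name, addr = word.split("@")
            tokens.add("email name:" + name)
            tokens.add("email addr:" + addr)
        else:
            # Other long words: keep only first char and rounded length.
            tokens.add("skip:%s %d" % (word[0], n // 10 * 10))
    return tokens

body = ("Hello,  I'm 22 years old female and my name is Anna...^  I saw your "
        "profile on the net and found to be interesting.. email me back at "
        "Willie_342_morning@hotmail.com if you want to exchange pictures or "
        "whatever..")
for token in sorted(tokenize_body(body)):
    print(token)
```

Under these assumptions, "interesting.." (13 characters) becomes 'skip:i 10', and the 43-character bracketed string at the end of the message becomes 'skip:[ 40'.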



