[spambayes-dev] Re: 1070 spam, 1 false positive
Greg Ward
greg at python.org
Fri Jun 20 23:06:30 EDT 2003
On 20 June 2003, Martijn Pieters said:
> Sorry to be a party pooper, but there were more false positives; I rescued
> 2 earlier this week. The following message was also marked as spam:
Darn. But there's good news:
> From: "Tom Deprez" <tom at aragne.com>
> To: <europython at python.org>,
> <europython-announce at python.org>,
> <zope-announce at zope.org>,
> <python-announce at python.org>,
> <eurozope at comlounge.net>
> Subject: EuroPython news
> Date: Mon, 16 Jun 2003 14:43:45 +0200
This one was rejected fairly early in the Spambayes regime. I just
scored it with the current training set, and it scored < 0.1.
Also, for some reason the envelope recipient of that message was *just*
zope-announce at zope.org, in spite of what the "To" header says. I bet if
that message had really been sent to europython at python.org, it would
have been flagged UNSURE. No way to tell now, though, since I don't
have the training DB from Monday.
> From: "Morten W. Petersen" <morten at nidelven-it.no>
> To: zope-dev at zope.org
> Subject: Renaming a product
> X-Mailer: NeoMail 1.25
> X-IPAddress: 80.202.17.36
> MIME-Version: 1.0
> Content-Type: text/plain; charset=iso-8859-1
> Message-Id: <E19SKzG-0002Fj-00 at dns.activemedia.no>
> Date: Tue, 17 Jun 2003 20:15:18 +0200
> X-Virus-Scanned: by AMaViS 0.3.12
> X-AntiAbuse: This header was added to track abuse, please include it with any abuse report
> X-AntiAbuse: Primary Hostname - dns.activemedia.no
> X-AntiAbuse: Original Domain - zope.org
> X-AntiAbuse: Originator/Caller UID/GID - [32940 1441] / [32940 1441]
> X-AntiAbuse: Sender Address Domain - dns.activemedia.no
> X-Spam-Status: SPAM (lists-zope 0.854)
And this one was treated very badly because of the X-AntiAbuse headers;
here's how it scores with the current DB:
Y 0.869 save/ham/cur/19SKzr-0000NR-00:2,S
'*H*': 0.060
'*S*': 0.797
'all,': 0.065
'message-id:skip:d 10': 0.065
'zodb': 0.065
'does': 0.086
'product,': 0.092
'thanks,': 0.092
'(with': 0.155
'instances': 0.155
'python,': 0.155
'return-path:skip:d 10': 0.155
'date:0200': 0.173
'date:Tue': 0.191
'anyone': 0.230
'content-type:text/plain': 0.266
'received:62': 0.303
'product': 0.379
'know': 0.380
'header:Received:3': 0.388
'to:no real name:2**0': 0.610
'date:Jun': 0.627
'after': 0.635
'charset:iso-8859-1': 0.641
'work': 0.645
'proto:http': 0.656
'stored': 0.666
'new': 0.715
'to:addr:zope-dev': 0.789
'to:dev': 0.789
'url:www': 0.800
'number:': 0.811
'x-antiabuse:Address': 0.811
'x-antiabuse:Caller': 0.811
'x-antiabuse:Domain': 0.811
'x-antiabuse:GID': 0.811
'x-antiabuse:Hostname': 0.811
'x-antiabuse:Original': 0.811
'x-antiabuse:Originator': 0.811
'x-antiabuse:Primary': 0.811
'x-antiabuse:Sender': 0.811
'x-antiabuse:This': 0.811
'x-antiabuse:UID': 0.811
'x-antiabuse:abuse': 0.811
'x-antiabuse:added': 0.811
'x-antiabuse:any': 0.811
'x-antiabuse:header': 0.811
'x-antiabuse:include': 0.811
'x-antiabuse:please': 0.811
'x-antiabuse:report': 0.811
'x-antiabuse:track': 0.811
'x-antiabuse:was': 0.811
'x-antiabuse:with': 0.811
'x-antiabuse:zope.org': 0.811
'phone': 0.971
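For anyone wondering how the '*H*', '*S*' and overall score relate to the clue list, here is a minimal sketch. It assumes the chi-squared combining scheme, which is consistent with the numbers above since (0.797 - 0.060 + 1) / 2 = 0.8685. The names `chi2q`, `combine`, `OTHER_CLUES` and `X_ANTIABUSE` are illustrative, not Spambayes identifiers, and the probabilities are the rounded values from the listing, so the result should come out close to the reported figures rather than match them digit for digit.

```python
import math

def chi2q(x2, dof):
    """P(chi-squared with `dof` degrees of freedom >= x2), for even dof."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def combine(probs):
    """Return (H, S, score) for a list of per-clue spam probabilities."""
    n = len(probs)
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return h, s, (s - h + 1.0) / 2.0

# Clue probabilities copied from the listing above (rounded to 3 places).
OTHER_CLUES = (
    [0.065] * 3 + [0.086] + [0.092] * 2 + [0.155] * 4 +
    [0.173, 0.191, 0.230, 0.266, 0.303, 0.379, 0.380, 0.388,
     0.610, 0.627, 0.635, 0.641, 0.645, 0.656, 0.666, 0.715,
     0.789, 0.789, 0.800, 0.811, 0.971]
)
X_ANTIABUSE = [0.811] * 22   # the 22 'x-antiabuse:*' clues

h, s, score = combine(OTHER_CLUES + X_ANTIABUSE)
print("H=%.3f S=%.3f score=%.3f" % (h, s, score))
```

Twenty-two identical 0.811 clues out of 53 are enough to pull S up and H down, which is why a single repeated header family dominates the result.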
But if I add x-antiabuse to basic_header_skip, it comes through fine:
N 0.085 save/ham/cur/19SKzr-0000NR-00:2,S
'*H*': 0.877
'*S*': 0.047
'all,': 0.065
'message-id:skip:d 10': 0.065
'zodb': 0.065
'does': 0.086
'product,': 0.092
'thanks,': 0.092
'(with': 0.155
'instances': 0.155
'python,': 0.155
'return-path:skip:d 10': 0.155
'date:0200': 0.173
'date:Tue': 0.191
'anyone': 0.230
'content-type:text/plain': 0.266
'received:62': 0.303
'product': 0.379
'know': 0.380
'header:Received:3': 0.388
'to:no real name:2**0': 0.610
'date:Jun': 0.627
'after': 0.635
'charset:iso-8859-1': 0.641
'work': 0.645
'proto:http': 0.656
'stored': 0.666
'new': 0.715
'to:addr:zope-dev': 0.789
'to:dev': 0.789
'url:www': 0.800
'number:': 0.811
'phone': 0.971
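The only difference between the two clue lists is that every 'x-antiabuse:*' token is gone. A rough sketch of how a header skip list can have that effect at tokenization time follows. It is a toy stand-in, not the Spambayes tokenizer: `header_tokens`, `WORD_RE` and `RAW` are hypothetical names, the word-splitting rule is simplified, and the real basic_header_skip option may be parsed and matched differently.

```python
import re
from email import message_from_string

WORD_RE = re.compile(r"[\w.]+")   # simplified word splitter (assumption)

def header_tokens(msg, skip=()):
    """Emit 'headername:word' tokens, ignoring headers matched by `skip`."""
    skip_res = [re.compile(p, re.IGNORECASE) for p in skip]
    tokens = []
    for name, value in msg.items():
        lname = name.lower()
        # A header is dropped when its whole name matches a skip pattern.
        if any(r.fullmatch(lname) for r in skip_res):
            continue
        tokens.extend("%s:%s" % (lname, w) for w in WORD_RE.findall(value))
    return tokens

RAW = """\
Subject: Renaming a product
X-AntiAbuse: Primary Hostname - dns.activemedia.no
X-AntiAbuse: Original Domain - zope.org

body
"""

msg = message_from_string(RAW)
print(header_tokens(msg))                        # includes x-antiabuse:* tokens
print(header_tokens(msg, skip=["x-antiabuse"]))  # only subject:* tokens remain
```

Because skipped headers never produce tokens, they contribute nothing to training or scoring, so the training DB has to be rebuilt for the change to take full effect.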
I'm building training DBs with x-antiabuse excluded now, to see whether
it helps or hurts. Another lively Friday night chez Greg...
Greg
--
Greg Ward <gward at python.net> http://www.gerg.ca/
Never put off till tomorrow what you can put off till the day after tomorrow.