[Spambayes] "Lindsey Carter": Re: [Zope-Annce] New zope.org development

Mon Nov 4 20:19:53 2002

If we ran a similar Bayesian analysis on the pages that spam links 
point to, and used the result as another set of clues for 
classification, would it have made a difference in this instance, and 
in general?  By that I mean: once a mail has been classified as spam, 
we could fetch the pages the mail points to and train a separate 
wordlist classifier on them.  In Unsure cases we could then fetch the 
pages the mail links to and apply that webpage wordlist classifier to 
them.  If a page looks like a probable spam-pointed-to page, then the 
mail is probably spam...  or at least that could weigh (heavily) into 
the statistics for the words in the mail itself.
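
A rough sketch of what I have in mind, in Python.  Everything here is 
an assumption for illustration: WordlistClassifier is a toy naive 
Bayes stand-in (not the SB classifier), fetch_page is a hypothetical 
callable that returns the text of a URL (or an empty string if the 
fetch fails), and the cutoffs and page_weight are made-up numbers that 
would need tuning against real data.

```python
import math
import re
from collections import Counter

# Crude URL matcher; good enough for a sketch, not for real mail.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


class WordlistClassifier:
    """Toy naive Bayes over word presence (a stand-in, not SpamBayes)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # One token per distinct lowercase word in the text.
        return set(re.findall(r"[a-z0-9$']+", text.lower()))

    def train(self, text, label):
        self.docs[label] += 1
        self.counts[label].update(self.tokenize(text))

    def spam_prob(self, text):
        # Sum log-odds per word, with add-one smoothing.
        log_odds = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for word in self.tokenize(text):
            p_spam = (self.counts["spam"][word] + 1) / (self.docs["spam"] + 2)
            p_ham = (self.counts["ham"][word] + 1) / (self.docs["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so exp() cannot overflow on very long pages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


def classify_mail(mail_text, mail_score, page_clf, fetch_page,
                  ham_cutoff=0.2, spam_cutoff=0.9, page_weight=0.75):
    """Use linked-page scores only to break ties in the Unsure range.

    mail_score is the spam probability the mail classifier already
    produced; the cutoffs and page_weight are assumed values.
    """
    if mail_score <= ham_cutoff:
        return "ham", mail_score
    if mail_score >= spam_cutoff:
        return "spam", mail_score

    # Unsure: score every page the mail links to.
    pages = [fetch_page(url) for url in URL_RE.findall(mail_text)]
    page_scores = [page_clf.spam_prob(page) for page in pages if page]
    if not page_scores:
        return "unsure", mail_score

    # Weigh the spammiest linked page (heavily) into the mail's score.
    combined = (1 - page_weight) * mail_score + page_weight * max(page_scores)
    if combined >= spam_cutoff:
        return "spam", combined
    if combined <= ham_cutoff:
        return "ham", combined
    return "unsure", combined
```

The page classifier would be trained on pages fetched from mail 
already classified as spam or ham.  Taking the maximum over the linked 
pages is one arbitrary choice; an average, or feeding the page words 
back in as extra tokens for the mail, might work just as well.
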

- TimS

11/4/2002 2:07:23 PM, Tim Peters <tim.one@comcast.net> wrote:

>[Guido]
>> This smells like a clever spam,
>
>Click on the link to Lindsey's webpage if you have lingering doubts.
>
>> disguised as a zope-announce message I sent.  SA scored it -2.8:
>>
>>     X-Spam-Status: No, hits=-2.8 required=5.0
>> tests=BODY_PYTHON_ZOPE,CLICK_BELOW,FROM_BIGISP,FROM_ENDS_IN_NUMS,
>> QUOTED_EMAIL_TEXT,SPAM_PHRASE_03_05,SUBJ_PYTHON_ZOPE
>>
>> Wonder if SB will do any better...
>
>Absolutely:  SB never gives negative scores <wink>.
>
>Barry once floated the idea of trying to strip quoted text in the tokenizer,
>but nobody (AFAIK) tried that.  Short of something like that, I expect the
>best you can hope for is that this will end up in your Unsure category.  I
>believe that QUOTED_EMAIL_TEXT means SA gave it a *ham* boost for containing
>quoted email.  The "Re:" in the subject line is a clue of that sort for SB
>too, along with various words starting w/ ">".
>
- Tim
www.fourstonesExpressions.com