[Spambayes] "Lindsey Carter": Re: [Zope-Annce] New zope.org development
Tim@mail.powweb.com
Mon Nov 4 20:19:53 2002
If we were to run a similar Bayesian analysis of the pages that spam
links point to, and used that information as another set of clues for
classification, would that have made a difference in this instance,
and in general? By that I mean: once a mail has been classified as
spam, we could fetch the pages that the mail points to and build a
similar wordlist classifier from them. That classifier could then be
used in Unsure cases: fetch the pages the mail points to and score
them against the webpage wordlist. If a page looks like a probable
spam-pointed-to page, then the mail is probably spam... or at least
that could weigh (heavily) into the statistics for the words in the
mail itself.
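
A rough sketch of what I have in mind, in Python. Everything here is an assumption made for illustration: WordlistClassifier is a toy naive-Bayes wordlist and not the Spambayes classifier, fetch_page is a hypothetical callable that returns a page's text for a URL (or None), and the Unsure band and page_weight values are made-up knobs.

```python
import math
import re
from collections import Counter

URL_RE = re.compile(r"https?://[^\s<>\"']+")
WORD_RE = re.compile(r"[a-z0-9$'!-]+")


def tokenize(text):
    """Lowercase the text and split it into crude word tokens."""
    return WORD_RE.findall(text.lower())


class WordlistClassifier:
    """Toy naive-Bayes wordlist (a stand-in, not the Spambayes classifier)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Count each token once per document.
        for tok in set(tokenize(text)):
            self.counts[label][tok] += 1
        self.totals[label] += 1

    def score(self, text):
        """Return a spam probability in [0, 1]; 0.5 means no evidence."""
        log_odds = 0.0
        for tok in set(tokenize(text)):
            spam_n = self.counts["spam"][tok]
            ham_n = self.counts["ham"][tok]
            if spam_n == 0 and ham_n == 0:
                continue  # never-seen token carries no information
            # Laplace smoothing so a one-sided token is not infinitely strong.
            p_spam = (spam_n + 1) / (self.totals["spam"] + 2)
            p_ham = (ham_n + 1) / (self.totals["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        log_odds = max(-50.0, min(50.0, log_odds))  # avoid exp() overflow
        return 1 / (1 + math.exp(-log_odds))


def classify_mail(mail_text, mail_clf, page_clf, fetch_page,
                  unsure=(0.2, 0.9), page_weight=0.7):
    """Score a mail; consult linked pages only when the mail is Unsure."""
    mail_score = mail_clf.score(mail_text)
    low, high = unsure
    if not (low < mail_score < high):
        return mail_score  # already confidently ham or spam

    page_scores = []
    for url in URL_RE.findall(mail_text):
        page_text = fetch_page(url)  # hypothetical fetcher, may return None
        if page_text:
            page_scores.append(page_clf.score(page_text))
    if not page_scores:
        return mail_score

    # Let the spammiest linked page weigh heavily into the final score.
    worst = max(page_scores)
    return (1 - page_weight) * mail_score + page_weight * worst
```

The page wordlist would be trained on pages pulled from mail already classified as spam or ham, e.g. page_clf.train(page_text, "spam"). Taking the maximum page score is just one choice; averaging over all linked pages would be gentler on mail that links to one dodgy page among many good ones.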
- TimS
11/4/2002 2:07:23 PM, Tim Peters <tim.one@comcast.net> wrote:
>[Guido]
>> This smells like a clever spam,
>
>Click on the link to Lindsey's webpage if you have lingering doubts.
>
>> disguised as a zope-announce message I sent. SA scored it -2.8:
>>
>> X-Spam-Status: No, hits=-2.8 required=5.0
>> tests=BODY_PYTHON_ZOPE,CLICK_BELOW,FROM_BIGISP,FROM_ENDS_IN_NUMS,
>> QUOTED_EMAIL_TEXT,SPAM_PHRASE_03_05,SUBJ_PYTHON_ZOPE
>>
>> Wonder if SB will do any better...
>
>Absolutely: SB never gives negative scores <wink>.
>
>Barry once floated the idea of trying to strip quoted text in the tokenizer,
>but nobody (AFAIK) tried that. Short of something like that, I expect the
>best you can hope for is that this will end up in your Unsure category. I
>believe that QUOTED_EMAIL_TEXT means SA gave it a *ham* boost for containing
>quoted email. The "Re:" in the subject line is a clue of that sort for SB
>too, along with various words starting w/ ">".
>
- Tim
www.fourstonesExpressions.com