[spambayes-bugs] [ spambayes-Feature Requests-922840 ] Score
multipart/alternative separately
SourceForge.net
noreply at sourceforge.net
Mon May 3 20:47:28 EDT 2004
Feature Requests item #922840, was opened at 2004-03-24 16:27
Message generated for change (Comment added) made by leobru
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=498106&aid=922840&group_id=61702
Category: None
Group: None
Status: Open
Priority: 5
Submitted By: Leonid (leobru)
Assigned to: Nobody/Anonymous (nobody)
Summary: Score multipart/alternative separately
Initial Comment:
A growing amount of spam uses multipart/alternative content
in which the text/plain part is a piece of innocuous prose
or similar filler, and the text/html part carries the UCE.
My proposal is:
- compute two separate sets of scores: one as if the
text/plain part were empty, and one as if the text/html
part were empty
- to compute the final score, use min(plain_hamscore,
html_hamscore) and max(plain_spamscore, html_spamscore),
because any disparity between the parts is by itself a
spam indicator.
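
The combination rule proposed above could be sketched roughly as follows. This is only an illustration, not SpamBayes code: the function names and the (ham_score, spam_score) tuple layout are hypothetical, and the final step assumes the overall score is derived from the two evidence values as (spam - ham + 1) / 2, as in chi-squared combining.

```python
def combine_alternatives(plain, html):
    """Combine per-part scores for a multipart/alternative message.

    Each argument is a hypothetical (ham_score, spam_score) pair in [0, 1],
    computed as if the other alternative part were empty.
    """
    ham = min(plain[0], html[0])    # weakest ham evidence of the two parts
    spam = max(plain[1], html[1])   # strongest spam evidence of the two parts
    return ham, spam


def final_score(ham, spam):
    """Assumed mapping of the two evidence values onto a single 0..1 score."""
    return (spam - ham + 1.0) / 2.0


# Innocent-looking text/plain part, spammy text/html part (made-up numbers).
ham, spam = combine_alternatives(plain=(0.99, 0.01), html=(0.02, 0.97))
print(ham, spam, final_score(ham, spam))
```

With this rule, a message whose parts disagree is scored by its more suspicious part, so ham-looking filler in one alternative cannot offset the other.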
----------------------------------------------------------------------
>Comment By: Leonid (leobru)
Date: 2004-05-03 17:47
Message:
Logged In: YES
user_id=790676
A 75 KB spam message has been observed that scored
exactly 0.50 because of that technique. The text/plain part
was an enormous list of space-separated random words that
happened to include enough "hammy" words in my database to
saturate the default 150 word cutoff before the "spammy"
ones would have started to prevail. Unless measures are
taken, the spammers will learn the trick quickly.
----------------------------------------------------------------------