[spambayes-bugs] [ spambayes-Feature Requests-817813 ] Consider bad
spelling a sign of spam
SourceForge.net
noreply at sourceforge.net
Mon Oct 13 17:43:01 EDT 2003
Feature Requests item #817813, was opened at 2003-10-05 07:39
Message generated for change (Comment added) made by anadelonbrin
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=498106&aid=817813&group_id=61702
Category: None
Group: None
>Status: Closed
Priority: 5
Submitted By: Mark Levison (mlevison)
Assigned to: Nobody/Anonymous (nobody)
Summary: Consider bad spelling a sign of spam
Initial Comment:
Add a spelling checker and a reasonably sized dictionary.
If more than xx% of the message is misspelled (especially
the subject), consider it to be spam. Many emails have
gotten past SpamBayes recently because their spelling
is like "bfuqclvfphz".
Also consider adding the words from any message not
marked as spam to the dictionary; that way it would
quickly learn proper names.
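
A minimal sketch of the idea in this request, not anything that exists in SpamBayes: it measures the fraction of words missing from a dictionary and lets the dictionary grow from messages not marked as spam. The function names, the word-splitting regex, and the sample dictionary are all assumptions made for illustration. The threshold is left to the caller because the request does not give one.

```python
import re

# Hypothetical tokenizer: lowercase runs of letters and apostrophes.
WORD_RE = re.compile(r"[a-z']+")


def misspelled_fraction(text, dictionary):
    """Return the fraction of words in text that are not in dictionary."""
    words = WORD_RE.findall(text.lower())
    if not words:
        return 0.0
    unknown = sum(1 for word in words if word not in dictionary)
    return unknown / len(words)


def learn_words(text, dictionary):
    """Add every word of a message not marked as spam to the dictionary."""
    dictionary.update(WORD_RE.findall(text.lower()))


# Illustrative data only.
dictionary = {"cheap", "offer", "for", "you"}
subject = "bfuqclvfphz cheap offer"
fraction = misspelled_fraction(subject, dictionary)  # 1 of 3 words unknown

# A proper name is unknown at first, then known once learned.
learn_words("Regards, Levison", dictionary)
name_fraction = misspelled_fraction("Levison", dictionary)
```

A caller would compare the returned fraction against whatever cutoff it chooses, possibly a stricter one for the subject line than for the body.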
----------------------------------------------------------------------
>Comment By: Tony Meyer (anadelonbrin)
Date: 2003-10-14 10:43
Message:
Logged In: YES
user_id=552329
Added both of these to the NEWTRICKS.txt file.
Note that a 'word' like 'bfuqclvfphz' is highly unlikely to be
used in the calculation of a message score. Any token not
seen before gets a score of 0.5, and any token that scores
between 0.4 and 0.6 is not used in the calculation. So
unless that word had been seen in training, it would be
ignored. If it *had* been seen, surely it would have been in
spam, and so using it would be a good thing?
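
The filtering rule described above can be sketched roughly as follows. This is an illustration of the rule as stated in this comment, not SpamBayes's actual code: the names UNKNOWN_WORD_PROB, MINIMUM_PROB_STRENGTH, and usable_clues, and the sample probabilities, are assumptions.

```python
# Assumed constants mirroring the numbers quoted in the comment.
UNKNOWN_WORD_PROB = 0.5      # score given to a token never seen in training
MINIMUM_PROB_STRENGTH = 0.1  # scores closer than this to 0.5 are ignored


def usable_clues(tokens, spamprob_db):
    """Return sorted (token, prob) pairs strong enough to affect the score."""
    clues = []
    for token in set(tokens):
        prob = spamprob_db.get(token, UNKNOWN_WORD_PROB)
        # Keep only tokens whose score lies outside the neutral band.
        if abs(prob - 0.5) >= MINIMUM_PROB_STRENGTH:
            clues.append((token, prob))
    return sorted(clues)


# Illustrative training data only.
spamprob_db = {"viagra": 0.99, "meeting": 0.05, "the": 0.52}
clues = usable_clues(["bfuqclvfphz", "viagra", "the", "meeting"], spamprob_db)
```

Here 'bfuqclvfphz' falls back to 0.5 and 'the' sits at 0.52, so neither reaches the scoring step; only 'meeting' and 'viagra' do.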
----------------------------------------------------------------------
Comment By: Jean-Marc Valin (jmvalin)
Date: 2003-10-10 20:12
Message:
Logged In: YES
user_id=1494
Actually, this could probably be achieved simply by
assigning a spam-leaning probability (something above 0.5) to
words that aren't in the database at all (assuming there's
enough training data that all real words are in the database).
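
As a rough sketch of this suggestion, assuming a classifier that ignores tokens scoring within 0.1 of 0.5: raising the score given to unseen tokens makes them count as spam evidence. The function name, the min_strength parameter, and the value 0.8 are hypothetical and chosen only for illustration.

```python
def strong_clues(tokens, spamprob_db, unknown_prob=0.5, min_strength=0.1):
    """Map each token to its score, keeping only scores far enough from 0.5."""
    clues = {}
    for token in tokens:
        # Unseen tokens get unknown_prob instead of a trained score.
        prob = spamprob_db.get(token, unknown_prob)
        if abs(prob - 0.5) >= min_strength:
            clues[token] = prob
    return clues


# Illustrative training data only.
spamprob_db = {"meeting": 0.05}
tokens = ["meeting", "bfuqclvfphz"]

neutral = strong_clues(tokens, spamprob_db)                   # unseen token dropped
spammy = strong_clues(tokens, spamprob_db, unknown_prob=0.8)  # unseen token kept
```

The trade-off is the one noted in the comment: every legitimate word missing from the database would also be pushed toward spam, so this only helps when training coverage is good.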
----------------------------------------------------------------------