[spambayes-bugs] [ spambayes-Feature Requests-817813 ] Consider bad spelling a sign of spam

SourceForge.net noreply at sourceforge.net
Mon Oct 13 17:43:01 EDT 2003


Feature Requests item #817813, was opened at 2003-10-05 07:39
Message generated for change (Comment added) made by anadelonbrin
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=498106&aid=817813&group_id=61702

Category: None
Group: None
>Status: Closed
Priority: 5
Submitted By: Mark Levison (mlevison)
Assigned to: Nobody/Anonymous (nobody)
Summary: Consider bad spelling a sign of spam

Initial Comment:
Add a spelling checker and a reasonably sized dictionary.
If more than xx% of the message is misspelled (esp.
the subject), consider it to be spam.  Many emails have
gotten past SpamBayes recently because their spelling
is like "bfuqclvfphz".

Also consider adding the words from any message not
marked as spam to the dictionary; that way it would
quickly learn proper names.
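
A rough sketch of the idea requested above, in Python. This is only an
illustration, not SpamBayes code: the function names, the simple
letters-only word splitting, and the threshold value used in the usage
lines are all assumptions (the request leaves the percentage as "xx").

```python
import re

WORD_RE = re.compile(r"[a-z]+")  # assumed tokenization: runs of ASCII letters


def misspelled_fraction(text, dictionary):
    """Return the fraction of words in text that are not in dictionary."""
    words = WORD_RE.findall(text.lower())
    if not words:
        return 0.0
    unknown = [w for w in words if w not in dictionary]
    return len(unknown) / len(words)


def looks_misspelled(text, dictionary, threshold):
    """True if more than `threshold` (0..1) of the words are unknown."""
    return misspelled_fraction(text, dictionary) > threshold


def learn_from_ham(text, dictionary):
    """Add every word of a non-spam message to the dictionary."""
    dictionary.update(WORD_RE.findall(text.lower()))


# Usage, with a toy dictionary and an arbitrary 50% threshold.
dictionary = {"meeting", "at", "noon"}
print(looks_misspelled("bfuqclvfphz qzxv", dictionary, 0.5))
learn_from_ham("Ask Levison about it", dictionary)
print("levison" in dictionary)
```

The second function covers the "learn proper names" part: once a name
has appeared in a message not marked as spam, it stops counting as a
misspelling.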

----------------------------------------------------------------------

>Comment By: Tony Meyer (anadelonbrin)
Date: 2003-10-14 10:43

Message:
Logged In: YES 
user_id=552329

Added both of these to the NEWTRICKS.txt file.

Note that a 'word' like 'bfuqclvfphz' is highly unlikely to be 
used in the calculation of a message score.  Any tokens not 
seen before get a score of 0.5, and any tokens that score 
between 0.4 and 0.6 don't get used for the calculation.  So 
unless you'd seen that word before, it wouldn't be used.  If 
you *had* seen it, it would surely have been in spam, so 
using it in the score would be a good thing.
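
The filtering described above can be sketched roughly as follows.  This 
is an illustration only, not the actual SpamBayes source: the names 
are made up for the example, and whether scores of exactly 0.4 or 0.6 
are kept is an assumption.

```python
UNKNOWN_PROB = 0.5        # score for a token never seen in training
IGNORE_LOW = 0.4          # scores strictly between these two bounds
IGNORE_HIGH = 0.6         # are left out of the calculation


def usable_clues(tokens, known_probs):
    """Return sorted (token, score) pairs that would influence the result.

    known_probs is a hypothetical mapping of trained tokens to scores.
    """
    clues = []
    for token in set(tokens):
        prob = known_probs.get(token, UNKNOWN_PROB)
        if not (IGNORE_LOW < prob < IGNORE_HIGH):
            clues.append((token, prob))
    return sorted(clues)


# Usage with invented scores: the unseen token and the bland one drop out.
known = {"viagra": 0.99, "meeting": 0.02, "the": 0.55}
print(usable_clues(["bfuqclvfphz", "viagra", "the", "meeting"], known))
# prints [('meeting', 0.02), ('viagra', 0.99)]
```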

----------------------------------------------------------------------

Comment By: Jean-Marc Valin (jmvalin)
Date: 2003-10-10 20:12

Message:
Logged In: YES 
user_id=1494

Actually, this could probably be achieved simply by
assigning a spam probability to words that aren't in the
database at all (assuming there's enough training data that
all real words are in the database).
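
As a rough illustration of this suggestion (not SpamBayes code; the
names, the 0.4 to 0.6 cutoff handling, and the 0.8 value are
assumptions made for the example), giving unknown words a spammy
probability is what would let them count towards the score:

```python
def token_prob(token, database, unknown_prob):
    """Look up a token's score, falling back to unknown_prob."""
    return database.get(token, unknown_prob)


def counted_tokens(tokens, database, unknown_prob, low=0.4, high=0.6):
    """Return the tokens whose score lies outside the ignored middle band."""
    return [t for t in tokens
            if not (low < token_prob(t, database, unknown_prob) < high)]


database = {"meeting": 0.02}                 # hypothetical trained scores
message = ["meeting", "bfuqclvfphz"]

print(counted_tokens(message, database, unknown_prob=0.5))
# prints ['meeting']
print(counted_tokens(message, database, unknown_prob=0.8))
# prints ['meeting', 'bfuqclvfphz']
```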

----------------------------------------------------------------------
