[Spambayes] Looking for code to modify spambayes for 5-gram tokenization
Tony Meyer
tameyer at ihug.co.nz
Mon Jun 13 05:14:05 CEST 2005
> I'm looking to modify spambayes to use 5-grams rather than
> split-on-whitespace. We have a few Asian customers and the default
> spambayes setup has not been very effective for them. So, we want to
> test with 5-grams and see if we can improve the effectiveness.
You might want to look at (if you haven't already):
[ 824651 ] Multibyte (CJK etc.) message support
<https://sourceforge.net/tracker/?func=detail&atid=498105&aid=824651&group_id=61702>
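
For a quick experiment outside SpamBayes, character 5-grams can be sketched as below. This is only an illustration of the idea, not the patch's code or the SpamBayes tokenizer API: char_ngrams is a hypothetical helper, it assumes Python 3, and it assumes the message text has already been decoded to str (slicing raw bytes would cut multibyte CJK characters in half).

```python
def char_ngrams(text, n=5):
    """Yield overlapping character n-grams from already-decoded text.

    Hypothetical helper for illustration; not part of SpamBayes.
    """
    # Collapse runs of whitespace so line wrapping does not create
    # distinct tokens for otherwise identical text.
    text = " ".join(text.split())
    if len(text) <= n:
        # Input no longer than n: emit it whole (or nothing if empty).
        if text:
            yield text
        return
    # Slide a window of n characters across the text, one character at a time.
    for i in range(len(text) - n + 1):
        yield text[i:i + n]
```

Something like list(char_ngrams(body)) could then be compared against body.split() on the same corpus to see whether the n-gram tokens separate ham from spam any better for unsegmented text.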
=Tony.Meyer
--
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.