[Spambayes] (no subject)
tameyer at ihug.co.nz
Fri May 26 00:58:17 CEST 2006
>>> The word lengths in Dutch are somewhere between those of
>>> English and German. Is this a "configurable"?
>> Not trivially, but it's not too hard either. Look toward the bottom of
>> spambayes/tokenizer.py, where there are a couple of comparisons of n to 3.
>> I can't quote you the correct chapter and verse because I'm using a
>> version of tokenizer.py modified in just that region and SourceForge
>> appears to be on the blink at the moment. It should be fairly easy to
>> understand.
> OK, I'll unleash my vi-fu and give it a try.
Please let us know if it does appear to help. It would be trivial to
make it an option (the opposite end - skip_max_word_size - already
is) if that would be something that helps users for whom English
isn't their main email language.
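
For anyone wanting to experiment before such an option exists, here is a
rough sketch of the idea. It is not the actual tokenizer.py code: the
min_size parameter is a hypothetical name, the default of 12 for max_size
is an assumption, and the "skip:" summary token only approximates what the
real tokenizer emits for over-long words.

```python
def tokenize_word(word, min_size=3, max_size=12):
    """Yield tokens for a single word, gated on its length.

    min_size is a hypothetical counterpart to skip_max_word_size;
    both defaults here are assumptions for illustration only.
    """
    n = len(word)
    if min_size <= n <= max_size:
        # Word length is in the accepted range: keep it as a token.
        yield word
    elif n > max_size:
        # Over-long word: emit a summary token built from the first
        # character and the length rounded down to a multiple of 10.
        yield "skip:%s %d" % (word[0], n // 10 * 10)
    # Words shorter than min_size produce no token at all.


def tokenize(text, min_size=3, max_size=12):
    """Split text on whitespace and tokenize each word."""
    for word in text.split():
        for token in tokenize_word(word, min_size, max_size):
            yield token
```

With the default lower bound, list(tokenize("de kat zit")) drops "de";
calling tokenize("de kat zit", min_size=2) keeps it, which is the sort of
change that might matter for languages with many short, meaningful words.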
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.