[Spambayes] A couple of small tokenizer experiments.
Tue Nov 12 02:06:05 2002
>> In personal email "subjectcharset:unknown" shows up a lot for some
>> reason (but only in spam).
> Hm. Dunno about that - Barry might know under what circumstances
> the email package gives 'unknown' as a charset. I can't see how that
> could happen.
Easy <wink>: it's my personal email, and the string UNKNOWN is what
*Outlook* delivers. I think it actually says UNKNOWN as it came in off the
I get my share of
thingies but I also get monsters like these:
That one came in to firstname.lastname@example.org on Friday. Perhaps they've learned
that Greg will reject a msg just for using an unloved charset, but I doubt
In fact, I see that 'subjectcharset:unknown' is now the single strongest
spam word in my entire mistake-driven (and tiny) training corpus:
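
For anyone wondering how a token like that can arise at all, here is a
minimal sketch. It is not the Spambayes tokenizer's actual code: it only
assumes the standard library's email.header.decode_header, and the helper
name subject_charset_tokens and the sample Subject value are made up for
illustration. The point is that the charset label inside an RFC 2047
encoded word is passed through as-is, so a mailer that writes UNKNOWN there
yields an 'unknown' charset.

```python
from email.header import decode_header


def subject_charset_tokens(subject):
    """Yield a 'subjectcharset:<name>' token for each decoded chunk of the
    Subject header that carries a charset label (hypothetical helper)."""
    for _chunk, charset in decode_header(subject):
        # Chunks that were not RFC 2047 encoded have charset None; skip them.
        if charset is not None:
            yield "subjectcharset:" + charset.lower()


# Made-up header value: the charset label is whatever the sender wrote.
tokens = list(subject_charset_tokens("=?UNKNOWN?Q?Hello_there?="))
print(tokens)
```

A plain, unencoded Subject line produces no such token, since decode_header
reports no charset for it.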