[Spambayes] full o' spaces

bill parducci bill at parducci.net
Fri Mar 7 11:21:06 EST 2003


welcome to MEME mail! :o) You

i have been working on some ideas on how to attack this off and on for the last few months, but it is very difficult because [the]{mind}(is)quite|g00d`at~separating+the\message'fr0m_the^TEXT. it is this work that prompted my initial query into what is being done with tokenization on this list.
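To make the obfuscated sentence above concrete, here is a minimal Python sketch of one way a tokenizer might recover words from it. It is not the VB code mentioned in this message and not the Spambayes tokenizer; the function name `rough_tokens` and the digit-for-letter table `LEET` are made up for illustration. It assumes that any run of non-alphanumeric characters is a word separator and that digits inside a mixed letter/digit token stand in for letters.

```python
import re

# Hypothetical table of digits commonly swapped in for letters (an assumption).
LEET = str.maketrans({"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t"})


def rough_tokens(text):
    """Split on runs of non-alphanumeric characters and undo digit swaps."""
    tokens = []
    for word in re.split(r"[^A-Za-z0-9]+", text):
        if not word:
            continue  # skip empty strings produced at the edges of the split
        word = word.lower()
        has_letter = any(c.isalpha() for c in word)
        has_digit = any(c.isdigit() for c in word)
        # Only translate mixed tokens, so plain numbers like "2003" survive.
        if has_letter and has_digit:
            word = word.translate(LEET)
        tokens.append(word)
    return tokens


sample = r"[the]{mind}(is)quite|g00d`at~separating+the\message'fr0m_the^TEXT"
print(rough_tokens(sample))
```

A sketch like this throws away the separators themselves, which may be useful spam clues in their own right, so a real tokenizer would probably want to keep both the raw and the cleaned tokens.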

if it would help, i can send/post a few sample messages that i have been using to test my work. i have also come up with a crude mechanism for trying to work around it. it hasn't been tested and needs a lot of work (it is written in <blush/> vb). anyway, if anyone is interested i can show what i have come up with so far.

b

Skip Montanaro wrote:
> I just received a message (attached) in which every word in the body was
> space-separated.  There were thus no clues at all in the body and the clues
> in the header weren't enough to pull it out of the unsure classification.
> I'm working on a tokenizer patch.
> 
> Skip
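
For the case Skip describes, a rough Python sketch of a preprocessing step follows. It assumes "space-separated" means the letters of each word were spread out with single spaces (for example "f r e e"); the attached message is not shown here, so that is a guess. The names `SPACED_RUN` and `collapse_spaced_letters` are hypothetical, and this is not the tokenizer patch mentioned above.

```python
import re

# Three or more single letters, each separated by exactly one space or tab.
SPACED_RUN = re.compile(r"\b(?:[A-Za-z][ \t]){2,}[A-Za-z]\b")


def collapse_spaced_letters(text):
    """Join runs of spaced-out single letters back into one word."""
    return SPACED_RUN.sub(lambda m: re.sub(r"[ \t]", "", m.group(0)), text)


print(collapse_spaced_letters("f r e e  m o n e y now"))
```

The three-letter minimum is an arbitrary threshold to limit false positives; short legitimate runs such as "I a m" would still be merged, so the collapsed text is probably best tokenized in addition to the original rather than instead of it.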



