[spambayes-dev] Design Doc for Tokenizer

dk7x at berkeley.edu
Mon Oct 30 01:08:42 CET 2006


Hey all,

I'm working on a research project at UC Berkeley. We would like to
understand how the tokenizer interacts with the other modules. Is there
any documentation available that describes the different modules? We are
particularly interested in how an email is represented after it has been
tokenized and is passed to the learner and classifier. In addition, we
would like to isolate the tokenizer from the rest of the system. Any help
would be appreciated. Thanks in advance for your response.

Kai Xia


