[Spambayes] help

Meyer, Tony T.A.Meyer at massey.ac.nz
Mon Aug 11 17:25:21 EDT 2003


> What i needed is the flow of execution, from the tokenizer.py,
> inside it, what it does, to the headers, the subject and the
> message body.

AFAIK, the only document that has this information is tokenizer.py
itself.  Usually tokenizing will be kicked off with the tokenize()
function, so just start there and see that it calls tokenize_headers()
and tokenize_body(), and so on.  There's a wealth of information in
tokenizer.py; it's definitely worth reading if you are interested in
this.
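
As a rough picture of that call chain, here is a minimal sketch.  It is
not the Spambayes code: the function names mirror the ones mentioned
above, but the bodies and the token formats (such as the "subject:"
prefix) are simplified assumptions, and the real tokenizer.py does far
more.  It is written for current Python 3 ("yield from" needs 3.3 or
later).

```python
import email


def tokenize_headers(msg):
    # Assumption for illustration: only the Subject header is examined,
    # and each word becomes one prefixed token.
    subject = msg.get("Subject", "")
    for word in subject.lower().split():
        yield "subject:" + word


def tokenize_body(msg):
    # Assumption for illustration: a plain-text, single-part body split
    # on whitespace.
    payload = msg.get_payload()
    if isinstance(payload, str):
        for word in payload.lower().split():
            yield word


def tokenize(text):
    # Entry point: parse the message, then hand off to the header and
    # body helpers in turn, passing their tokens straight through.
    msg = email.message_from_string(text)
    yield from tokenize_headers(msg)
    yield from tokenize_body(msg)


raw = "Subject: Cheap Pills\n\nBuy cheap pills now\n"
tokens = list(tokenize(raw))
```

Nothing is tokenized until something iterates over the result, which is
why list() is called on it here.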

> I want to know where would the output go after 
> the tokenizer has done it's job.

Well, the token generator is returned to whatever function calls
tokenize().  The tokens might then be pasted into a message (like
Outlook's "Show Clues"), used to train the classifier, or put to any
number of other uses.
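
To make that concrete, here is a small sketch of a caller consuming a
token generator in two ways.  toy_tokenize() is a hypothetical stand-in
for tokenize(), and the Counter is only a placeholder for whatever the
real classifier does when it trains.

```python
from collections import Counter


def toy_tokenize(text):
    # Hypothetical stand-in for tokenize(): yields lowercase words.
    for word in text.lower().split():
        yield word


# Use 1: turn the tokens into text for display.
clues = ", ".join(toy_tokenize("Buy cheap pills"))

# Use 2: feed the tokens to something that counts them, as a very
# rough analogue of training.
spam_counts = Counter()
spam_counts.update(toy_tokenize("Buy cheap pills"))
spam_counts.update(toy_tokenize("cheap cheap offer"))

# A generator can only be walked once; a second pass yields nothing.
gen = toy_tokenize("a b")
first_pass = list(gen)
second_pass = list(gen)
```

The last three lines are the thing to watch for: if the tokens are
needed twice, call the tokenizer again or keep a list of them.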

> And i want to know which part
> is the writing of file of the tokenize strings, is it in the
> tokenizer or the classifier.

Do you mean to ask where the code that handles storing the database is?
It's in storage.py, which has a number of different classes that
subclass Classifier, providing persistence across sessions.
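
The general pattern (a base classifier that only knows about counts,
and a subclass that adds loading and saving) can be sketched as below.
ToyClassifier and PickledToyClassifier are hypothetical names made up
for this example, and the use of pickle is an assumption; see
storage.py for what the real classes do.

```python
import os
import pickle
import tempfile


class ToyClassifier:
    # Hypothetical base class: keeps token counts in memory only.
    def __init__(self):
        self.spam_counts = {}

    def learn(self, tokens):
        for tok in tokens:
            self.spam_counts[tok] = self.spam_counts.get(tok, 0) + 1


class PickledToyClassifier(ToyClassifier):
    # Hypothetical subclass: same behaviour, plus persistence to a file.
    def __init__(self, path):
        super().__init__()
        self.path = path
        if os.path.exists(path):
            with open(path, "rb") as f:
                self.spam_counts = pickle.load(f)

    def store(self):
        with open(self.path, "wb") as f:
            pickle.dump(self.spam_counts, f)


path = os.path.join(tempfile.mkdtemp(), "toy.pck")

first = PickledToyClassifier(path)
first.learn(["cheap", "pills", "cheap"])
first.store()

# A new object, as in a later session, picks up the saved counts.
second = PickledToyClassifier(path)
restored = second.spam_counts
```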

> and what is the command (yield).

As Skip suggested to Rociel, the tutorial is a good place to start:
<http://python.org/doc/2.3/tut/node11.html#SECTION00111000000000000000000>
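
In short, yield is a statement rather than a command: a function that
contains it becomes a generator, which hands back one value at a time
and pauses in between.  A minimal sketch, with countdown() being a
made-up example and not anything from Spambayes:

```python
def countdown(n):
    # Each yield hands one value to the caller and pauses here until
    # the caller asks for the next one.
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)   # no code in countdown() has run yet
first = next(gen)    # runs up to the first yield
rest = list(gen)     # resumes after that yield and runs to the end
```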

=Tony Meyer


