[Spambayes] was no subject (where can find documentation)
mpas1342 at yahoo.de
Mon Jun 12 09:31:28 CEST 2006
From: Tim Peters [mailto:tim.peters at gmail.com]
Sent: Monday, 12 June 2006 08:21
Cc: spambayes at python.org
Subject: Re: [Spambayes] was no subject (where can find documentation)
[mpas1342 at yahoo.de]
> I'm not a programmer and I have no experience with Python
> (a little with Java).
> What should I do in this case, go through the whole code only to
> learn the technique used to create a token?
Yes. Tokenization is an algorithm, and you simply can't understand
the details without reading some code. The (very) short course is
that SpamBayes tokenizes by splitting on whitespace, and ignoring case
distinctions. Most of the time, but not all of the time.
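As a rough illustration of the short course above, here is a minimal sketch in Python. It is not the code in tokenizer.py: the function name `simple_tokenize` is made up for this example, and it covers only the "split on whitespace, ignore case" part, not the exceptions.

```python
def simple_tokenize(text):
    """Hypothetical helper: lowercase the text, then split on runs of whitespace."""
    return text.lower().split()


tokens = simple_tokenize("Sun is  shining.")
print(tokens)  # ['sun', 'is', 'shining.']
```

Note that in this sketch punctuation stays attached to the word ("shining." keeps its period), because only whitespace separates tokens.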
> I don't even know in which file I will find information about the
tokenizer.py contains all the tokenization code.
> Apart from that, I'm not sure I can understand it from the code alone;
> I think it is better for people like me to see some text and
> read the code at the same time.
> I would be happy if you have such
> documentation and could send it to me :)
There is no such documentation, although as Tim Stone said:
Even if you're not a programmer, the comments are quite readable.
So try that. Feel free to ask questions if you get stuck. That _has_
to work better than continuing to ask for something that doesn't exist.
OK, I will take a look at it later. But I have a question regarding whitespace.
Consider this example:
I get an email with only this paragraph in the body:
Sun is shining.
If you say that, because of the whitespace, there are only:
to be checked, then I will ask what happens with the substrings in "sun" and "shining"
and all combinations for "shining" like
Because the spam email could contain spam words in this paragraph, like this:
sunBuy is shinigViagra
I hope the example is understandable :-)
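To make the question concrete, here is a small Python sketch that assumes a plain "split on whitespace and ignore case" tokenizer. The name `whitespace_tokens` is made up for this example and this is not the real SpamBayes code, which by the reply quoted above does not always behave this way. Under that assumption each glued word comes out as one token, and no substrings are looked at.

```python
def whitespace_tokens(text):
    """Hypothetical helper: the set of lowercased, whitespace-separated tokens."""
    return set(text.lower().split())


clean = whitespace_tokens("Sun is shining.")
glued = whitespace_tokens("sunBuy is shinigViagra")

print(sorted(clean))          # ['is', 'shining.', 'sun']
print(sorted(glued))          # ['is', 'shinigviagra', 'sunbuy']
print(sorted(clean & glued))  # ['is']
```

So with this simple splitting, "sunbuy" and "shinigviagra" are tokens of their own, different from "sun", "buy" or "viagra".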