[Spambayes] Help Im Lost!

Meyer, Tony T.A.Meyer at massey.ac.nz
Sat Aug 16 20:59:28 EDT 2003


> I have a question about tokenizer.py.
>
> If the email contains images (binary blocks):
>
> 1) How does the tokenizer handle them?
> 2) Does the tokenizer tokenize the binary block?
> 3) What will the tokenizer do when it encounters a binary block?
> 4) How can it determine that something is a binary block?
>
> I appreciate it very much!

The relevant bits in tokenizer.py are those that deal with "octet
parts".  (If you search for "binary" in tokenizer.py, you'll find
the first such section.)

From the comments: "there's no point decoding binary blobs (like
images)".  You should also read the comments at the start of the
tokenize_body() function.  Basically, only the first few characters of
the octet stream are turned into tokens; the number of characters used
is controlled by an option.
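
As a rough illustration of the idea (this is not the actual SpamBayes
code), the sketch below walks a message with the standard library's
email package, picks out parts declared as application/octet-stream,
and emits one token built from the first few bytes of each.  The names
OCTET_PREFIX_SIZE, octet_parts(), tokenize_octets() and demo_message(),
the value 5, and the "octet:" token prefix are assumptions made up for
this example; check tokenizer.py and the options for the real names
and defaults.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical stand-in for the tokenizer option mentioned above;
# neither the name nor the value is taken from SpamBayes itself.
OCTET_PREFIX_SIZE = 5


def octet_parts(msg):
    """Yield the MIME parts declared as application/octet-stream."""
    for part in msg.walk():
        if part.get_content_type() == "application/octet-stream":
            yield part


def tokenize_octets(msg, prefix_size=OCTET_PREFIX_SIZE):
    """Yield one token per octet part, built from its first few bytes."""
    for part in octet_parts(msg):
        # decode=True undoes base64 / quoted-printable transfer encoding.
        payload = part.get_payload(decode=True)
        if payload is None:
            yield "control: octet payload is None"
            continue
        # Only the prefix is used; the rest of the blob is never examined.
        # latin-1 maps every byte to exactly one character.
        yield "octet:" + payload[:prefix_size].decode("latin-1")


def demo_message():
    """Build a small two-part message: plain text plus a binary blob."""
    msg = MIMEMultipart()
    msg.attach(MIMEText("hello"))
    msg.attach(MIMEApplication(b"GIF89a\x00\x01\x02"))
    return msg
```

With the demo message, list(tokenize_octets(demo_message())) produces a
single token from the binary attachment and nothing from the text part,
which is handled elsewhere in a real tokenizer.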

Googling for "site:mail.python.org tokenize image spambayes" will bring
up some relevant messages about this from the archives, too.

=Tony Meyer


