[Spambayes] how spambayes handles image-only spams

Bill Yerazunis wsy at merl.com
Tue Sep 9 13:23:17 EDT 2003


   From: "Ryan Malayter" <rmalayter at bai.org>

   From: Bill Yerazunis [mailto:wsy at merl.com] 

   > Well, what happens in CRM114 is not that
   > the HTML causes confusion. It does get
   > factored in, but when you have a nearly
   > 1:1 ratio in the hits, it basically doesn't
   > make any difference to the end value.

   Bill, how does CRM114 handle a typical <IMG SRC=HTTP tag? Could you
   post a sample of the (multi-word) tokens generated?

   This might help me understand your approach and whether it is the
   right one to take with my test tweaks to the Spambayes tokenizer.

OK, give me a sample line with the actual data, and I'll give
you the tokens it will generate.

    -Bill Yerazunis
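
The quoted point about a nearly 1:1 hit ratio can be illustrated with a
small sketch. This is not CRM114's or Spambayes' actual scoring code: the
per-token estimate (spam hits over total hits) and the naive Bayes style
combination below are simplifying assumptions, and all hit counts are made
up for illustration. It only shows that features seen about equally often
in spam and ham score close to 0.5 and barely move the combined value.

```python
def token_spam_prob(spam_hits, ham_hits):
    """Illustrative per-token estimate: fraction of hits that came from spam."""
    total = spam_hits + ham_hits
    if total == 0:
        return 0.5  # never seen: treat as neutral
    return spam_hits / total


def combine(probs):
    """Naive Bayes style combination of independent per-token probabilities."""
    spam = 1.0
    ham = 1.0
    for p in probs:
        spam *= p
        ham *= 1.0 - p
    return spam / (spam + ham)


# Hypothetical counts: two tokens that lean toward spam.
evidence = [token_spam_prob(40, 10), token_spam_prob(30, 5)]

# Hypothetical HTML features seen almost equally in spam and ham (about 1:1).
html_noise = [token_spam_prob(101, 100)] * 3

without_html = combine(evidence)
with_html = combine(evidence + html_noise)

print(f"without HTML features: {without_html:.4f}")
print(f"with HTML features:    {with_html:.4f}")
```

Under these assumptions the HTML features are still factored in, but
because each one sits near 0.5 the end value changes only slightly.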


