[Python-Dev] Getting started with GBayes testing

Brad Clements bkc@murkworks.com
Wed, 04 Sep 2002 18:39:01 -0400


I'm interested in contributing to GBayes.

I'm thinking of trying word stemming and adding other types of token indicators. How 
can I contribute?

Btw, I have been saving my spam for a year or so, and I now have about 31,238 spam 
messages. I classified these as spam from reading the subject, or by examining the 
body when in doubt. Probably 10% of the corpus is duplicates. Some of the messages 
contain viruses, likely Klez.
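
A minimal sketch of one way to weed out exact duplicates before testing, 
assuming each message is available as raw bytes. The helper names 
(body_digest, drop_duplicates) are hypothetical and not part of GBayes; the 
idea is to hash only the decoded body parts, so copies that differ only in 
their headers collapse to a single entry.

```python
import email
import hashlib


def body_digest(raw_bytes):
    """Return a SHA-256 hex digest of a message's decoded body parts.

    Headers are ignored and runs of whitespace are collapsed, so two
    copies of the same spam with different Received/Date headers hash
    to the same value. (Hypothetical helper, not part of GBayes.)
    """
    msg = email.message_from_bytes(raw_bytes)
    h = hashlib.sha256()
    for part in msg.walk():
        if part.is_multipart():
            continue  # container parts carry no payload of their own
        payload = part.get_payload(decode=True) or b""
        h.update(b" ".join(payload.split()))
        h.update(b"\0")  # separator between parts
    return h.hexdigest()


def drop_duplicates(raw_messages):
    """Return the messages in order, keeping the first copy of each body."""
    seen = set()
    unique = []
    for raw in raw_messages:
        digest = body_digest(raw)
        if digest not in seen:
            seen.add(digest)
            unique.append(raw)
    return unique
```

This only catches messages whose bodies are identical after whitespace 
normalization; spam that is personalized per recipient would need a fuzzier 
comparison.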

I'd like to replicate Tim's test rig so I can compare my results with existing ones. My 
spam isn't in mbox format, but I can convert it.
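
A rough sketch of such a conversion, assuming the spam is stored as one raw 
message per file in a single directory (the post does not say how it is 
actually stored). The function name dir_to_mbox is hypothetical; the mailbox 
and email modules are from the standard library of current Python versions.

```python
import email
import mailbox
from pathlib import Path


def dir_to_mbox(src_dir, mbox_path):
    """Append every regular file in src_dir to an mbox file.

    Assumes each file holds exactly one raw RFC 822 message.
    Returns the number of messages written. (Hypothetical helper.)
    """
    box = mailbox.mbox(mbox_path)  # created if it does not exist
    count = 0
    try:
        for path in sorted(Path(src_dir).iterdir()):
            if not path.is_file():
                continue  # skip subdirectories
            msg = email.message_from_bytes(path.read_bytes())
            box.add(msg)
            count += 1
    finally:
        box.close()  # flushes pending messages to disk
    return count
```

No file locking is done here, which should be fine for a one-off conversion 
as long as nothing else is writing to the target file at the same time.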

I'm particularly interested in how to allow HTML-only messages through (to reduce 
false positives). I'm getting a lot of personal mail in that format, unfortunately.

