[Spambayes] How to Display tokenized ham/spam scores?

Joerg Beyer job at webde-ag.de
Tue Aug 19 12:57:31 EDT 2003


Jake wrote:
> Hello there,
> 
> How can i display the actual ham/spam scoring for words/tokens
> ble)?    --- the ones that get written into the hammie.db for
> classification.

For the dbm version of the store you can do this:
open the dbm file and iterate over the keys (each key is a token).
For each key, extract the value, which is a pickled Python
object: in most cases a 2-tuple (ham and spam count for the key),
sometimes a 3-tuple, but I don't know yet why.

So you can extract the ham/spam count for each token (roughly,
a token is a word from a mail, plus special words such as how
many entries there were in the To: and Cc: fields of the header).
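
The steps above can be sketched roughly like this. It is only an
illustration, not code taken from Spambayes: it assumes a current
Python whose standard dbm module can open the file (a database written
by an older Python or another dbm backend may not open this way), and
it assumes every value is a pickled tuple as described above. The
function names are made up for the example.

```python
import dbm
import pickle


def iter_token_counts(path):
    """Yield (token, value) pairs from a dbm file with pickled values."""
    with dbm.open(path, "r") as db:
        for key in db.keys():
            # The key is the token. The value is assumed to be a pickled
            # tuple (usually 2 items, sometimes 3, as noted above).
            # Only unpickle files you trust.
            value = pickle.loads(db[key])
            yield key.decode("utf-8", "replace"), value


def print_token_counts(path):
    """Print one line per token with whatever tuple is stored for it."""
    for token, value in iter_token_counts(path):
        print(token, value)
```

Calling print_token_counts("hammie.db") would then list each token
next to its stored counts, provided the file really has that layout.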

> Am interested on how the algorithm works exactly.

Read the source; it is heavily annotated with comments that
explain why something is done.

	hope this helps
	Joerg



