[Spambayes] Installation error + Script error

Tony Meyer tameyer at ihug.co.nz
Tue Jan 31 21:13:56 CET 2006


>>>> eventually, i need to combine this python script, get out the  
>>>> number of ham and spam tokens, and pass this to a matlab code  
>>>> which is converted into C++. i heard that it's actually possible  
>>>> to "embed" C++ into python - have you done it before?
>>>
>>> Yes, but it would probably be much simpler to just call a Python  
>>> script and read the output.  Or if you're wanting to run C++ code  
>>> in Python, then to call a compiled C++ application and read the  
>>> output.
>
> meaning I'll do a python script to count the number of tokens and  
> give the output? The only option I saw in options.py is to give the  
> tokens used as the evidence. which one should i look into in order  
> to find out the number of spam and ham tokens that were actually used?

I meant you could use your own script.  Something like (untested):

---
import sys

from spambayes.storage import open_storage
from spambayes.tokenizer import tokenize

# Open existing token database.
db = open_storage("db.name", "pickle", "r") # or "dbm", or "zodb", etc

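# Read the message from stdin, and tokenize and score it.
message_text = sys.stdin.read()
all_tokens = tokenize(message_text)
# With the evidence flag set, spamprob() returns the spam probability
# (a float, not a ham/spam label) plus a list of (clue, prob) pairs.
probability, clues = db.spamprob(all_tokens, True)

# Drop the "*H*" and "*S*" summary entries (if present); they are
# overall ham/spam scores rather than tokens from the message.
clues = [(clue, prob) for (clue, prob) in clues
         if clue not in ("*H*", "*S*")]

# Separate out the clues into ham & spam.
ham_clues = [clue for (clue, prob) in clues if prob <= 0.5]
spam_clues = [clue for (clue, prob) in clues if prob > 0.5]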

# Print out the results to stdout: ham count first, then spam count.
print(len(ham_clues))
print(len(spam_clues))
---
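
The calling side then only has to run that script and read two numbers
back.  As a rough sketch of that half (also untested against a real
SpamBayes install), here it is in Python; a C++ program would do the same
thing with whatever process-spawning call it has available.  The names
parse_counts, count_clues and the file name "count_clues.py" are made up
for illustration, and the sketch assumes the script prints exactly two
integers, ham count first.

```python
import subprocess
import sys


def parse_counts(output):
    """Turn the script's output (ham count, then spam count) into ints."""
    fields = output.split()
    if len(fields) != 2:
        raise ValueError("expected two counts, got %r" % (output,))
    return int(fields[0]), int(fields[1])


def count_clues(command, message_text):
    """Run `command`, feed it the message on stdin, return (ham, spam)."""
    proc = subprocess.Popen(command,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    output, _ = proc.communicate(message_text)
    if proc.returncode != 0:
        raise RuntimeError("command exited with status %d" % proc.returncode)
    return parse_counts(output)


# Hypothetical usage, assuming the script above was saved as count_clues.py:
#   ham, spam = count_clues([sys.executable, "count_clues.py"], message_text)
```

Keeping the parsing separate from the process handling means the parsing
can be checked without a token database to hand.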

=Tony.Meyer

-- 
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.



