[Spambayes] unknown encoding us-ascii on Message.asTokens(), but

Brad Clements bkc at murkworks.com
Thu Jul 31 18:22:51 EDT 2003


I use the asTokens() method of a message to get a token list, and that sometimes 
fails with:

Traceback (most recent call last):
  File "mkwsbayes\interface.pyc", line 87, in sb_train
  File "mkwsbayes\UserContext.pyc", line 98, in train
  File "spambayes\message.pyc", line 187, in asTokens
  File "spambayes\message.pyc", line 199, in as_string
  File "email\Message.pyc", line 113, in as_string
  File "email\Generator.pyc", line 103, in flatten
  File "email\Generator.pyc", line 138, in _write
  File "email\Generator.pyc", line 172, in _write_headers
  File "email\Generator.pyc", line 44, in _is8bitstring
LookupError: unknown encoding: us-ascii

(Note: I'm running this from a py2exe .zip, so maybe I haven't included encodings or 
something?)
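
One quick way to test that guess is to ask the codecs module directly from inside
the frozen build. This is only a sketch: codec_available is a made-up helper name,
and it assumes the LookupError comes from the codec registry failing to find the
encodings package.

```python
import codecs

def codec_available(name):
    """Return True if the codec registry can resolve `name`."""
    try:
        codecs.lookup(name)
    except LookupError:
        # Same exception type as in the traceback above.
        return False
    return True

# Run this both under the normal interpreter and inside the frozen build.
print(codec_available("us-ascii"))
```

If this prints False inside the py2exe build but True under the normal interpreter,
the encodings package is probably missing from the .zip and would need to be
included explicitly in the build.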

However, if I don't use the asTokens() method on a message and instead use the 
hammie.score function, it works fine on the same message that asTokens() fails on.

Why is there a difference between using asTokens() and not using tokens?

Since I may need to unlearn this message, I'd prefer to tokenize only once. That's why I 
use asTokens. 
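
To illustrate what I mean by tokenizing once, here is a minimal sketch. It does not
use the Spambayes API: tokenize and ToyClassifier are hypothetical stand-ins that
only count tokens. The point is that the token list is built once, stored, and then
reused for both training and unlearning.

```python
def tokenize(text):
    # Hypothetical tokenizer. It is a generator, so it can be consumed only once.
    for word in text.split():
        yield word.lower()

class ToyClassifier(object):
    """Hypothetical stand-in for a real classifier; it only counts tokens."""

    def __init__(self):
        self.counts = {}

    def learn(self, tokens):
        for t in tokens:
            self.counts[t] = self.counts.get(t, 0) + 1

    def unlearn(self, tokens):
        for t in tokens:
            remaining = self.counts.get(t, 0) - 1
            if remaining > 0:
                self.counts[t] = remaining
            else:
                # Drop tokens whose count has fallen to zero.
                self.counts.pop(t, None)

# Tokenize once and keep the result as a list so it can be reused later.
tokens = list(tokenize("Buy cheap pills buy now"))

clf = ToyClassifier()
clf.learn(tokens)
counts_after_learn = dict(clf.counts)

# Later, unlearn with the stored tokens instead of tokenizing the message again.
clf.unlearn(tokens)
counts_after_unlearn = dict(clf.counts)
```

Storing the list rather than the generator matters here, because a generator would
already be exhausted by the time the message needs to be unlearned.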

Any ideas on how to fix this?



-- 
Brad Clements,                bkc at murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
http://www.wecanstopspam.org/                   AOL-IM: BKClements



