[Spambayes] UnicodeDecodeError: ordinal not in range(128)

Peter Barker peterb at zeta.org.au
Thu Nov 2 23:44:25 CET 2006

I am using sb_bnfilter.py version 1.1a3 on Linux (actually the CVS version
from Aug 21), and I am getting an increasing number of spam messages that
contain 8-bit characters even though they are labelled as 7-bit. These cause
sb_bnfilter.py to fail with an error such as:
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xb0 in position 1: ordinal not in range(128)
and the message is not classified.
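
For reference, the same error can be reproduced outside SpamBayes. This is a
minimal sketch in Python 3, not the SpamBayes code path; the helper name
try_ascii and the sample bytes are made up for illustration.

```python
def try_ascii(raw):
    """Return the error text if `raw` is not pure 7-bit ASCII, else None."""
    try:
        raw.decode("ascii")
        return None
    except UnicodeDecodeError as exc:
        return str(exc)


# Hypothetical sample: byte 0xb0 at index 1, as in the reported error.
print(try_ascii(b"2\xb0C"))
```

Any byte of 0x80 or above triggers the error, since the ascii codec only
accepts values below 128.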

I set the option replace_nonascii_chars: True in .spambayesrc, and some 
characters no longer caused problems (as they were replaced with ?). 
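
The effect I saw from that option can be sketched as follows. This is only an
illustration of replacing 8-bit bytes with "?" before decoding, in Python 3;
it is not the SpamBayes implementation, and the function name
replace_nonascii is made up.

```python
import re


def replace_nonascii(raw):
    """Replace every byte outside the 7-bit range with '?', then decode."""
    cleaned = re.sub(rb"[\x80-\xff]", b"?", raw)
    return cleaned.decode("ascii")


# Hypothetical sample input containing the byte 0xb0.
print(replace_nonascii(b"2\xb0C"))
```

A replacement like this only helps on the text it is actually applied to, so
any part of the message that bypasses it can still raise the error.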

However, any 8-bit character in the Subject: header still caused the error,
as did the character 0xb0 somewhere in some messages. I ran hexdump on each
message to look for the reported 8-bit character. In every case I could find
it in the message body or headers, except for 0xb0, which did not appear
anywhere in the hexdump output.
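
One possible explanation, which is only a guess and not confirmed, is that
the byte exists only after decoding: an RFC 2047 encoded word in a header or
a base64 or quoted-printable body part would show up in hexdump as plain
ASCII. A rough Python 3 sketch using the standard email package to check for
this follows; the function name locate_byte and the sample message are made
up for illustration.

```python
import email
from email.header import decode_header


def locate_byte(raw_message, target):
    """List where byte `target` appears: raw text, decoded headers, decoded bodies."""
    needle = bytes([target])
    hits = []
    if needle in raw_message:
        hits.append("raw message")
    msg = email.message_from_bytes(raw_message)
    for name, value in msg.items():
        for chunk, _charset in decode_header(value):
            if isinstance(chunk, str):
                # Headers without encoded words come back as str.
                chunk = chunk.encode("ascii", "replace")
            if needle in chunk:
                hits.append("decoded header " + name)
    for part in msg.walk():
        # Returns None for multipart containers, bytes otherwise.
        payload = part.get_payload(decode=True)
        if payload and needle in payload:
            hits.append("decoded body " + part.get_content_type())
    return hits


# Hypothetical message: 0xb0 is present only as the ASCII text "=B0".
sample = b"Subject: =?iso-8859-1?q?2=B0C?=\r\n\r\nbody\r\n"
print(locate_byte(sample, 0xB0))
```

If the list does not include "raw message" but does include a decoded header
or body, that would be consistent with hexdump finding nothing.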

I have attached a message which reports the 0xb0 problem.
