[Spambayes] more on exceptions while parsing messages
Sean 'Shaleh' Perry
shalehperry at comcast.net
Sun Jul 13 04:03:14 EDT 2003
I received another mail that caused an exception in the spambayes engine.
This one had a 'registered mark', i.e. the R with a circle around it, in the
Subject: Claim your Free Sony® Headset
It is represented by the single byte 0xAE, which is not really ASCII, and it
causes the Python mail parser to complain because 0xAE is above 0x7F, the top
of the 7-bit ASCII range.
There is no header in the message which tries to change the charset. I wonder
if spambayes should respond to this specific exception (ascii out of range)
by assuming the message is in a charset where the value in question is valid.
I believe the Python library is correct in complaining about the value not
being within the message's charset and leaving it to the application to react.
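To make the idea concrete, here is a rough sketch of that kind of fallback in present-day Python. It is not spambayes code: the function name decode_header_bytes and the list of fallback charsets are made up for illustration, and treating the stray byte as ISO-8859-1 is only a guess (one that happens to turn 0xAE into the registered sign).

```python
def decode_header_bytes(raw, fallbacks=("utf-8", "latin-1")):
    """Decode raw header bytes, guessing a charset if they are not ASCII.

    Hypothetical helper for illustration only; the fallback order is an
    assumption, not something the message itself tells us.
    """
    try:
        # The normal case: a header with no declared charset is plain ASCII.
        return raw.decode("ascii")
    except UnicodeDecodeError:
        # A byte above 0x7F was found; react instead of giving up.
        pass

    for charset in fallbacks:
        try:
            return raw.decode(charset)
        except UnicodeDecodeError:
            continue

    # Last resort if no fallback fits: keep the ASCII part and mark the rest.
    return raw.decode("ascii", errors="replace")


subject = decode_header_bytes(b"Claim your Free Sony\xae Headset")
print(subject)
```

Here 0xAE is not valid UTF-8 on its own, so the sketch ends up using latin-1, which accepts any byte. That also means a guess like this will happily mis-decode Korean or other multi-byte spam, but for a classifier a wrong-but-consistent token may still be better than an exception.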
Mail like the one mentioned above is now the only spam I see. Variations
include Korean and other Asian-charset spam that fails to declare its charset
properly, as well as miscellaneous other uses of byte values of 128 and above.