[Mailman-Users] UnicodeDecodeError during Archive Obscuring

Mon Nov 26 00:21:00 CET 2007

Mark Sapiro writes:
 > Tokio Kikuchi wrote:
 > >
 > >>   File "/usr/local/mailman/Mailman/Archiver/HyperArch.py", line 579, in
 > >> as_text
 > >>     atmark = unicode(_(' at '), Utils.GetCharSet(self._lang))
 > >> UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 1:
 > >> ordinal not in range(128)
 > <snip>
 > >
 > >It is not ' at ' itself but its translation which caused this error.
 > >It's strange though the language set immediately before should work for
 > >its unicode conversion.
 > 
 > 
 > It's even stranger than that. The codec is 'ascii'. The only language
 > with a charset of 'ascii' is 'en' and if the language is 'en', where
 > does the '\xd0' come from?

Almost certainly, the language is not 'en'; the language is 'unknown'.
The last time there was a spate of these problems, I took a quick look
at the code.  It appears to me that the email module takes the MIME
spec seriously, and applies the defaults to that case, i.e., language =
'en' and charset = 'us-ascii'.  IOW, it tests that headers are ASCII
by decoding them as ASCII.  Boom!  Since there's no try block specific
to that attempt, the exception ends up in the default catchall try.
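
To make the failure mode concrete, here is a minimal sketch of what
decoding non-ASCII bytes with the 'us-ascii' default looks like.  It is
not Mailman's code: the sample string and the decode_or_report helper
are made up for illustration, and it uses Python 3's bytes.decode where
the traceback above shows the older unicode(s, charset) call.

```python
# Hypothetical sample: a Cyrillic letter whose UTF-8 encoding starts
# with byte 0xd0.  This is NOT taken from any Mailman message catalog.
raw = " \u0432 ".encode("utf-8")  # b' \xd0\xb2 '


def decode_or_report(data, charset):
    """Return (text, None) on success, or (None, message) on failure.

    Illustrative helper only; it wraps the decode in its own try block
    so the error is handled at the point where it occurs.
    """
    try:
        return data.decode(charset), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# Decoding with the assumed default charset fails on the first
# non-ASCII byte.
text, error = decode_or_report(raw, "us-ascii")
print(error)

# Decoding with the charset the bytes were actually written in works.
text_ok, error_ok = decode_or_report(raw, "utf-8")
print(repr(text_ok))
```

If the guess above is right, a try around that one decode, or using
the real charset for the language, would avoid the catchall.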

I didn't have enough time to really understand what was going on, so
this is really a wild-ass guess.  Hope it helps, anyway.

Happy Thanksgiving (and roudou-kansha-hi)!