> All locales return error messages in English. Only the Japanese uses
> Japanese which my regular expressions cannot handle at the moment.

What exactly are you expecting to happen, and what exactly happens?

The general advice for character sets in Python applies: always
explicitly declare the encoding of the input, decode to Unicode
internally as early as possible, and process all text that way. Only
encode back to bytes when it's time to output.
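
As a rough sketch of that advice in Python 3 (the function name, the
message format, and the Shift_JIS encoding are all assumptions for
illustration; I don't know what your tool actually emits):

```python
import re
from typing import Optional

# Hypothetical pattern: an "error" marker in English or Japanese,
# an optional ASCII or full-width colon, then a numeric code.
ERROR_RE = re.compile(r"(?:error|エラー)\s*[:：]?\s*(\d+)", re.IGNORECASE)

def extract_error_code(raw: bytes, encoding: str) -> Optional[str]:
    """Decode at the boundary, then match against Unicode text only."""
    text = raw.decode(encoding)       # bytes -> str, as early as possible
    match = ERROR_RE.search(text)     # the pattern is a str, not bytes
    return match.group(1) if match else None

# Assumed input: the same kind of message in two different encodings.
japanese = "エラー: 42 ファイルが見つかりません".encode("shift_jis")
english = b"Error: 7 file not found"

code_ja = extract_error_code(japanese, "shift_jis")
code_en = extract_error_code(english, "ascii")

# Encode only at the point of output.
report = "codes: {}, {}\n".format(code_ja, code_en).encode("utf-8")
```

Once the text is decoded, the regular expression no longer has to care
which locale produced the bytes; it only has to know the words it is
looking for.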

