How to convert between Japanese coding systems?
lie.1296 at gmail.com
Thu Feb 19 14:42:42 CET 2009
On Thu, 19 Feb 2009 15:28:12 +0900, Dietrich Bollmann wrote:
> Are there any functions in python to convert between different Japanese
> coding systems?
If I'm not mistaken, the email standard specifies that only 7-bit ASCII-
encoded bytes can be transported safely and reliably. The highest bit may
be stripped by an email server or client. Thus, to transport non-ASCII
data safely, an email cannot carry "regular" 8-bit encodings (e.g. UTF-8,
Shift-JIS, etc.) as-is.
I'm not sure what the standard is for Japanese characters, but from
reading the email header, it seems that the encoding used is a modified
UTF-8. Try checking Python's email module; it might have something for
decoding the 7-bit email form into regular 8-bit bytes or a unicode string.
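As a rough sketch of what the email module offers here: the standard
library's email.header can decode 7-bit "encoded-word" headers back into
a unicode string. The example below is written for Python 3 and assumes
the header uses iso-2022-jp, which is only a guess at what the original
message contained; the sample subject is made up for illustration.

```python
from email.header import Header, decode_header, make_header

# Build a 7-bit-safe header from Japanese text. The charset
# "iso-2022-jp" is an assumption made for this illustration.
original = "日本語"
wire_form = Header(original, charset="iso-2022-jp").encode()

# The wire form contains only ASCII characters.
assert wire_form.isascii()

# decode_header() splits the header into (bytes, charset) pairs;
# make_header() reassembles them, and str() gives a unicode string.
decoded = str(make_header(decode_header(wire_form)))
print(decoded)
```

If the real header names a different charset, decode_header() should
report it in the second element of each pair.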
After decoding the email form to regular 8-bit bytes or a unicode string,
converting to another encoding should be trivial.
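A minimal sketch of that last step, assuming Python 3 and the codec names
"shift_jis", "euc_jp" and "iso2022_jp" from the standard library. The
helper name transcode is made up for this example; the idea is simply to
decode the bytes to a unicode string and encode it again.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes from one encoding to another via a unicode string."""
    return data.decode(source).encode(target)

text = "日本語"

# Start from Shift-JIS bytes, as a stand-in for data read from a file.
sjis_bytes = text.encode("shift_jis")

# Convert Shift-JIS to EUC-JP, then EUC-JP to ISO-2022-JP.
euc_bytes = transcode(sjis_bytes, "shift_jis", "euc_jp")
jis_bytes = transcode(euc_bytes, "euc_jp", "iso2022_jp")

# Every form decodes back to the same unicode string.
assert euc_bytes.decode("euc_jp") == text
assert jis_bytes.decode("iso2022_jp") == text
```

Characters that do not exist in the target encoding raise
UnicodeEncodeError; passing an errors argument such as "replace" to
encode() is one way to handle that.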