convert \uXXXX to native character set?
Bengt Richter
bokr at oz.net
Tue Dec 21 17:59:11 EST 2004
On Mon, 20 Dec 2004 12:49:39 +0200, Miki Tebeka <miki.tebeka at zoran.com> wrote:
>Hello Joe,
>
>> Is there any library to convert HTML page with \uXXXX encoded text to
>> native character set, e.g. BIG5.
>Try: help("".decode)
>
But I think the OP wants to encode, not decode. E.g. (I don't know what the Chinese for ichi is ;-)
>>> ichi = u'\u4e00'
>>> ichi
u'\u4e00'
>>> ichi.encode('big5')
'\xa4@'
Unless I am mistaken, that created a two-byte str holding the Big5 code for
the single-horizontal-stroke glyph whose Unicode code point is U+4E00:
>>> list(ichi.encode('big5'))
['\xa4', '@']
Going from the Big5-encoded str back to unicode then takes decoding:
>>> '\xa4@'.decode('big5')
u'\u4e00'
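If the HTML page contains the six literal characters \uXXXX rather than real unicode objects, the escapes have to be turned into characters before encoding. A minimal sketch of that step, assuming a modern Python 3 interpreter (where str is Unicode text and encode() returns bytes) and assuming every escape is exactly four hex digits; unescape_u and to_big5 are hypothetical helper names, not part of any library:

```python
import re

# Matches a literal backslash-u escape with exactly four hex digits,
# i.e. six characters sitting in the page text.
_ESCAPE = re.compile(r'\\u([0-9a-fA-F]{4})')

def unescape_u(text):
    # Replace each literal escape with the character it names.
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

def to_big5(text):
    # Unescape first, then encode the resulting text as Big5 bytes.
    return unescape_u(text).encode('big5')

# Raw string: the backslash is literal, as it would be in the HTML file.
page = r'<p>\u4e00</p>'
print(to_big5(page))  # b'<p>\xa4@</p>'
```

A character with no Big5 mapping makes encode() raise UnicodeEncodeError; passing errors='replace' or errors='xmlcharrefreplace' to encode() is one way to cope with that in HTML.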
Regards,
Bengt Richter