unichr() question
Ezequiel, Justin
j.ezequiel at spitech.com
Wed Nov 5 01:21:48 EST 2003
From: martin at v.loewis.de
I strongly advise that you don't. Even though a UCS-2 Python build has some capabilities to represent non-BMP characters, you should use these facilities only if you know what you are doing, and if you absolutely need it.
>>> def ucs4toucs2(codepoint):
...     hi, lo = divmod(codepoint - 0x10000, 0x400)
...     return 0xd800 + hi, 0xdc00 + lo
...
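As a worked example of the arithmetic in that function, here is a minimal sketch applied to 1D4AA, one of the code points from the lookup table further down. It assumes Python 3; the name `ucs4_to_surrogates` and the range check are additions for illustration and are not part of the quoted function.

```python
def ucs4_to_surrogates(codepoint):
    """Split a non-BMP code point into a (high, low) UTF-16 surrogate pair."""
    # Assumption: only code points above the BMP need a surrogate pair.
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("code point is not in the range 0x10000..0x10FFFF")
    # Subtract 0x10000 to get a 20-bit value, then split it into
    # the upper 10 bits (hi) and the lower 10 bits (lo).
    hi, lo = divmod(codepoint - 0x10000, 0x400)
    return 0xD800 + hi, 0xDC00 + lo


high, low = ucs4_to_surrogates(0x1D4AA)
print(hex(high), hex(low))  # 0xd835 0xdcaa
```

The pair matches what the UTF-16 encoding of the same character contains, which can be checked with `chr(0x1D4AA).encode("utf-16-be")`.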
Dear Martin,
Thanks for taking time to reply and for the function.
Sorry for responding so late (I get the mail digest and currently have 390 digests unread).
I am converting XML files with entities to utf-8 using a lookup table:
⏞ 0FE37
⏟ 0FE38
<sc>O</sc> 1D4AA
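The conversion I have in mind can be sketched roughly as follows. This assumes Python 3, where `chr()` accepts any code point up to 10FFFF; on a UCS-2 build of Python 2, `unichr()` rejects values above FFFF, which is where the question comes from. The entity names in `ENTITY_TABLE` are hypothetical placeholders, and only the hexadecimal values come from the table above.

```python
# Hypothetical lookup table: entity name -> hexadecimal code point.
# The keys are placeholders; only the hex values come from the table above.
ENTITY_TABLE = {
    "entity1": "0FE37",
    "entity2": "0FE38",
    "entity3": "1D4AA",
}


def entity_to_utf8(name, table=ENTITY_TABLE):
    """Return the UTF-8 bytes for the character an entity maps to."""
    codepoint = int(table[name], 16)  # parse the hex string, e.g. "1D4AA"
    return chr(codepoint).encode("utf-8")


print(entity_to_utf8("entity1"))  # b'\xef\xb8\xb7' (3 bytes, inside the BMP)
print(entity_to_utf8("entity3"))  # b'\xf0\x9d\x92\xaa' (4 bytes, outside the BMP)
```

The first two values fit in 16 bits, so only the third entry (1D4AA) is a non-BMP character.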
I have no idea what I am doing but I sure think that I absolutely need it.
Can you explain more about non-BMP characters (and Python's capabilities to represent them) and how this applies (if it does) to my needs?