unichr() question
Martin v. Löwis
martin at v.loewis.de
Thu Oct 16 17:31:47 EDT 2003
"Ezequiel, Justin" <j.ezequiel at spitech.com> writes:
> How do I convert strings such as '1D4AA' to unicode without using
> eval()? Alternatively, how can I break down the value 119978L into
> 55349 and 56490?
I strongly advise that you don't. Even though a UCS-2 Python build
has some capabilities to represent non-BMP characters, you should use
these facilities only if you know what you are doing, and if you
absolutely need them.
To convert a UCS-4 code point into a pair of UTF-16 code units (a surrogate pair), use
>>> def ucs4toucs2(codepoint):
...     hi,lo=divmod(codepoint-0x10000,0x400)
...     return 0xd800+hi,0xdc00+lo
...
>>> ucs4toucs2(119978L)
(55349L, 56490L)
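For the string half of the question, int() with an explicit base of 16 parses '1D4AA' without any need for eval(). The following is only a minimal sketch of the whole round trip. The helper names parse_hex_codepoint, to_surrogate_pair and from_surrogate_pair are made up for illustration and are not part of Python, and the range check is an added assumption about how invalid input should be handled.

```python
# Hypothetical helpers for illustration; none of these names exist in Python itself.

def parse_hex_codepoint(text):
    """Parse a hex string such as '1D4AA' into an integer code point."""
    return int(text, 16)  # base 16, so eval() is not needed


def to_surrogate_pair(codepoint):
    """Split a non-BMP code point into (high, low) UTF-16 surrogates."""
    # Assumption: reject anything outside U+10000..U+10FFFF, because only
    # those code points are represented by a surrogate pair.
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("not a non-BMP code point: %r" % (codepoint,))
    hi, lo = divmod(codepoint - 0x10000, 0x400)
    return 0xD800 + hi, 0xDC00 + lo


def from_surrogate_pair(hi, lo):
    """Inverse of to_surrogate_pair: rebuild the code point from the pair."""
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)


codepoint = parse_hex_codepoint('1D4AA')
pair = to_surrogate_pair(codepoint)
print(codepoint, pair, from_surrogate_pair(*pair))
```

The inverse function is there only as a sanity check that the split loses no information.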
Regards,
Martin