break unichr instead of fix ord?

Vlastimil Brom vlastimil.brom at gmail.com
Sat Aug 29 21:43:58 CEST 2009


2009/8/29  <rurpy at yahoo.com>:
> On 08/28/2009 02:12 AM, "Martin v. Löwis" wrote:
>
> So far, it seems not and that unichr/ord
> is a poster child for "purity beats practicality".

As Mark Tolonen pointed out earlier in this thread, in Python 3
practicality apparently beat purity in this respect: on a narrow
build, ord() accepts a surrogate pair and chr() returns one:

Python 3.1.1 (r311:74483, Aug 17 2009, 17:02:12) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.

>>> goth_urus_1 = '\U0001033f'
>>> list(goth_urus_1)
['\ud800', '\udf3f']
>>> len(goth_urus_1)
2
>>> ord(goth_urus_1)
66367
>>> goth_urus_2 = chr(66367)
>>> len(goth_urus_2)
2
>>> import unicodedata
>>> unicodedata.name(goth_urus_1)
'GOTHIC LETTER URUS'
>>> goth_urus_3 = unicodedata.lookup("GOTHIC LETTER URUS")
>>> goth_urus_4 = "\N{GOTHIC LETTER URUS}"
>>> goth_urus_1 == goth_urus_2 == goth_urus_3 == goth_urus_4
True
>>>

As for the behaviour in Python 2.x, it is probably good enough that
surrogates aren't prohibited; whatever further behaviour turns out to
be needed can easily be added via custom functions.
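
A rough sketch of what such custom functions might look like. The names
wide_ord and wide_unichr are made up for illustration, and the code
assumes that strings are allowed to hold surrogate code units (as on a
narrow build). It only does the standard UTF-16 surrogate arithmetic and
is not meant as a definitive implementation:

```python
try:
    unichr            # Python 2
except NameError:
    unichr = chr      # Python 3: chr plays the role of unichr


def wide_ord(s):
    """Return the code point for one code unit or one surrogate pair.

    Hypothetical helper, not part of the standard library.
    """
    if len(s) == 2:
        hi, lo = ord(s[0]), ord(s[1])
        # A high surrogate followed by a low surrogate encodes one
        # code point above U+FFFF.
        if 0xD800 <= hi <= 0xDBFF and 0xDC00 <= lo <= 0xDFFF:
            return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)
    # Anything else is left to the builtin, including its error handling.
    return ord(s)


def wide_unichr(cp):
    """Return a one-unit string, or a surrogate pair above U+FFFF.

    Hypothetical helper, not part of the standard library.
    """
    if cp <= 0xFFFF:
        return unichr(cp)
    if cp > 0x10FFFF:
        raise ValueError("code point out of range: %r" % (cp,))
    cp -= 0x10000
    # Top 10 bits go into the high surrogate, low 10 bits into the low one.
    return unichr(0xD800 + (cp >> 10)) + unichr(0xDC00 + (cp & 0x3FF))
```

With these, wide_ord(u'\ud800\udf3f') gives 66367 and wide_unichr(66367)
gives back the same two code units as in the session above, whether or
not the builtins handle the pair themselves.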

vbr


