break unichr instead of fix ord?
rurpy at yahoo.com
Sat Aug 29 20:16:30 EDT 2009
On 08/29/2009 01:43 PM, Vlastimil Brom wrote:
> 2009/8/29 <rurpy at yahoo.com>:
>> On 08/28/2009 02:12 AM, "Martin v. Löwis" wrote:
>>
>> So far, it seems not and that unichr/ord
>> is a poster child for "purity beats practicality".
>> --
>> http://mail.python.org/mailman/listinfo/python-list
>>
>
> As Mark Tolonen pointed out earlier in this thread, in Python 3 the
> practicality apparently beat purity in this aspect:
>
> Python 3.1.1 (r311:74483, Aug 17 2009, 17:02:12) [MSC v.1500 32 bit
> (Intel)] on win32
> Type "copyright", "credits" or "license()" for more information.
>
> >>> goth_urus_1 = '\U0001033f'
> >>> list(goth_urus_1)
> ['\ud800', '\udf3f']
> >>> len(goth_urus_1)
> 2
> >>> ord(goth_urus_1)
> 66367
> >>> goth_urus_2 = chr(66367)
> >>> len(goth_urus_2)
> 2
> >>> import unicodedata
> >>> unicodedata.name(goth_urus_1)
> 'GOTHIC LETTER URUS'
> >>> goth_urus_3 = unicodedata.lookup("GOTHIC LETTER URUS")
> >>> goth_urus_4 = "\N{GOTHIC LETTER URUS}"
> >>> goth_urus_1 == goth_urus_2 == goth_urus_3 == goth_urus_4
> True
> >>>
Yes, that certainly seems like much more sensible behavior.
> As for the behaviour in python 2.x, it's probably good enough, that
> the surrogates aren't prohibited and the eventually needed behaviour
> can be easily added via custom functions.
Yes, I agree that, given the current behavior is well documented
and, further, is fixed in Python 3, it can't be changed.
I would pick a nit, though, with "can be easily added via custom
functions."
I don't think that is a good criterion for rejecting functionality
from the library, because it is not sufficient; there are many
functions in the library that fail that test. I think the criterion
should be more like a ratio: (how often needed) / (ease of writing)
[where "ease" is not just the line count but also the obviousness
to someone who is not a Python expert yet].
And I would also dispute that the generalized unichr/ord functions
are "easily" added. When I ran into the TypeError in ord(), I
thought "surrogate pairs" were something used in sex therapy. :-)
It took a lot of reading and research before I was able to write
a generalized ord() function.
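
For anyone else who hits the same TypeError, here is a rough sketch of what
such generalized functions might look like. The names wide_ord and wide_chr
are made up for illustration and are not in the standard library. The sketch
is written with Python 3's chr(); on a narrow Python 2 build one would
presumably use unichr() in its place. It deliberately always emits a
surrogate pair for code points above U+FFFF, to mimic the narrow-build
string layout shown in the transcript above.

```python
# Hypothetical helpers (not part of the standard library): a sketch of
# ord()/chr() variants that understand UTF-16 surrogate pairs.

def wide_ord(s):
    """Return the code point of s, where s is either a single character
    or a two-character high/low surrogate pair."""
    if len(s) == 2:
        hi, lo = ord(s[0]), ord(s[1])
        # High surrogates are U+D800..U+DBFF, low are U+DC00..U+DFFF.
        if 0xD800 <= hi <= 0xDBFF and 0xDC00 <= lo <= 0xDFFF:
            return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)
    # Anything else is left to the builtin, which raises TypeError
    # for strings that are not exactly one character long.
    return ord(s)


def wide_chr(cp):
    """Return the character for code point cp, encoded as a surrogate
    pair (two code units) when cp is above U+FFFF."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of range: %r" % (cp,))
    if cp < 0x10000:
        return chr(cp)
    offset = cp - 0x10000
    # Top 10 bits go in the high surrogate, bottom 10 in the low one.
    return chr(0xD800 + (offset >> 10)) + chr(0xDC00 + (offset & 0x3FF))
```

With these, wide_ord('\ud800\udf3f') gives 66367 for GOTHIC LETTER URUS, and
wide_chr(66367) gives back the same two surrogates, matching the list()
output quoted above.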