break unichr instead of fix ord?
Dieter Maurer
dieter at handshake.de
Sun Aug 30 00:54:21 EDT 2009
"Martin v. Löwis" <martin at v.loewis.de> writes on Fri, 28 Aug 2009 10:12:34 +0200:
> > The PEP says:
> > * unichr(i) for 0 <= i < 2**16 (0x10000) always returns a
> > length-one string.
> >
> > * unichr(i) for 2**16 <= i <= TOPCHAR will return a
> > length-one string on wide Python builds. On narrow
> > builds it will raise ValueError.
> > and
> > * ord() is always the inverse of unichr()
> >
> > which of course we know; that is the current behavior. But
> > there is no reason given for that behavior.
>
> Sure there is, right above the list:
>
> "Most things will behave identically in the wide and narrow worlds."
>
> That's the reason: scripts should work the same as much as possible
> in wide and narrow builds.
>
> What you propose would break the property "unichr(i) always returns
> a string of length one, if it returns anything at all".
But getting a ValueError in some builds (and not in others)
is considerably worse than getting unicode strings of different lengths.
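
To make the two alternatives concrete, here is a minimal sketch. It is
written for Python 3 (where unichr no longer exists) and only emulates
what a narrow build could do; the names unichr_strict and unichr_pair
are hypothetical, invented for this illustration, and are not real
Python functions.

```python
NARROW_MAX = 0xFFFF   # largest code point that fits in one 16-bit code unit
TOPCHAR = 0x10FFFF    # largest Unicode code point


def unichr_strict(i):
    """Emulate the behaviour the PEP describes for narrow builds:
    raise ValueError for anything above 0xFFFF."""
    if not 0 <= i <= NARROW_MAX:
        raise ValueError("unichr() arg not in range(0x10000)")
    return chr(i)


def unichr_pair(i):
    """Emulate the alternative: return a UTF-16 surrogate pair
    (a length-two string) for code points above 0xFFFF."""
    if not 0 <= i <= TOPCHAR:
        raise ValueError("code point out of range")
    if i <= NARROW_MAX:
        return chr(i)
    offset = i - 0x10000
    high = 0xD800 + (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
    return chr(high) + chr(low)


if __name__ == "__main__":
    print(len(unichr_pair(0x1D11E)))   # length two instead of an exception
    try:
        unichr_strict(0x1D11E)
    except ValueError as exc:
        print("ValueError:", exc)
```

With the second variant a script runs on both kinds of build and differs
only in the length of the result; with the first it fails on one of them.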
> > 1) Should surrogate pairs be disallowed on narrow builds?
> > That appears to have been answered in the negative and is
> > not relevant to my question.
>
> It is, as it does lead to inconsistencies between wide and narrow
> builds. OTOH, it also allows the same source code to work on both
> versions, so it also preserves the uniformity in a different way.
Don't you have the inconsistencies in any case?
unichr raises a ValueError in some builds and not in others.
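
The length inconsistency is there even without unichr: a non-BMP
character written as a literal already has a different length on the
two kinds of build. A rough sketch, again for Python 3, that counts
UTF-16 code units to approximate what len() reports on a narrow build
(narrow_len is a hypothetical helper, not part of Python):

```python
def narrow_len(s):
    """Approximate len() on a narrow build by counting 16-bit
    UTF-16 code units (two bytes per unit in UTF-16-LE)."""
    return len(s.encode("utf-16-le")) // 2


if __name__ == "__main__":
    s = "\U00010000"       # first code point outside the BMP
    print(len(s))          # 1: one code point (wide-build view)
    print(narrow_len(s))   # 2: a surrogate pair (narrow-build view)
```

So the same source text already yields strings of different lengths;
raising ValueError from unichr does not remove that difference.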