[Python-Dev] Multilingual programming article on the Red Hat Developer blog
"Martin v. Löwis"
martin at v.loewis.de
Wed Sep 17 14:06:06 CEST 2014
On 17.09.14 10:56, Steven D'Aprano wrote:
> On Wed, Sep 17, 2014 at 09:21:56AM +0900, Stephen J. Turnbull wrote:
>
>> Guido's mantra is something like "Python's str doesn't contain
>> characters or even code points[1], it contains code units."
>
> But is that true?
It used to be true, and stopped being so with PEP 393. In particular,
Python 3.2 and before would expose UTF-16 in the narrow build, so the
elements of a string would be code units. Since Python 3.3, the
surrogate code points are no longer interpreted as UTF-16 code units.
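
A minimal sketch of the difference, assuming Python 3.3 or later. The sample character U+1F600 and the variable names are only illustrative choices, not anything from the thread. It shows that a character outside the BMP is a single string element, and that two surrogate code points placed next to each other are not read as a UTF-16 pair:

```python
# One code point outside the Basic Multilingual Plane (illustrative choice).
s = "\U0001F600"

# Since Python 3.3 (PEP 393), len() counts code points.
n_code_points = len(s)

# Number of UTF-16 code units needed to encode the same string.
n_utf16_units = len(s.encode("utf-16-le")) // 2

# Two surrogate code points written next to each other stay two separate
# elements; they are not combined into U+1F600.
pair = "\ud83d\ude00"
pair_len = len(pair)
pair_equals_s = (pair == s)

print(n_code_points, n_utf16_units, pair_len, pair_equals_s)
```

On a narrow build of Python 3.2 or earlier, len(s) would have been 2, because the string elements there were the UTF-16 code units.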
Regards,
Martin