[Python-Dev] Multilingual programming article on the Red Hat Developer blog

"Martin v. Löwis" martin at v.loewis.de
Wed Sep 17 14:06:06 CEST 2014

On 17.09.14 10:56, Steven D'Aprano wrote:
> On Wed, Sep 17, 2014 at 09:21:56AM +0900, Stephen J. Turnbull wrote:
>> Guido's mantra is something like "Python's str doesn't contain
>> characters or even code points[1], it contains code units."
> But is that true?

It used to be true, and stopped being so with PEP 393. In particular,
Python 3.2 and before would expose UTF-16 in the narrow build, so the
elements of a string would be code units. Since Python 3.3, the
surrogate code points are no longer interpreted as UTF-16 code units.
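
To make the difference concrete, here is a small sketch, assuming it is
run on Python 3.3 or later (the variable names are only illustrative).
It compares a non-BMP character with the two surrogate code points that
would represent it in UTF-16:

```python
# Assumes Python 3.3+ (PEP 393); on a 3.2 narrow build the results differ.

# One non-BMP character, written as a single code point.
astral = "\U0001F600"

# The same character's UTF-16 surrogate pair, written as two separate
# code points in a str literal.
pair = "\ud83d\ude00"

# Elements of a str are code points, so the astral character has length 1.
astral_len = len(astral)

# UTF-16 still needs two 16-bit code units for it, but only once encoded.
utf16_units = len(astral.encode("utf-16-le")) // 2

# The two surrogates are not combined into the astral character.
pair_len = len(pair)
pair_equals_astral = (pair == astral)

# Surrogate code points are not valid in strict UTF-8 output.
try:
    pair.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False
```

So len() counts code points, and a surrogate in a str is just a code
point that happens to lie in the surrogate range.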

