Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

Terry Reedy tjreedy at
Fri Mar 29 19:06:40 CET 2013

On 3/28/2013 10:37 PM, Steven D'Aprano wrote:

> Under what circumstances will a string be created from a wchar_t string?
> How, and why, would such a string be created? Why would Python still
> support strings containing surrogates when it now has a nice, shiny,
> surrogate-free flexible representation?

I believe it is because surrogates are legal code points, and users may
put them in strings even though Python itself does not (except through
the surrogateescape error handler).
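
A minimal sketch of both cases, assuming CPython 3.3 or later. The
variable names are only illustrative. It shows that a str can hold a
lone surrogate that the user put there, and that the surrogateescape
error handler produces lone surrogates when it decodes bytes that are
not valid UTF-8:

```python
# A lone surrogate is a legal code point, so a str can hold one.
s = "\ud800"
length = len(s)        # one code point
code_point = ord(s)    # 0xD800

# A strict UTF-8 encode refuses lone surrogates.
try:
    s.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False

# surrogateescape maps each undecodable byte 0x80..0xFF to a lone
# surrogate in U+DC80..U+DCFF, so the original bytes can be recovered.
raw = b"abc\xff"
text = raw.decode("utf-8", errors="surrogateescape")
round_trip = text.encode("utf-8", errors="surrogateescape")
```

Here text ends in U+DCFF, and encoding it again with the same handler
gives back the original bytes.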

I believe some of the internal complexity comes from supporting the old
C API, so as not to immediately invalidate existing extensions.

Terry Jan Reedy
