[Python-Dev] PEP 393 Summer of Code Project

Thu Aug 25 13:54:36 CEST 2011

On Thu, Aug 25, 2011 at 7:57 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Am 25.08.2011 11:39, schrieb Stephen J. Turnbull:
>> I'm simply saying that the current
>> implementation of strings, as improved by PEP 393, can not be said to
>> be conforming.
>
> I continue to disagree. The Unicode standard deliberately allows
> Python's behavior as conforming.

I'd actually put it slightly differently: it seems to me that Python,
in and of itself, can neither conform to nor violate that part of the
standard, since conformance depends on how the *application* processes
the data.

However, we can make it harder or easier for applications to be
conformant. UCS2 builds make it harder, since code points outside the
BMP have to be represented as pairs of code units (surrogate pairs)
internally. UCS4 builds and future PEP
393 builds (which should exhibit current UCS4 build semantics at the
Python layer) make it easier, since the internal representation
consistently uses code points, with code units only appearing as part
of the encoding and decoding process.
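
To make the code point vs. code unit distinction concrete, here is a
rough sketch, assuming a UCS4 or PEP 393 build and Python 3 syntax. The
helper name utf16_code_units is made up for illustration and is not an
existing API. On a UCS2 build the same string would report a length of
3, because the surrogate pair leaks through to the Python layer.

```python
import struct

def utf16_code_units(text):
    """Return the UTF-16 code units that encode `text` (hypothetical helper)."""
    data = text.encode("utf-16-le")
    # Each UTF-16 code unit is one unsigned 16-bit little-endian integer.
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

# 'a' followed by U+10400, a code point outside the BMP.
s = "a\U00010400"

# At the Python layer the string is a sequence of code points.
code_points = [ord(ch) for ch in s]

# Code units only show up once the string is encoded.
code_units = utf16_code_units(s)

print(len(s), [hex(cp) for cp in code_points])
print(len(code_units), [hex(cu) for cu in code_units])
```

With code points as the internal representation, indexing and slicing
can never split U+10400 in half; the surrogate pair exists only in the
encoded bytes.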

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia