[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Glenn Linderman v+python at g.nevcal.com
Wed Apr 29 11:38:32 CEST 2009


On approximately 4/29/2009 12:38 AM, came the following characters from 
the keyboard of Baptiste Carvello:
> Glenn Linderman wrote:
>>
>> 3. When an undecodable byte 0xPQ is found, decode to the escape 
>> codepoint, followed by codepoint U+01PQ, where P and Q are hex digits.
>>
> 
> The problem with this strategy is: paths are often sliced, so your 2 
> codepoints could get separated. The good thing with the PEP's strategy 
> is that 1 character stays 1 character.
> 
> Baptiste


Except for half-surrogates that are already in the file names, each of 
which gets converted to 3 characters.
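
To make the two points concrete, here is a rough sketch, not a definitive 
implementation. It assumes a UTF-8 decode in a current CPython 3, where the 
PEP's approach is available as the "surrogateescape" error handler. The 
escape codepoint U+E000 and the handler name "two_codepoint_sketch" are 
made up for illustration; the proposal quoted above does not fix either.

```python
import codecs

# Hypothetical escape codepoint; the thread does not say which one to use.
ESCAPE = "\ue000"


def _two_codepoint_handler(exc):
    """Sketch of proposal 3: byte 0xPQ -> ESCAPE followed by U+01PQ."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join(ESCAPE + chr(0x0100 | b) for b in bad), exc.end


# "two_codepoint_sketch" is an illustrative name, not a real codec handler.
codecs.register_error("two_codepoint_sketch", _two_codepoint_handler)

raw = b"dir/\xff.txt"  # one undecodable byte in an otherwise ASCII name

# PEP 383 style: the bad byte becomes exactly one character (U+DCFF).
one = raw.decode("utf-8", "surrogateescape")

# Two-codepoint style: the bad byte becomes ESCAPE + U+01FF.
two = raw.decode("utf-8", "two_codepoint_sketch")

# Slicing after five characters keeps the PEP 383 escape whole...
one_cut = one[:5]
# ...but separates the escape codepoint from its payload in the other scheme.
two_cut = two[:5]

# A half-surrogate already stored in a name as UTF-8 bytes (U+DC00 here)
# is rejected by the strict decoder, so each of its bytes is escaped.
lone = b"\xed\xb0\x80".decode("utf-8", "surrogateescape")
```

In this sketch one_cut still encodes back to the original bytes with 
"surrogateescape", while two_cut ends in a dangling escape codepoint. The 
last line shows the exception: the half-surrogate comes back as three 
escaped characters rather than one.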


-- 
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

