[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Terry Reedy tjreedy at udel.edu
Sat Apr 25 03:08:16 CEST 2009

Toshio Kuratomi wrote:
> Glenn Linderman wrote:
>> On approximately 4/24/2009 11:40 AM, came the following characters from
>> And so my encoding (1) doesn't alter the data stream for any valid
>> Windows file name, and where the naivest of users reside (2) doesn't
>> alter the data stream for any Posix file name that was encoded as UTF-8
>> sequences and doesn't contain ? characters in the file name [I perceive
>> the use of ? in file names to be rare on Posix, because of experience,
>> and because of the other problems caused by such use] (3) doesn't
>> introduce data puns within applications that are correctly coded to know
>> the encoding occurs.  The encoding technique in the PEP not only can
>> produce data puns, thus not being reversible, it provides no reliable
>> mechanism to know that this has occurred.
> Uhm....  Not arguing with your goals but '?' is unfortunately reasonably
> easy to get into a filename.  For instance, I've had to download a lot
> of scratch-built packages from our buildsystem recently.  Scratch builds
> have URLs with query strings in them so::

Is NUL (\0) allowed in POSIX file names?  If not, could it be used as an
escape char?  If it is not legal, then custom-translated strings that
escape into the wild would raise a red flag as soon as something else
tried to use them.
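
For illustration only, here is a rough sketch of what such a NUL-escape
scheme might look like.  POSIX does not permit NUL in file names, so the
escape char cannot collide with a real name.  Everything here is an
assumption made for the example, not something from the PEP or from
Glenn's proposal: the handler name "nulescape", the helper names
decode_name/encode_name/is_usable_path, and the choice of two hex digits
after the NUL.

```python
import codecs
import os

NUL = "\x00"  # hypothetical escape char; cannot occur in a POSIX file name


def _nul_escape(exc):
    """Error handler: replace each undecodable byte with NUL + two hex digits."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join(NUL + "%02x" % b for b in bad), exc.end


# "nulescape" is a made-up handler name used only in this sketch.
codecs.register_error("nulescape", _nul_escape)


def decode_name(raw):
    """Decode a file name from bytes, escaping bytes that are not valid UTF-8."""
    return raw.decode("utf-8", "nulescape")


def encode_name(text):
    """Reverse decode_name: turn NUL + hex escapes back into raw bytes."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == NUL:
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out += text[i].encode("utf-8")
            i += 1
    return bytes(out)


def is_usable_path(name):
    """Return False if the OS layer rejects the name because of an embedded NUL."""
    try:
        os.stat(name)
    except (ValueError, TypeError):
        # Python refuses to pass a string with an embedded NUL to the OS.
        return False
    except OSError:
        # The name is well formed; the file just may not exist.
        return True
    return True


raw = b"caf\xe9?x=1"          # Latin-1 byte plus a '?', not valid UTF-8
escaped = decode_name(raw)
restored = encode_name(escaped)
```

In this sketch a '?' in the original name passes through untouched, the
round trip gives back the original bytes, and an escaped string that
leaks out to something like os.stat() is rejected right away rather than
silently naming a different file.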
