[Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

James Y Knight foom at fuhm.net
Tue Sep 30 23:59:10 CEST 2008


On Sep 30, 2008, at 5:40 PM, Martin v. Löwis wrote:
>>> On Windows, we might reject bytes filenames for all file operations:
>>> open(), unlink(), os.path.join(), etc. (raise a TypeError or UnicodeError)
>>
>> Since I've seen no objections to this yet: please no. If we offer a
>> "lower-level" bytes filename API, it should work for all platforms.
>
> Unfortunately, it can't. You cannot represent all possible file names
> in a byte string in Windows (just as you can't do so in a Unicode
> string on Unix).

As you mention in the parenthetical below, of course it can.

> So using byte strings on Windows would work for some files, but fail
> for others. In particular, listdir might give you a list of file names
> which you then can't open/stat/recurse into.
>
> (of course, you could use UTF-8 as the file system encoding on Windows,
> but then you will have to rewrite a lot of C code first)

Yes! If there is a byte-string access method for Windows, pretty
please make it decode from UTF-8 internally and call the Unicode
version of the Windows APIs. The non-Unicode Windows APIs are pretty
much just broken. Ideally, Python should never be calling those.
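
A minimal sketch of that idea in Python. The helper names
to_unicode_filename and representable are hypothetical, not existing
Python APIs. A bytes filename is decoded as UTF-8 up front, so only text
would ever reach the Unicode Windows calls. The sketch also shows why a
legacy code page cannot name every file while UTF-8 can:

```python
def to_unicode_filename(name):
    """Hypothetical helper: return `name` as text, decoding bytes as UTF-8."""
    if isinstance(name, bytes):
        # Strict decoding: invalid UTF-8 raises UnicodeDecodeError here
        # rather than silently referring to a different file.
        return name.decode("utf-8")
    return name


def representable(name, encoding):
    """Return True if the text `name` survives a round trip through `encoding`."""
    try:
        return name.encode(encoding).decode(encoding) == name
    except UnicodeEncodeError:
        return False


japanese = "\u65e5\u672c\u8a9e.txt"  # a name outside the cp1252 repertoire

print(representable(japanese, "cp1252"))  # legacy "ANSI" code page
print(representable(japanese, "utf-8"))
print(to_unicode_filename(japanese.encode("utf-8")) == japanese)
```

The decoded string is what would then be handed to the wide-character
API; that step is left out because it depends on the C-level changes
discussed above.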

But I still don't like the idea of propagating the "sometimes a
string, sometimes bytes" APIs. One or the other, please: either
always strings (if and only if there is a method for ensuring that
decoding always succeeds), or always bytes.
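
To illustrate what "decoding always succeeds" would have to mean, here
is a small sketch. The filename is made up, and Latin-1 is used only
because it maps every byte to a code point, not as a proposal for the
file system encoding:

```python
raw = b"report-\xff.txt"  # a legal Unix filename that is not valid UTF-8

# Strict UTF-8 decoding is not total: some byte strings have no decoding.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Latin-1 assigns one code point to each byte value 0-255, so decoding
# never fails and encoding the result gives back the original bytes.
as_text = raw.decode("latin-1")
round_trip = as_text.encode("latin-1") == raw

print(utf8_ok, round_trip)
```

An always-strings API would need a decoding with this never-fails,
round-trips property; otherwise some files listed by listdir could not
be opened again.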

James

