[Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
James Y Knight
foom at fuhm.net
Tue Sep 30 23:59:10 CEST 2008
On Sep 30, 2008, at 5:40 PM, Martin v. Löwis wrote:
>>> On Windows, we might reject bytes filenames for all file operations:
>>> open(), unlink(), os.path.join(), etc. (raise a TypeError or UnicodeError)
>>
>> Since I've seen no objections to this yet: please no. If we offer a
>> "lower-level" bytes filename API, it should work for all platforms.
>
> Unfortunately, it can't. You cannot represent all possible file names
> in a byte string in Windows (just as you can't do so in a Unicode
> string on Unix).
As you mention in the parenthetical below, of course it can.
> So using byte strings on Windows would work for some files, but fail
> for others. In particular, listdir might give you a list of file names
> which you then can't open/stat/recurse into.
>
> (of course, you could use UTF-8 as the file system encoding on Windows,
> but then you will have to rewrite a lot of C code first)
Yes! If there is a byte-string access method for Windows, pretty
please make it decode from UTF-8 internally and call the Unicode
version of the Windows APIs. The non-Unicode Windows APIs are pretty
much just broken; ideally, Python should never be calling those.
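To make the idea concrete, here is a rough Python-level sketch of "decode
from UTF-8, then use the Unicode API". The helper names decode_filename
and open_utf8 are hypothetical and are not part of any proposal in this
thread; a real change would live in the C code, not in a wrapper like this.

```python
def decode_filename(name):
    """Hypothetical helper: accept str or UTF-8 bytes, always return str."""
    if isinstance(name, bytes):
        # Strict decoding: invalid UTF-8 raises UnicodeDecodeError instead of
        # silently mapping to some other file name.
        return name.decode("utf-8")
    return name


def open_utf8(name, mode="r", **kwargs):
    """Hypothetical wrapper: open() that treats bytes file names as UTF-8."""
    # Only a str ever reaches open(), so on Windows the Unicode (wide)
    # API is the one that ends up being called.
    return open(decode_filename(name), mode, **kwargs)
```

With something like this, open_utf8(b"caf\xc3\xa9.txt") and
open_utf8("caf\u00e9.txt") would name the same file, and the byte-string
form would never touch the code-page-dependent APIs.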
But I still don't like the idea of propagating the "sometimes a
string, sometimes bytes" APIs. One or the other, please: either
always strings (if and only if there is a method for ensuring that
decoding always succeeds), or always bytes.
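As an illustration of what "decoding always succeeds" would have to mean,
here is a small sketch. It assumes the "surrogateescape" error handler,
which was not available when this was written (it arrived later, in
Python 3.1) and is used here only as one possible example of such a
method. The function names are hypothetical.

```python
def bytes_to_str(raw: bytes) -> str:
    # Undecodable bytes are mapped to lone surrogates (U+DC80..U+DCFF),
    # so this never raises for any input byte string.
    return raw.decode("utf-8", errors="surrogateescape")


def str_to_bytes(text: str) -> bytes:
    # The inverse mapping: escaped surrogates turn back into the
    # original bytes, so the round trip is lossless.
    return text.encode("utf-8", errors="surrogateescape")
```

The property that matters is that str_to_bytes(bytes_to_str(raw)) == raw
for every byte string, so a name returned by listdir can always be passed
back to open or stat.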
James