Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

Oh, ok. I had assumed Windows just uses a fixed encoding without the problem of misencoded filenames.
It's the other way 'round: On Windows, Unicode file names are the natural choice, and byte strings have limitations. In a sense, Windows got it right - but then, they started later. Unix missed the opportunity of declaring that all file APIs are UTF-8 (except for Plan-9 and OS X, neither being "true" Unix).
Regards, Martin

On 30-Sep-2008, at 23:42 , Martin v. Löwis wrote:
It's the other way 'round: On Windows, Unicode file names are the natural choice, and byte strings have limitations. In a sense, Windows got it right - but then, they started later. Unix missed the opportunity of declaring that all file APIs are UTF-8 (except for Plan-9 and OS X, neither being "true" Unix).
How does Windows (and Python on Windows) handle NFC versus NFD issues? Can I have two files called "ümlaut.txt", one in NFD and one in NFC form? And are both of those representable on the Python side (i.e. can they both be returned from listdir() and passed to open())? If I compare these two filenames, do they compare differently?
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman

How does Windows (and Python on Windows) handle NFC versus NFD issues?
That's left to the application.
Can I have two files called "ümlaut.txt", one in NFD and one NFC form?
Yes, you can. It sounds confusing, but only in a theoretical way. You never have combining characters on Windows (at least, I don't). The keyboard input defaults to NFC, and users normally don't type file names, anyways, except when creating the files - later, they just use the mouse to indicate what file they want to act on.
And are both of those representable on the Python side (i.e. can they both be returned from listdir() and passed to open())?
Certainly!
If I compare these two filenames, do they compare differently?
Certainly! Regards, Martin
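To make the NFC/NFD answers above concrete, here is a minimal sketch using only the standard unicodedata module. It touches no file system; it only shows that the two spellings of "ümlaut.txt" are different Python strings, which is why they would name two different files on a file system that does not normalize. The helper same_name() is a hypothetical name for illustration, showing the kind of normalization that is "left to the application".

```python
import unicodedata

# The same visible name in two Unicode normalization forms.
# NFC uses the precomposed U+00FC; NFD uses "u" followed by U+0308.
name_nfc = unicodedata.normalize("NFC", "\u00fcmlaut.txt")
name_nfd = unicodedata.normalize("NFD", "\u00fcmlaut.txt")

# Different code point sequences, so the strings differ in length
# and compare unequal.
print(len(name_nfc), len(name_nfd))
print(name_nfc == name_nfd)


def same_name(a, b):
    """Hypothetical helper: compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(same_name(name_nfc, name_nfd))
```

An application that wants the two forms treated as one name has to do a comparison like same_name() itself; a plain == does not.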

On 1-Oct-2008, at 00:32 , Martin v. Löwis wrote:
How does Windows (and Python on Windows) handle NFC versus NFD issues?
That's left to the application.
Can I have two files called "ümlaut.txt", one in NFD and one NFC form?
Yes, you can. It sounds confusing, but only in a theoretical way. You never have combining characters on Windows (at least, I don't). The keyboard input defaults to NFC, and users normally don't type file names, anyways, except when creating the files - later, they just use the mouse to indicate what file they want to act on.
And are both of those representable on the Python side (i.e. can they both be returned from listdir() and passed to open())?
Certainly!
If I compare these two filenames, do they compare differently?
Certainly!
Actually, that all sounds pretty non-confusing to me:-) So, normal users will always have the one form, and if by chance they get the other form they can still use the file. Also from Python, even when doing listdir() and then open(), everything will work just as expected. That there are two files that have a similar visual representation is not too bad; the same happens with ellipses versus dot-dot-dot and many other cases.
Which means the only problem area left is Unix filesystems (whether on Linux or mounted remotely on MacOS or whatever), where filenames are really byte strings with only / and NUL illegal.
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman
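As an illustration of the Unix case, here is a small sketch of a byte-string filename that is not valid UTF-8. It uses the "surrogateescape" error handler, which was added to Python after this thread (PEP 383) and is not something proposed in the messages above. The filename is a made-up example, and nothing is read from or written to disk.

```python
# A made-up Unix filename encoded as Latin-1: 0xE9 is not valid UTF-8 here.
raw = b"caf\xe9.txt"

# A strict decode fails, which is the core of the "bytes filename" problem.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With surrogateescape, the undecodable byte becomes a lone surrogate
# (U+DCE9), so the name can still be held in a str.
decoded = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
roundtrip = decoded.encode("utf-8", "surrogateescape")

print(strict_ok)
print(ascii(decoded))
print(roundtrip == raw)
```

The point of the round trip is that a name obtained from listdir() could still be passed back to open(), even when its bytes do not match the expected encoding.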
participants (2): "Martin v. Löwis", Jack Jansen