On Wed, Oct 1, 2008 at 12:25 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Simon Cross writes:
I still find this line of reasoning a bit worrying. Imagine an end user application like a music player. The user discovers that he can't see some .mp3 or .ogg file from the music player that is visible in the file manager. I would expect him to file a bug on the music player. If the bug was closed with "fix the filename" I imagine the user would respond with "but other programs can access it just fine".
And the user would very likely be *wrong*. The file manager is displaying it, but in the nature of things file managers *don't access files*, they access *directories*. The files they pass to other apps to access.
Exactly the same reasoning applies to files in a directory with an odd name.
I'm not unhappy with the solution Victor is proposing, but I imagine that when I start coding projects in 3.0 I'll default to the bytes versions of the filename methods and use b"path".decode(sys.getfilesystemencoding(), "replace") if I need to get Unicode.
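For illustration, a minimal sketch of the bytes-first approach described above, assuming Python 3. The helper names display_name and list_for_display are hypothetical, not part of any library; the only real calls used are os.listdir() with a bytes argument (which returns bytes names), sys.getfilesystemencoding(), and bytes.decode() with the "replace" error handler.

```python
import os
import sys


def display_name(raw_name, encoding=None):
    """Decode a bytes filename for display only (hypothetical helper).

    Undecodable bytes become U+FFFD instead of raising, so the result
    is always a str, but it may not round-trip back to the real name.
    """
    enc = encoding or sys.getfilesystemencoding()
    return raw_name.decode(enc, "replace")


def list_for_display(directory=b"."):
    """Return (raw_bytes_name, display_str) pairs for a directory.

    Passing bytes to os.listdir() makes it return bytes names, so no
    entry is dropped or mangled; the raw name is what should be handed
    to open(), while the str is only for showing to the user.
    """
    return [(raw, display_name(raw)) for raw in os.listdir(directory)]
```

The design point is to keep the original bytes next to the decoded string: the decoded form is lossy whenever a replacement character appears, so file access should always go through the raw name.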
But now the user will file a bug because in the file opening dialog they can't *read* their Chinese file names on their USB key because they are appearing in (system encoding) Cyrillic. Do you begin to see the nature of the Catch-22 here?
I don't expect the user to be very sympathetic when you tell her to fix the filenames, but it's not as easy as you would think to get this right.
a) There is some chance that at least ASCII characters will be displayed correctly if getfilesystemencoding() is similar to the encoding used, and corrupted filenames will display correctly except for the corrupted characters.
b) The user will at least be able to access the file. It's a more graceful degradation of functionality than not being able to work with the file at all.

Schiavo
Simon