Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008/9/30 Glenn Linderman <v+python@g.nevcal.com>:
> So the problem is that a Unicode file system interface can't deal with non-UTF-8 byte streams as file names.
> So it seems there are four suggested approaches, all of which have aspects that are inconvenient.
> Let's not forget what happens when a non-UTF-8 file name is read from a file or written to a file, under the assumption that the filename is written to the file directly (which probably breaks for filenames containing newlines or such).
> 4) Use of bytes APIs on FS interfaces. This seems to be the "solution" adopted by Posix that creates the "problem" encountered by Unicode-native applications. It is cumbersome to deal with within applications that attempt to display the names. What do Posix-style "open file" dialog boxes do in this case?
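
To make approach 4 concrete, here is a minimal Python sketch. It assumes a filesystem that accepts arbitrary bytes in names (as most Linux filesystems do). The helper names list_names_as_bytes, is_valid_utf8 and demo are made up for illustration. Passing a bytes path to os.listdir makes it return bytes names, so a name such as abc\xffz comes back exactly as stored, but it cannot be strictly decoded as UTF-8 for display.

```python
import os
import tempfile

def list_names_as_bytes(directory: str) -> list[bytes]:
    """List a directory through the bytes API, so names come back undecoded."""
    return os.listdir(os.fsencode(directory))

def is_valid_utf8(raw: bytes) -> bool:
    """Report whether a raw filename can be strictly decoded as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

def demo() -> list[tuple[bytes, bool]]:
    """Create a file with a non-UTF-8 name and list it back as bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(os.fsencode(tmp), b"abc\xffz")
        try:
            open(path, "wb").close()
        except OSError:
            return []  # some filesystems reject non-UTF-8 names outright
        return [(name, is_valid_utf8(name)) for name in list_names_as_bytes(tmp)]

print(demo())
```

The application can always reopen the file with the bytes it was given; what it cannot do without a lossy step is turn those bytes into text.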
http://library.gnome.org/devel/glib/stable/glib-Character-Set-Conversion.htm...

I used to observe three different ways of displaying such filenames within gedit (including %xx and \xx escapes), but now it is consistent, probably because it switched to using the above function everywhere:

$ touch $'abc\xffz'
$ gedit

The Open dialog shows:
abc�z (invalid encoding)

When the file is open, the window title and the tab title show:
abc�z
and the same appears in the recent file list.

It has a bug: it appends " (invalid encoding)" even if the filename contains a correctly encoded U+FFFD character. Nautilus has the same behavior and the same bug, because this is a design bug of that function: it gives the caller no way to tell whether the conversion was successful.

A filename containing a newline is sometimes displayed on two lines, and sometimes with a U+000A character from a fallback font (hex character number in a box).

-- 
Marcin Kowalczyk
qrczak@knm.org.pl
http://qrnik.knm.org.pl/~qrczak/
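
The design bug described above can be sketched in Python. This is not GLib's implementation: to_display is a hypothetical helper that imitates a display conversion which replaces undecodable bytes with U+FFFD, and to_display_checked is an assumed alternative interface that also reports whether the name decoded cleanly.

```python
def to_display(raw: bytes) -> str:
    """Lossy conversion: every undecodable byte becomes U+FFFD."""
    return raw.decode("utf-8", errors="replace")

def to_display_checked(raw: bytes) -> tuple[str, bool]:
    """Same display text, plus a flag saying whether decoding was lossless."""
    try:
        return raw.decode("utf-8"), True
    except UnicodeDecodeError:
        return raw.decode("utf-8", errors="replace"), False

invalid = b"abc\xffz"                 # not valid UTF-8
valid = "abc\ufffdz".encode("utf-8")  # a correctly encoded U+FFFD character

# Both names yield the same display string, so a caller that sees only
# the string cannot tell which one was actually invalid.
print(to_display(invalid) == to_display(valid))
print(to_display_checked(invalid)[1], to_display_checked(valid)[1])
```

With only the string available, looking for U+FFFD in the result is the best a caller can do, which is presumably how the " (invalid encoding)" suffix ends up on valid names too.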