"Martin v. Löwis" <martin@v.loewis.de>:
I think the people defending the "Unix file names are just bytes" side often miss an important detail: displaying file names to the user, and allowing the user to enter file names.
The user interface is a real issue and needs to be addressed. It is separate from the OS interface, though.
A script that just needs to traverse a directory tree and look at files by certain criteria can easily do so without worrying about a text interpretation of the file names.
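As a rough illustration of that kind of script, here is a minimal Python sketch that walks a tree and selects files by size while treating every name as opaque bytes. The function name `find_large_files` and the size criterion are made up for the example; the only thing relied on is that `os.walk` yields `bytes` names when it is given a `bytes` path.

```python
import os

def find_large_files(root, min_size):
    """Return the bytes paths under root whose size is at least min_size."""
    matches = []
    # A bytes argument makes os.walk yield bytes names, so nothing is decoded.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_size >= min_size:
                    matches.append(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return matches
```

Calling it as `find_large_files(b"/some/dir", 1024)` never interprets the names as text, so it behaves the same whatever encoding they happen to be in.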
A single system often has file names that have been encoded with different schemes. Only today, I have had to deal with the JIS character table (<URL: http://i.msdn.microsoft.com/cc305152.932%28en-us,MSDN.10%29.gif>) -- you will notice that it doesn't have a backslash character. A coworker uses ISO-8859-1. I use UTF-8. A UTF-8 decoder, of course, will refuse to deal with some byte sequences.

My point is that the poor programmer cannot ignore the possibility of "funny" character sets. If Python tried to protect the programmer from that possibility, the result might be even more intractable: how do you act on a file with a non-UTF-8 filename if you are unable to express it as a text string?

Marko
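To make the encoding mismatch concrete, here is a small Python sketch. The file name bytes and the helper `try_decode` are invented for the example; it only shows that a name written by an ISO-8859-1 user is not valid UTF-8.

```python
# "café" as an ISO-8859-1 user would store it on disk (hypothetical name).
raw = b"caf\xe9"

def try_decode(name, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return name.decode(encoding)
    except UnicodeDecodeError:
        return None

latin1 = try_decode(raw, "iso-8859-1")  # decodes: every byte is valid ISO-8859-1
utf8 = try_decode(raw, "utf-8")         # None: 0xE9 starts a sequence that never completes
```

The same four bytes are a perfectly good name to the kernel either way; only the text interpretation fails.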