[Python-Dev] Unicode filenames

Walter Dörwald walter@livinglogic.de
Mon, 10 Feb 2003 18:54:03 +0100


Just van Rossum wrote:

> Walter Dörwald wrote:
> 
> [...]
> 
>>But when the system default encoding (i.e. sys.getdefaultencoding())
>>and the file system encoding are different, I'd say the filename has
>>to be transcoded from the system default encoding to the filesystem
>>encoding before it is used.
> 
> In most places (probably all, unless there's a bug)
> Py_FileSystemDefaultEncoding only has relevance for unicode strings:
> 8-bit strings are passed to the underlying calls unaltered.

That's exactly the problem. Strings passed to open() must always be
UTF-8 encoded, so open() is essentially a unicode API. Passing 8-bit
strings to that function should always go through that unicode API,
i.e. they should be treated like any other 8-bit string in a unicode
context. This means they must be decoded from the default encoding.
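
A minimal sketch of that transcoding step, written in present-day
Python 3 terms where an 8-bit string is a bytes object. The helper
name transcode_filename and its parameters are hypothetical, and
latin-1 merely stands in for whatever the default encoding happens
to be:

```python
import sys


def transcode_filename(name, default_encoding="latin-1", fs_encoding=None):
    """Hypothetical helper: return `name` as bytes in the filesystem encoding.

    An 8-bit (bytes) name is first decoded from the assumed default
    encoding, exactly like any other 8-bit string in a unicode context,
    and only then encoded for the filesystem.
    """
    if fs_encoding is None:
        # Ask the interpreter which encoding it uses for filenames.
        fs_encoding = sys.getfilesystemencoding()
    if isinstance(name, bytes):
        name = name.decode(default_encoding)
    return name.encode(fs_encoding)
```

With a latin-1 default encoding and a UTF-8 filesystem, the 8-bit
name "\xff" would reach the OS as the two bytes "\xc3\xbf" instead
of being passed through unaltered.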

> So the above
> traceback is the result of the _OS_ refusing to name a file "\xff",
> which is natural as this particular OS (OSX) uses UTF-8 as the native
> file system encoding and "\xff" is not valid UTF-8. (I was actually
> pleasantly surprised the OS actually _cares_ ;-)
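
The check the OS is effectively making can be sketched like this
(again in Python 3 terms; is_valid_utf8 is a hypothetical helper,
not something the OS or Python exposes under that name):

```python
def is_valid_utf8(raw):
    """Hypothetical helper: True if the bytes `raw` decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Here is_valid_utf8(b"\xff") is false, because 0xFF never occurs in
well-formed UTF-8, while the transcoded form b"\xc3\xbf" passes.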

Bye,
    Walter Dörwald