[Python-Dev] Unicode filenames

Guido van Rossum guido@python.org
Sun, 09 Feb 2003 08:20:30 -0500


> MacOSX fully supports unicode filenames (utf-8 is used throughout), and
> I'm tempted to set Py_FileSystemDefaultEncoding to "utf8" for OSX. Jack
> pointed me to a long thread about unicode filenames that took place on
> python-dev last year, but I can't deduce from it whether there are any
> disadvantages of setting Py_FileSystemDefaultEncoding.
> 
> Setting it seems to work wonderfully. However, I'm a bit surprised that
> os.listdir() doesn't return unicode strings. Is that because it would
> break too much code?

I think the reason is shallow: the special-casing on
unicode_file_names() only exists in the Windows branch of the code.
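
For illustration only, here is a minimal sketch of what "os.listdir()
returns unicode strings" looks like in practice. It assumes a current
Python 3 interpreter, not the 2.3-era code discussed in this thread;
there the result type follows the argument type on every platform.
The helper name listdir_types is hypothetical.

```python
import os
import sys
import tempfile


def listdir_types(path):
    """Return the set of type names that os.listdir() yields for path."""
    return {type(name).__name__ for name in os.listdir(path)}


with tempfile.TemporaryDirectory() as d:
    # Create one file so the directory listing is not empty.
    open(os.path.join(d, "example.txt"), "w").close()

    # A str path gives str entries; a bytes path gives bytes entries.
    str_types = listdir_types(d)
    bytes_types = listdir_types(os.fsencode(d))

    # The encoding used to translate between the two forms.
    print(sys.getfilesystemencoding(), str_types, bytes_types)
```

The encoding printed varies by platform, so no particular value should
be expected.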

> BTW, if I try to create a file with an 8-bit filename which is _not_
> valid utf-8, I get a strange error:
> 
>   >>> f = open("\xff", "w")
>   Traceback (most recent call last):
>     File "<stdin>", line 1, in ?
>   IOError: invalid mode: w
>   >>> 
> 
> This exception is thrown when errno is EINVAL, which apparently can also
> mean that the filename arg is bad. Not sure if we can fix this.

I think we should (maybe we already do) check the mode string more
carefully ourselves, and not rely on undocumented correlations between
error returns.
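
A rough sketch of that idea, assuming a simplified set of fopen-style
modes (a leading r, w or a, optionally followed by + and b). The names
check_mode and open_checked are hypothetical, and this is not the
actual file object code. Once the mode has been validated up front, a
later EINVAL from the OS can no longer be blamed on the mode string.

```python
def check_mode(mode):
    """Validate an fopen-style mode string before any OS call is made.

    Simplified assumption: one of 'r', 'w', 'a' first, then at most
    one '+' and at most one 'b', in either order.
    """
    if not mode or mode[0] not in ("r", "w", "a"):
        raise ValueError("invalid mode: %r" % (mode,))
    rest = mode[1:]
    for ch in rest:
        if ch not in ("+", "b"):
            raise ValueError("invalid mode: %r" % (mode,))
    if rest.count("+") > 1 or rest.count("b") > 1:
        raise ValueError("invalid mode: %r" % (mode,))
    return mode


def open_checked(name, mode="r"):
    """Open name after checking mode ourselves.

    Any error raised by open() from here on concerns the file name or
    the file system, not the mode string.
    """
    check_mode(mode)
    return open(name, mode)
```

With this in place, open_checked("\xff", "w") would still fail on a
file system that rejects the name, but the error would come from the
open call itself rather than being reported as a bad mode.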

--Guido van Rossum (home page: http://www.python.org/~guido/)