[Python-Dev] Unicode filenames

Just van Rossum just@letterror.com
Mon, 10 Feb 2003 19:52:15 +0100


Walter Dörwald wrote:

> >>But when the system default encoding (i.e. sys.getdefaultencoding())
> >>and the file system encoding are different, I'd say the filename has
> >>to be transcoded from the system default encoding to the filesystem
> >>encoding before it is used.
> >
> > In most places (probably all, unless there's a bug)
> > Py_FileSystemDefaultEncoding only has relevance for unicode strings:
> > 8-bit strings are passed to the underlying calls unaltered.
>
> That's exactly the problem. Strings passed to open() must always be
> UTF-8 encoded, so open() is essentially a unicode API.

(On platforms on which utf-8 is the file system encoding, yes.)

> Passing 8bit
> strings to that function should always go through that unicode API,
> i.e. they should be treated as any other 8bit string in the unicode
> context. This means it must be decoded from the default encoding.

Well, that's not how it currently works and changing that will break
code. I'm not sure about the rationale of the current semantics, but I
assume it has to do with compatibility with non-unicode-aware code.
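
To make the difference concrete, here is a rough sketch of the two
behaviours under discussion. It is only a model, not what the interpreter
actually does internally: the helper names are made up, the encodings
(latin-1 as the default encoding, utf-8 as the file system encoding) are
assumed example values, and it is written against a Python where 8-bit
strings are spelled as bytes.

```python
def current_semantics(filename, fs_encoding="utf-8"):
    """Model of the current behaviour (hypothetical helper).

    8-bit strings go to the underlying call unaltered; only unicode
    strings are encoded with the file system encoding.
    """
    if isinstance(filename, bytes):
        return filename
    return filename.encode(fs_encoding)


def proposed_semantics(filename, default_encoding="latin-1",
                       fs_encoding="utf-8"):
    """Model of the proposed behaviour (hypothetical helper).

    8-bit strings are first decoded from the default encoding, then
    encoded with the file system encoding like any unicode string.
    """
    if isinstance(filename, bytes):
        filename = filename.decode(default_encoding)
    return filename.encode(fs_encoding)


# An 8-bit filename containing e-acute in latin-1 (assumed example data).
raw = b"caf\xe9.txt"

print(current_semantics(raw))   # b'caf\xe9.txt' (unchanged)
print(proposed_semantics(raw))  # b'caf\xc3\xa9.txt' (transcoded)
```

With the same 8-bit string, the two models hand different bytes to the
file system, which is why code that already passes pre-encoded filenames
would break if the semantics changed.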

Just