[Python-Dev] Unicode strings as filenames
M.-A. Lemburg
mal@lemburg.com
Fri, 04 Jan 2002 18:06:53 +0100
Jack Jansen wrote:
>
> Off on a slight tangent:
> On Mac OS X the default 8-bit encoding is UTF8. os.listdir() handles
> this fine and so does open(). The OS does all the hard work for you:
> it knows that some mounted disks may be in other 8-bit encodings (such
> as MacRoman or MacJapanese for old mac disks, or probably latin-1 for NFS
> filesystems, or god-knows-what for SMB mounted disks) and handles the
> conversion.
That's good news.
> But in Python (unix-Python we're talking here, not MacPython),
> unicode(filename) fails, because site.encoding is "ascii".
>
> Would it be safe to set site.encoding to utf8 on Mac OS X by default?
I'd rather suggest using UTF-8 as the default encoding in the
subsystem layer I was talking about.
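As a rough illustration of that idea (a sketch only, written in
present-day Python 3 syntax): filenames get their own encoding, separate
from the interpreter-wide default. The names FILENAME_ENCODING and
decode_filename are hypothetical and not an existing API, and the choice
of UTF-8 is an assumption matching the Mac OS X case described above.

```python
# Hypothetical filename layer: it carries its own encoding, so the
# interpreter-wide default encoding can stay ASCII.
FILENAME_ENCODING = "utf-8"  # assumption: the encoding the OS uses for filenames


def decode_filename(raw, encoding=FILENAME_ENCODING):
    """Turn the raw bytes of a filename into a text string."""
    return raw.decode(encoding)


# A made-up filename as the OS might hand it over: UTF-8 encoded bytes.
raw_name = "r\u00e9sum\u00e9.txt".encode("utf-8")
decoded = decode_filename(raw_name)
print(repr(decoded))
```

Only code that deals with filenames goes through this layer; all other
string conversions keep using the stricter default.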
Making UTF-8 the default Python system encoding would have many other
consequences -- and you'd probably lose a great deal of portability,
since UTF-8 conversion will (nearly) always succeed, while ASCII
conversion can easily fail on other systems that use e.g. Latin-1 as
their native encoding.
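A small sketch of the portability problem (present-day Python 3 syntax,
with explicit encode calls standing in for the implicit conversions that
the default encoding controls; the helper name and the sample filename
are made up for illustration):

```python
def converts(text, encoding):
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


name = "r\u00e9sum\u00e9.txt"  # hypothetical filename with non-ASCII characters

utf8_ok = converts(name, "utf-8")    # what a UTF-8 default would do
ascii_ok = converts(name, "ascii")   # what the ASCII default does
print("utf-8:", utf8_ok)
print("ascii:", ascii_ok)
```

Code written and tested only where the UTF-8 conversion silently works
would hit the ASCII failure as soon as it ran on a system with the
stricter default.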
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/