[Python-Dev] Unicode strings as filenames

M.-A. Lemburg mal@lemburg.com
Fri, 04 Jan 2002 18:06:53 +0100


Jack Jansen wrote:
> 
> Off on a slight tangent:
> On Mac OS X the default 8-bit encoding is UTF8. os.listdir() handles
> this fine and so does open(). The OS does all the hard work for you:
> it knows that some mounted disks may be in other 8-bit encodings (such
> as MacRoman or MacJapanese for old mac disks, or probably latin-1 for NFS
> filesystems, or god-knows-what for SMB mounted disks) and handles the
> conversion.

That's good news.
 
> But in Python (unix-Python we're talking here, not MacPython),
> unicode(filename) fails, because site.encoding is "ascii".
> 
> Would it be safe to set site.encoding to utf8 on Mac OS X by default?

I'd rather suggest using UTF-8 as the default encoding in the
subsystem layer I was talking about.

Making UTF-8 the default Python system encoding would have many other
consequences, and you'd probably lose a great deal of portability,
since UTF-8 conversion will (nearly) always succeed, while ASCII
conversion can easily fail on other systems which use e.g. Latin-1
as their native encoding.
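
As a rough illustration of that asymmetry, here is a minimal sketch. It
uses present-day Python 3 syntax rather than the unicode() call discussed
above, and the filename and the try_decode() helper are made up for the
example; nothing here is taken from an actual implementation.

```python
# Hypothetical filename containing one non-ASCII character (e with acute).
name = "caf\u00e9.txt"

# The same name as it would be stored on a UTF-8 and on a Latin-1 system.
utf8_bytes = name.encode("utf-8")
latin1_bytes = name.encode("latin-1")


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# With an ASCII default, both byte strings are rejected outright,
# so the problem is noticed on every platform.
ascii_from_utf8 = try_decode(utf8_bytes, "ascii")
ascii_from_latin1 = try_decode(latin1_bytes, "ascii")

# With a UTF-8 default, the conversion works on the UTF-8 system
# but fails on the Latin-1 bytes.
utf8_from_utf8 = try_decode(utf8_bytes, "utf-8")
utf8_from_latin1 = try_decode(latin1_bytes, "utf-8")

# Going the other way, any text can be encoded as UTF-8, but not as ASCII.
try:
    name.encode("ascii")
    ascii_encode_ok = True
except UnicodeEncodeError:
    ascii_encode_ok = False
```

Code written and tested only on a system where the UTF-8 path succeeds
would give no hint that the same call fails elsewhere, which is the
portability concern.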

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/