Unicode filenames and os.path.* functions

Jason Orendorff jason at jorendorff.com
Fri Jan 4 05:12:24 EST 2002


>     I'm using Python 2.1.1 and have to work with unicode filenames on
> Windows 2000. The functions in os.path like 
> os.path.exists(Unicode-Filename), os.path.getsize(...), etc. 
> don't support this 
> ("UnicodeError: ASCII encoding error: ordinal not in range(128)").
> What is the reason for that? [...]

The reason: Python has to convert your nice Unicode string object
into a yucky byte array, because that's what the Windows _stati64()
function requires.  Python should use the MBCS encoding to create
that byte array; that's what _stati64() is expecting.  But instead
Python 2.1 uses the default ASCII encoding, and so it complains
that your string isn't all ASCII characters.  (sigh)
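
To see the failure in isolation, here is a minimal sketch. It is written
in current Python 3 syntax rather than the 2.1 discussed above, the
filename is made up, and cp1252 is only an assumed stand-in for MBCS on
machines that are not Windows (the 'mbcs' codec exists only there).

```python
import sys

# Hypothetical filename containing non-ASCII characters (e with acute accent)
filename = u"r\u00e9sum\u00e9.txt"

# Encoding with ASCII fails, which is the kind of error the question quotes
ascii_failed = False
try:
    filename.encode("ascii")
except UnicodeError as exc:
    ascii_failed = True
    print("ASCII encoding failed: %s" % exc)

# 'mbcs' is available only on Windows; cp1252 is an assumption elsewhere
codec = "mbcs" if sys.platform == "win32" else "cp1252"
encoded = filename.encode(codec)
print(repr(encoded))
```

The second encode succeeds because the chosen code page has a byte for
each character in the name, which ASCII does not.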

Python 2.2 fixes this.

For 2.1, you can make it work by doing the MBCS encoding yourself:

  # Workaround for Python 2.1: encode the name yourself before calling os.path
  import os

  mbcsFilename = unicodeFilename.encode('mbcs')  # 'mbcs' codec is Windows-only
  if os.path.isfile(mbcsFilename):
      print os.path.getsize(mbcsFilename)

I didn't test that but it should work.  :-)

## Jason Orendorff    http://www.jorendorff.com/





