[Python-Dev] Unicode strings as filenames

Neil Hodgson nhodgson@bigpond.net.au
Fri, 4 Jan 2002 09:51:03 +1100


Martin:

>   In any case, passing Unicode objects to open() works just fine, at least
>   as long as they can be encoded in the ANSI code page. If you want to
>   open a Chinese file name on a Russian Windows installation, you lose.

   I want to be able to open all files on my English W2K install, and with
many applications I can, even if some files have Chinese names and some have
Russian. The big advance W2K made over NT was to have only one real version
of the OS instead of multiple language versions. There is a system default
language as well as local defaults, but with just a few clicks my machine can
be used as a Japanese machine, although as the keyboard keys don't grow
Japanese characters, it is a bit harder to use. You can still buy localised
versions of W2K and XP, but they differ in packaging and defaults - the
underlying code is identical, which was not the case for NT or 9x.

   Locales are a really poor choice for people who need to operate in
multiple languages, and much software is moving to allow concurrent use of
multiple languages through Unicode. The term
'multinationalization' (m18n) is sometimes used in Japan to talk about
systems that try to avoid restrictions on character set and language.

> >    There may also be techniques for doing this on Windows 9x as the file
> > system stores Unicode file names but I have never looked into this.
>
> To my knowledge, VFAT32 doesn't - only NTFS does (which is not
> available on W9x).

   I have a file called u"C:\\z\u0439\u0446.html" on my W2K FAT partition
which displays correctly in the explorer and can be opened in, for example,
notepad.
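
   To make the name concrete, here is a minimal sketch that spells out the
two non-ASCII characters and checks which code pages can represent them. It
uses present-day Python 3 syntax rather than the interpreter discussed in
this thread. The helper can_encode is hypothetical, and cp1252 and cp1251 are
assumed here to stand in for the English and Russian ANSI code pages.

```python
import unicodedata

# The file name from the message above (Python 3 str literals are Unicode).
name = "C:\\z\u0439\u0446.html"


def can_encode(text, codepage):
    """Hypothetical helper: True if text is representable in codepage."""
    try:
        text.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


# Official Unicode names of the characters outside ASCII.
non_ascii = [unicodedata.name(ch) for ch in name if ord(ch) > 127]
print(non_ascii)

# Assumed code pages: cp1252 (Western European), cp1251 (Cyrillic).
print(can_encode(name, "cp1252"))
print(can_encode(name, "cp1251"))
```

   Under those assumptions the name is representable only in the Cyrillic
code page, which is the situation Martin describes: open() can only succeed
through the ANSI interface when the system code page happens to cover the
characters in the name.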

   This leads to the interesting situation of being able to see a file using
glob but not then use it:
>>> import glob
>>> glob.glob("C:\\*.html")
['C:\\l2.html', 'C:\\list.html', 'C:\\m4.html', 'C:\\x.html',
'C:\\z??.html']
>>> for i in glob.glob("C:\\*.html"):
...    f = open(i)
...
Traceback (most recent call last):
  File "<stdin>", line 2, in ?
IOError: [Errno 22] Invalid argument: 'C:\\z??.html'
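
   The '??' in the glob result is consistent with the Unicode name being
converted lossily to the ANSI code page before Python sees it. The sketch
below only imitates that effect in present-day Python 3 syntax: the real
conversion happens inside Windows, not in str.encode, and cp1252 is an
assumed code page for an English install.

```python
# The real name stored on disk, as given earlier in the message.
real_name = "z\u0439\u0446.html"

# Characters missing from the code page are replaced with '?'.
ansi_bytes = real_name.encode("cp1252", errors="replace")
seen_by_glob = ansi_bytes.decode("cp1252")
print(seen_by_glob)  # z??.html

# The substitution is one-way: the original name cannot be recovered,
# so the string returned by glob no longer identifies the file.
print(seen_by_glob == real_name)  # False
```

   Since '?' is not a legal character in a Windows file name, handing the
substituted name back to open() would plausibly be rejected outright, which
fits the "Invalid argument" error above.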

   Neil