[Python-Dev] Unicode strings as filenames
Martin v. Loewis
martin@v.loewis.de
Sun, 6 Jan 2002 01:33:08 +0100
> This change works for me on Windows 2000 and allows access to all files
> no matter what the current code page is set to. On Windows 9x (not yet
> tested), the _wfopen call should fail causing a fallback to fopen. Possibly
> the OS should be detected instead and _wfopen not attempted on 9x.
Now that you have that change, please try to extend it to
posixmodule.c. This is where I gave up. Notice that, by changing
Py_FileSystemDefaultEncoding and open() alone, you have worsened the
situation: os.stat will now fail on files with non-ASCII names for
which it works under the mbcs encoding, because Windows won't find the
file (correct me if I'm wrong).
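To illustrate the mismatch, here is a minimal sketch (not the patch under
discussion). The same Unicode name encodes to different byte strings under
UTF-8 and under a Windows code page. cp1252 is an assumed stand-in for mbcs,
since the mbcs codec exists only on Windows and depends on the active code
page; the file name is made up.

```python
# Hypothetical file name containing one non-ASCII character.
name = "caf\u00e9.txt"

# Bytes that open() would hand down if the file system encoding were UTF-8.
as_utf8 = name.encode("utf-8")

# Bytes that a call still relying on the code page would hand down.
# cp1252 is an assumed stand-in for mbcs.
as_codepage = name.encode("cp1252")

print(as_utf8)      # b'caf\xc3\xa9.txt'
print(as_codepage)  # b'caf\xe9.txt'

# A narrow (char*) API interprets its argument in the code page,
# so the UTF-8 bytes end up naming a different file.
misread = as_utf8.decode("cp1252")
print(misread == name)  # False
```

If one function encodes with UTF-8 and another expects code-page bytes, the
two no longer agree on which file a given name refers to.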
> On 9x, mbcs may be a better choice of encoding although it may also
> be possible to ask the file system to find the wide character file
> name and return the mangled short name that can then be used by
> fopen.
It is not just 9x: if you have ten (*) different APIs to open a file, ten
different APIs to stat a file, and so on, and have to select some of
them at compile time and some of them at run time, it gets messy very
quickly.
(*) I'd expect that other systems may also have proprietary system
calls to do these things, using either wchar_t* or a proprietary
Unicode type.
> The best approach to me seems to be to make
> Py_FileSystemDefaultEncoding settable by the user, at least allowing
> the choice between 'utf-8' and 'mbcs' with a default of 'utf-8' on
> NT and 'mbcs' on 9x.
By the user, or by the application? How can the application make a
more educated guess than Python proper? Alternatively, how can the
user (or her administrator) know what value to put in there?
On Windows, letting either of them set it is probably not a good idea;
if the file system default encoding is used in the future, fixing it at
mbcs is the best option I can think of.
> Please criticise any stylistic or correctness issues in the code
> as it is my first modification to the Python sources.
The code looks fine. I'd encourage you to continue on that topic; just
expect that it will need many more rounds for completion.
Regards,
Martin