[Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API

Victor Stinner victor.stinner at haypocalc.com
Tue Oct 25 22:18:13 CEST 2011


On Tuesday 25 October 2011 00:57:42, Victor Stinner wrote:
> I propose to raise Unicode errors if a filename cannot be decoded on
> Windows, instead of creating bogus filenames with question marks.
> Because this change is incompatible with Python 3.2, even though such
> filenames are unusable and I consider the problem a (Python?) bug, I
> would like your opinion on such a change before working on a patch.

Most people like the idea, so I wrote a patch and attached it to:

   http://bugs.python.org/issue13247

The patch only changes os.getcwdb() and os.listdir().

> We might use PEP 383 to store undecodable bytes as surrogates
> (U+DC80-U+DCFF). But the situation is the opposite of the situation on
> UNIX: on Windows, the problem lies more in encoding (text->bytes) than in decoding
> (bytes->text). On UNIX, problems occur when the system is misconfigured
> (e.g. wrong locale encoding). On Windows, problems occur when your
> application uses the old (ANSI) API, whereas your filesystem is fully
> Unicode compliant and you created Unicode filenames with a program using
> the new (Windows) API.
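
To make the quoted distinction concrete, here is a minimal sketch of the two
behaviours. It is only an illustration, not code from the patch: it assumes
cp1252 as a stand-in for the ANSI code page (the real mbcs codec exists only
on Windows and depends on the system code page), and the helper names
encode_lossy() and encode_strict() are hypothetical.

```python
# Assumption: cp1252 stands in for the Windows ANSI code page.
ANSI_CODE_PAGE = "cp1252"


def encode_lossy(name):
    """Roughly mimic the old behaviour: unencodable characters become '?'."""
    return name.encode(ANSI_CODE_PAGE, errors="replace")


def encode_strict(name):
    """Roughly mimic the proposed behaviour: raise UnicodeEncodeError."""
    return name.encode(ANSI_CODE_PAGE, errors="strict")


# A filename created through the Unicode API, not representable in cp1252.
unicode_name = "\u65e5\u672c.txt"

lossy = encode_lossy(unicode_name)  # bogus name made of question marks

try:
    encode_strict(unicode_name)
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# PEP 383 on UNIX: undecodable bytes are kept as lone surrogates
# (U+DC80-U+DCFF) and restored when encoding back to bytes.
raw = b"caf\xe9.txt"  # Latin-1 bytes, invalid as UTF-8
text = raw.decode("utf-8", errors="surrogateescape")
round_trip = text.encode("utf-8", errors="surrogateescape")
```

The lossy result no longer names the file that exists on disk, which is why
raising an error seems preferable. The surrogateescape round trip works
because the problem on UNIX is on the decoding side; it does not help when
the text cannot be encoded in the first place.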

I only changed functions returning filenames, so os.mkdir(), for example, is
unchanged.

We may also patch the other functions to simplify the source code.

Victor

