[issue15478] UnicodeDecodeError on OSError
report at bugs.python.org
Sat Jul 28 14:49:06 CEST 2012
New submission from STINNER Victor <victor.stinner at gmail.com>:
On Windows, if an OS function fails, the filename is a bytes string, and that filename cannot be decoded, Python raises a UnicodeDecodeError instead of an OSError. The problem is that Python decodes the filename to fill the OSError.filename field. See issue #15441 for the initial report.
There are different options to solve this issue:
- always keep the filename parameter unchanged, so OSError.filename can be a str or a bytes string, depending on the input parameter
- try to decode the filename from the filesystem encoding, or keep the filename unchanged: OSError.filename is only a bytes string if the filename cannot be decoded
- don't fill OSError.filename (= None) if the filename cannot be decoded
- use "surrogateescape", "replace" or "backslashreplace" error handler to decode the filename
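To make the four options concrete, here is a minimal sketch in pure Python of the value each one would store in OSError.filename. The helper names and the `encoding` parameter are hypothetical and only for illustration; the real change would be in the C code that builds the exception, and UTF-8 stands in here for whatever the filesystem encoding is.

```python
# Hypothetical helpers: each returns what OSError.filename would hold.

def keep_unchanged(filename):
    # Option 1: str stays str, bytes stays bytes.
    return filename

def decode_or_keep(filename, encoding="utf-8"):
    # Option 2: decode when possible, else keep the raw bytes.
    if isinstance(filename, bytes):
        try:
            return filename.decode(encoding)
        except UnicodeDecodeError:
            return filename
    return filename

def decode_or_none(filename, encoding="utf-8"):
    # Option 3: leave the field empty (None) if decoding fails.
    if isinstance(filename, bytes):
        try:
            return filename.decode(encoding)
        except UnicodeDecodeError:
            return None
    return filename

def decode_with_handler(filename, encoding="utf-8", errors="surrogateescape"):
    # Option 4: never fail, let an error handler deal with bad bytes.
    # Note: "backslashreplace" only works for decoding on Python 3.5+.
    if isinstance(filename, bytes):
        return filename.decode(encoding, errors)
    return filename

undecodable = b"\xff.txt"  # not valid UTF-8
results = {
    "unchanged": keep_unchanged(undecodable),
    "decode_or_keep": decode_or_keep(undecodable),
    "decode_or_none": decode_or_none(undecodable),
    "surrogateescape": decode_with_handler(undecodable),
    "replace": decode_with_handler(undecodable, errors="replace"),
}
```

With options 1 and 2 the caller may receive bytes; with option 3 the information is gone; with option 4 the result is always a str, but only "surrogateescape" can be converted back to the original bytes.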
This issue is specific to Windows: on other platforms, the filename is decoded using the "surrogateescape" error handler, so decoding the filename cannot fail.
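A small sketch of why "surrogateescape" cannot fail: each undecodable byte is mapped to a lone surrogate code point, and encoding with the same handler restores the original bytes. UTF-8 is used explicitly here as an assumption, so the example does not depend on the local filesystem encoding.

```python
raw = b"caf\xe9.txt"  # Latin-1 encoded name, not valid UTF-8

# A strict decode fails on the 0xE9 byte...
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# ...while surrogateescape maps it to U+DCE9 and never raises.
decoded = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler gives back the original bytes.
roundtrip = decoded.encode("utf-8", "surrogateescape")
```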
I don't know if OSError.filename is only used to display more information to the user, or if it is also used to perform another operation on the file (e.g. os.chmod).
I like solutions that keep the filename unchanged, because they do not lose information, and the user can decide how to handle the undecodable filename.
I don't like the option of trying to decode the filename and keeping it unchanged if decoding fails, because applications will work in most cases, but "crash" when someone comes with an unusual code page, a special USB key, or a filename with a non-ASCII character.
So the best option is maybe to always keep the bytes filename unchanged.
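As a rough illustration of what keeping the bytes filename means for user code, the sketch below builds an OSError by hand with a bytes filename and formats it for display. The `display_name` helper is hypothetical, and decoding with "backslashreplace" assumes Python 3.5 or later.

```python
import errno

def display_name(filename):
    # Hypothetical helper: make a printable str from str or bytes.
    if isinstance(filename, bytes):
        # Escape every non-ASCII byte as \xNN instead of failing.
        return filename.decode("ascii", "backslashreplace")
    return filename

# Build the exception by hand; the bytes filename is stored as given.
exc = OSError(errno.ENOENT, "No such file or directory", b"\xff.txt")

# The raw bytes stay available for further os.* calls,
# and the user chooses how to show them.
shown = display_name(exc.filename)
```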
Such a change can no longer be made in Python 3.3: it is too late to test it correctly.
components: Unicode, Windows
nosy: ezio.melotti, flox, haypo, ishimoto, loewis, tim.golden
title: UnicodeDecodeError on OSError
versions: Python 3.4
Python tracker <report at bugs.python.org>