[issue9820] Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames

STINNER Victor report at bugs.python.org
Fri Sep 10 12:42:59 CEST 2010


New submission from STINNER Victor <victor.stinner at haypocalc.com>:

In Python 3.2, the mbcs encoding (the default filesystem encoding on Windows) is now strict: it raises an error on unencodable characters and undecodable bytes. But os.listdir(b'.') still replaces unencodable characters with b'?' instead of raising an error.

Example:

>>> import os
>>> os.mkdir('listdir')
>>> open('listdir\\xxx-\u0363', 'w').close()
>>> filename = os.listdir(b'listdir')[0]
>>> filename
b'xxx-?'
>>> open(filename, 'r').close()
IOError: [Errno 22] Invalid argument: 'xxx-?'
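
The difference between strict and lossy encoding can be reproduced on any platform. The sketch below assumes cp1252 as a stand-in for the Windows ANSI code page (the mbcs codec itself only exists on Windows), and uses Python's "replace" error handler only to imitate the b'?' substitution seen above:

```python
# Sketch only: cp1252 is an assumed stand-in for the ANSI code page (mbcs).
name = "xxx-\u0363"  # U+0363 has no mapping in cp1252

strict_error = None
try:
    # Strict encoding (the default error handler) refuses the character.
    name.encode("cp1252")
except UnicodeEncodeError as exc:
    strict_error = exc

# Lossy encoding silently substitutes b'?', so the original name is lost.
lossy = name.encode("cp1252", "replace")

print(type(strict_error).__name__)
print(lossy)
```

The lossy result no longer identifies the file on disk, which is why opening it fails.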

os.listdir(b'listdir') should raise an error, rather than ignoring the filename or replacing unencodable characters with b'?'.

I think that we should list the directory using the wide character API (FindFirstFileW) instead of the ANSI API (FindFirstFileA), and encode each filename using PyUnicode_EncodeFSDefault() if the directory name is a bytes object.
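
The proposed behaviour can be approximated in pure Python, as a rough sketch rather than the actual C change. The helper names encode_names_strict and listdir_bytes_strict are hypothetical, and the explicit encoding parameter is an assumption that stands in for the filesystem encoding used by PyUnicode_EncodeFSDefault():

```python
import os


def encode_names_strict(names, encoding):
    """Encode each str filename to bytes, raising on unencodable characters."""
    # "strict" raises UnicodeEncodeError instead of substituting b'?'.
    return [name.encode(encoding, "strict") for name in names]


def listdir_bytes_strict(path, encoding):
    """List a directory given as bytes (or str) and return names as bytes.

    Hypothetical helper: the directory is listed with a str path, which on
    Windows makes CPython use the wide character API, and the resulting
    names are then encoded strictly.
    """
    names = os.listdir(os.fsdecode(path))
    return encode_names_strict(names, encoding)
```

With this approach a name such as 'xxx-\u0363' produces a UnicodeEncodeError for an encoding like cp1252, so the caller never receives a byte string that does not correspond to a real file.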

----------
components: Library (Lib), Unicode, Windows
messages: 115995
nosy: haypo, loewis
priority: normal
severity: normal
status: open
title: Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames
versions: Python 3.2

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9820>
_______________________________________

