[Python-Dev] Python-3.0, unicode, and os.environ

Nick Coghlan ncoghlan at gmail.com
Sun Dec 7 22:49:41 CET 2008

Terry Reedy wrote:
> Toshio Kuratomi wrote:
>>   - If this is true, a definition of os.listdir(<type 'str'>) that would
>> better meet programmer expectation would be: "Give me all files in a
>> directory with the output as str type".  The definition of
>> os.listdir(<type 'bytes'>) would be "Give me all files in a directory
>> with the output as bytes type".  Raising an exception when the filenames
>> are undecodable is perfectly reasonable in this situation.
> Your examples (snipped) pretty well convince me that there is a use case
> for raising exceptions.  We should move beyond arguing over which one
> way is right.  I think there should be a second argument
> 'ignorebad=False' to ignore undecodable files rather than raise the
> exception (or 'strict=True' to stop and raise exception on non-decodable
> names -- then code is 'if strict: raise ...').  I believe other
> functions have a similar parameter.

If we were going to do anything like that for os.listdir() and other
filesystem APIs (like glob) that return multiple paths, we'd probably be
best advised to just have a normal Unicode 'errors' parameter which allowed:

'strict' - raise an exception for malformed binary data
'replace' - insert '?' or some other symbol in place of malformed
  binary data
'ignore' - simply leave out the malformed binary data
'skip' - run the underlying codec in strict mode, but skip over any
  items which raise UnicodeDecodeError (default/current Py3k behaviour)
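
As a rough sketch of how those four modes might behave, here is a
helper that decodes a list of bytes filenames. The names decode_names
and listdir_decoded, the 'encoding' argument and its UTF-8 default are
all hypothetical illustrations, not an existing or proposed os API.
Note that Python's built-in 'replace' handler inserts U+FFFD rather
than '?' when decoding.

```python
import os


def decode_names(raw_names, encoding="utf-8", errors="skip"):
    """Decode bytes filenames to str (hypothetical helper, not an os API).

    'strict', 'replace' and 'ignore' are passed straight to bytes.decode().
    'skip' decodes each name strictly and drops the names that fail.
    """
    if errors != "skip":
        # Standard codec error handlers operate on each name individually.
        return [name.decode(encoding, errors) for name in raw_names]

    decoded = []
    for name in raw_names:
        try:
            decoded.append(name.decode(encoding, "strict"))
        except UnicodeDecodeError:
            # Leave the undecodable item out of the result entirely.
            continue
    return decoded


def listdir_decoded(path, encoding="utf-8", errors="skip"):
    """List a directory as bytes, then decode with the chosen policy."""
    return decode_names(os.listdir(os.fsencode(path)), encoding, errors)
```

With the names [b"ok.txt", b"caf\xe9.txt"] and UTF-8 as the assumed
encoding, 'skip' returns only "ok.txt", 'ignore' returns "ok.txt" and
"caf.txt", 'replace' returns "ok.txt" and "caf\ufffd.txt", and 'strict'
raises UnicodeDecodeError.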

Obviously, 'skip' doesn't make any sense for APIs like getcwd() that
return a single value - a case could be made for those defaulting to
either 'replace' or 'strict'.
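
For the single-value case, a minimal sketch might simply reject 'skip'
and hand the other modes to the codec. The function name
getcwd_decoded, the 'encoding' argument and the 'strict' default are
hypothetical choices made for illustration only; os.getcwdb() is the
existing call that returns the current directory as bytes.

```python
import os


def getcwd_decoded(encoding="utf-8", errors="strict"):
    """Return the current directory as str (hypothetical wrapper).

    There is only one value, so there is nothing to skip over:
    'skip' is rejected rather than silently returning nothing.
    """
    if errors == "skip":
        raise ValueError("'skip' is meaningless for a single-value API")
    # 'strict', 'replace' and 'ignore' are standard codec error handlers.
    return os.getcwdb().decode(encoding, errors)
```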


Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
