os.listdir, gets unicode, returns unicode... USUALLY?!?!?

gabor gabor at nekomancer.net
Sun Nov 19 21:06:51 CET 2006

Martin v. Löwis wrote:
> gabor schrieb:
>> depends on the application. in the one where it happened i would just
>> display an error message, and tell the admins to check the
>> filesystem-encoding.
>> (in other ones, where it's not critical to get the correct name, i would
>> probably just convert the text to unicode using the "replace" behavior)
>> what about using flags similar to how unicode() works? strict, ignore,
>> replace and maybe keep-as-bytestring.
>> like:
>> os.listdir(dirname,'strict')
>> i know it's not the most elegant, but it would solve most of the
>> use-cases imho (at least my use-cases).
> Of course, it's possible to implement this on top of the existing
> listdir operation.
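> import os
> import sys
>
> def failing_listdir(dirname, mode):
>   result = os.listdir(dirname)
>   if mode != 'strict': return result
>   for r in result:
>     # Python 2: a str entry is a byte-string that could not be decoded
>     if isinstance(r, str):
>       raise UnicodeDecodeError(sys.getfilesystemencoding() or 'ascii',
>                                r, 0, len(r), 'undecodable file name')
>   return result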

yes, sure... but then... it's also possible to implement it on top of a
raise-when-error version :)

so, what do you think, how should this issue be solved?

currently i see 2 ways:

1. simply fix the documentation, and state that if the file-name cannot 
be decoded into unicode, then it's returned as byte-string. but that 
also means, that the typical usage of:

[os.path.join(path,n) for n in os.listdir(path)]

will not work: joining a unicode path with a byte-string name that is
not pure ascii raises UnicodeDecodeError.

2. add support for some unicode-decoding flags, like i wrote before

3. some other solution.
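
for option 2, a rough sketch of what i mean. this is not an existing
os.listdir signature: listdir_decoded, decode_name and the 'keep' flag
are made-up names, and the code assumes a python where byte-strings are
the bytes type and os.fsencode is available. it lists the directory as
raw bytes and then decodes every name with the requested error handling.

```python
import os
import sys


def decode_name(raw, encoding, errors='strict'):
    """Decode one raw file name (bytes) according to the errors flag.

    'strict', 'ignore' and 'replace' are handed to bytes.decode().
    'keep' is a hypothetical extra flag: a name that cannot be decoded
    is returned unchanged as bytes.
    """
    if errors == 'keep':
        try:
            return raw.decode(encoding, 'strict')
        except UnicodeDecodeError:
            return raw
    return raw.decode(encoding, errors)


def listdir_decoded(dirname, errors='strict', encoding=None):
    """List dirname and decode each entry using the given errors flag."""
    if encoding is None:
        # assumption: the filesystem encoding is the right one to try
        encoding = sys.getfilesystemencoding()
    # a bytes path makes os.listdir return the raw, undecoded names
    raw_names = os.listdir(os.fsencode(dirname))
    return [decode_name(name, encoding, errors) for name in raw_names]
```

with 'strict' the caller gets an exception and can show the error
message, with 'replace' a displayable (but lossy) name. with 'keep' the
result can mix both types, so the caller still has to check each entry
before passing it to os.path.join.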


