LC_ALL and os.listdir()

Duncan Booth duncan.booth at invalid.invalid
Thu Feb 24 05:57:37 EST 2005


Martin v. Löwis wrote:

> Serge Orlov wrote:
>> Shouldn't os.path.join do that? If you pass a unicode string
>> and a byte string it currently tries to convert bytes to characters
>> but it makes more sense to convert the unicode string to bytes
>> and return two byte strings concatenated.
> 
> Sounds reasonable. OTOH, this would be the only (one of a very
> few?) occasion where Python combines byte+unicode => byte.
> Furthermore, it might be that the conversion of the Unicode
> string to a file name fails as well.
> 
> That said, I still think it is a good idea, so contributions
> are welcome.
> 
It would probably mess up those systems where filenames really are unicode 
strings and not byte sequences.

Windows (when using NTFS) stores all the filenames in unicode, and Python 
uses the unicode API to implement listdir (when given a unicode path). This 
means that the filename never gets encoded to a byte string, either by the 
OS or by Python. If you use a byte string path then the filename gets encoded 
by Windows and Python just returns what it is given.
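
The "path type in, same type out" behaviour of listdir can be sketched as 
below. This is only an illustration, and it assumes a present-day Python 3 
interpreter, where the two types are str and bytes rather than the unicode 
and str of the Python 2.x discussed in this thread. The helper name 
listdir_types and the file name example.txt are made up for the example.

```python
import os
import tempfile


def listdir_types(directory):
    """Hypothetical helper: list a directory via a text path and a bytes path."""
    text_names = os.listdir(directory)               # str path -> str names
    byte_names = os.listdir(os.fsencode(directory))  # bytes path -> bytes names
    return text_names, byte_names


# Use a throwaway directory so the sketch does not depend on local files.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "example.txt"), "w").close()
    text_names, byte_names = listdir_types(tmp)
    print(text_names)
    print(byte_names)
```

Only the bytes-path call has to encode the name, which is the step that can 
lose information when a filename has no representation in the encoding in use.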


