Guido van Rossum wrote:
Ah, sigh. I didn't know that os.listdir() behaves differently when the argument is Unicode. Does os.listdir(".") really behave differently than os.listdir(u".")? Bah! I don't think that's a very good design (although I see where it comes from). Promoting only those entries that need it seems the right solution.
Unfortunately, this solution is hard to implement (I don't know whether it can be implemented correctly at all; at least on Windows, I see no way to implement it efficiently). Here are a number of problems/questions:

- On Windows, should listdir use the narrow or the wide API? Obviously the wide API, since it is not Python which returns the question marks, but the Windows API.

- But then, the wide API gives all results as Unicode. If you want to promote only those entries that need it, it really means that you only want to "demote" those that don't need it. But how can you tell whether an entry needs it? There is no API to find out. You could declare that anything with characters >128 needs it, but that would be an incompatible change: if a character >128 in the system code page is in a file name, listdir currently returns it in the system code page. It would then return a Unicode string. Applications relying on the old behaviour would break.

- On Unix, all file names come out as byte strings. Again, how do you know which ones to promote, and using what encoding? Python currently guesses an encoding, but that may or may not be the one intended for the file name.

So the general "Bah!" doesn't really help much: when it comes to a specific algorithm to implement, the options are scarce.

Regards,
Martin
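The Unix problem above can be illustrated with a minimal sketch. It is not how listdir is implemented; it is written in present-day Python 3 (where byte strings and text are separate types) purely to model the "promote only what needs it" rule. The helper name `promote_if_needed`, the sample file names, and the choice to treat every byte of 128 or above as "needs promotion" are all assumptions made for the illustration.

```python
def promote_if_needed(raw_names, assumed_encoding):
    """Hypothetical helper: keep pure-ASCII names as bytes, decode the rest.

    `assumed_encoding` stands in for the encoding Python would have to guess.
    """
    result = []
    for raw in raw_names:
        if all(byte < 128 for byte in raw):
            # Pure ASCII: nothing to promote, return the byte string as-is.
            result.append(raw)
        else:
            try:
                result.append(raw.decode(assumed_encoding))
            except UnicodeDecodeError:
                # The guessed encoding cannot decode this name at all.
                result.append(raw)
    return result


# The same on-disk bytes, read under two different guesses.
raw_names = [b"readme.txt", b"caf\xe9.txt"]
as_latin1 = promote_if_needed(raw_names, "latin-1")  # second name is decoded
as_utf8 = promote_if_needed(raw_names, "utf-8")      # second name stays bytes

# A UTF-8 encoded name decoded with the wrong guess: no error is raised,
# but the resulting text is not the name the user intended.
misread = promote_if_needed([b"caf\xc3\xa9.txt"], "latin-1")
```

In this sketch the result for one and the same byte sequence depends entirely on the guessed encoding, and a wrong guess can succeed silently, which is the difficulty described in the last list item.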