[Guido asks good questions about how Windows deals w/ Unicode filenames, last Thursday, but gets no answers]
... I'd like to solve this problem, but I have some questions: what *IS* the encoding used for filenames on Windows? This may differ per Windows version; perhaps it can differ per drive letter? Or per application or per thread? On Windows NT, filenames are supposed to be Unicode. (I suppose also on Windows 2000?) How do I open a file with a given Unicode string for its name, in a C program? I suppose there's a Win32 API call for that which has a Unicode variant.
On Windows 95/98, the Unicode variants of the Win32 API calls don't exist. So what is the poor Python runtime to do there?
Can Japanese people use Japanese characters in filenames on Windows 95/98? Let's assume they can. Since the filesystem isn't Unicode aware, the filenames must be encoded. Which encoding is used? Let's assume they use Microsoft's multibyte encoding. If they put such a file on a floppy and ship it to Linköping, what will Fredrik see as the filename? (I.e., is the encoding fixed by the disk volume, or by the operating system?)
Once we have a few answers here, we can solve the problem. Note that sometimes we'll have to refuse a Unicode filename because there's no mapping for some of the characters it contains in the filename encoding used.
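The "refuse a Unicode filename" case can be sketched in a few lines of Python. This is only an illustration, not what the runtime does: the helper name `encode_filename` is made up here, and the choice of `cp1252` as the default filename encoding is an assumption (the real encoding is exactly what the questions above are trying to pin down).

```python
def encode_filename(name, encoding="cp1252"):
    """Encode a Unicode filename, refusing it if any character has no mapping.

    `encoding` stands in for whatever the platform's filename encoding
    turns out to be; cp1252 is only an assumed default for this sketch.
    """
    try:
        # str.encode() is strict by default: unmappable characters raise.
        return name.encode(encoding)
    except UnicodeEncodeError as exc:
        bad = name[exc.start:exc.end]
        raise ValueError(
            f"cannot represent {bad!r} in filename encoding {encoding}"
        ) from exc


if __name__ == "__main__":
    print(encode_filename("Link\u00f6ping.txt"))
    try:
        encode_filename("\u65e5\u672c\u8a9e.txt")  # Japanese name, Western code page
    except ValueError as err:
        print("refused:", err)
```

The same Japanese name goes through when the assumed encoding is `cp932` (Python's codec for Microsoft's Japanese multibyte code page), so whether a name is refused depends entirely on which encoding is in effect.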
I just thought I'd repeat the questions <wink>. However, I don't think you'll really want the answers -- Windows is a legacy-encrusted mess, and there are always many ways to get a thing done in the end. For example ...
Question: how does Fredrik create a file with a Euro character (u'\u20ac') in its name?
This particular one is shallower than you were hoping: in many of the TrueType fonts (e.g., Courier New but not Courier), Windows extended its Latin-1 encoding by mapping the Euro symbol to the "control character" 0x80. So I can get a Euro symbol into a file name just by typing Alt+0+1+2+8. This is true even on US Win98 (which has no visible Unicode support) -- but was not supported in US Win95.
i've-been-tracking-down-what-appears-to-be-a-hw-bug-on-a-japanese-laptop-
at-work-so-can-verify-ms-sure-got-japanese-characters-into-the-
filenames-somehow-but-doubt-it's-via-unicode-ly y'rs - tim
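The 0x80 mapping is easy to check with Python's codecs. A small sketch, assuming the extended Latin-1 described above corresponds to the codec Python calls `cp1252`; the function name `euro_mapping` is invented for the illustration.

```python
EURO = "\u20ac"


def euro_mapping():
    """Show how byte 0x80 and the Euro sign relate under two encodings."""
    result = {
        # In cp1252 the byte 0x80 is the Euro sign.
        "cp1252_decode_0x80": b"\x80".decode("cp1252"),
        # In true Latin-1 the same byte is the C1 control character U+0080.
        "latin1_decode_0x80": b"\x80".decode("latin-1"),
        "cp1252_encode_euro": EURO.encode("cp1252"),
    }
    try:
        EURO.encode("latin-1")
        result["latin1_can_encode_euro"] = True
    except UnicodeEncodeError:
        # Latin-1 has no slot for the Euro sign at all.
        result["latin1_can_encode_euro"] = False
    return result


if __name__ == "__main__":
    for key, value in euro_mapping().items():
        print(key, repr(value))
```

So a filename byte of 0x80 reads as a Euro only if whoever decodes it assumes the Windows code page rather than plain Latin-1.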