[Python-Dev] Import and unicode: part two

Victor Stinner vstinner at edenwall.com
Mon Jan 24 16:39:39 CET 2011


On Monday 24 January 2011 11:35:22, Stephen J. Turnbull wrote:
> ... VFAT-formatted file systems and Shift JIS file names ...

I missed something: VFAT stores filenames as Unicode (whereas FAT only 
supports byte filenames). Well, VFAT stores filenames twice: as an 8+3 byte 
string and as a Unicode (UTF-16-LE) string of up to 255 characters.
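
To make the two stored forms concrete, here is a rough Python sketch that
encodes one name both ways. The file name is a made-up example, and the byte
form is only an approximation: a real 8+3 entry is also truncated and
upper-cased, which this sketch does not attempt.

```python
# Hypothetical example name; not taken from any real file system.
name = "日本語.txt"

# Byte form, assuming the volume uses Shift JIS for its byte filenames.
short_form = name.encode("shift_jis")

# Unicode form: two bytes per character for names inside the BMP.
long_form = name.encode("utf-16-le")

print(len(short_form), len(long_form))
```

Both forms decode back to the same text, but only the UTF-16-LE form can be
read without knowing which code page was used.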

On which OS do you access this VFAT file system? On Windows, you have two 
APIs: bytes (*A) and wide character (*W). If you use the wide character API, 
there is no explicit encoding at all. Linux has two mount options to control 
Unicode on a VFAT filesystem: "codepage" for the byte filenames (use Shift JIS 
here) and "iocharset" for the Unicode filenames (I don't understand this 
option). Anyway, both systems support Unicode filenames.
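
The same bytes/text split is visible from Python 3: os.listdir() returns str
names for a str path and bytes names for a bytes path. A minimal sketch, using
a temporary directory and an ASCII file name (both chosen only for
illustration) so that it behaves the same on any platform:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file with an ASCII-only name.
    open(os.path.join(tmp, "example.txt"), "w").close()

    # str path in, str names out (the wide-character API on Windows).
    text_names = os.listdir(tmp)

    # bytes path in, bytes names out (encoded with the filesystem encoding).
    byte_names = os.listdir(os.fsencode(tmp))

print(text_names, byte_names)
```

With a non-ASCII name, the bytes result depends on the filesystem encoding,
whereas the str result does not.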

I suppose that Shift JIS is used to encode the filename in the 8+3 byte string 
form.

Victor

