[Python-Dev] Import and unicode: part two

Stephen J. Turnbull stephen at xemacs.org
Tue Jan 25 09:29:12 CET 2011


As Nick points out, nobody really seems to think this is an
argument against your patch.  I'm going to bow out of this thread
after this post, as I'm clearly out of my technical depth.

Victor Stinner writes:

 > On Monday 24 January 2011 11:35:22, Stephen J. Turnbull wrote:
 > > ... VFAT-formatted file systems and Shift JIS file names ...
 > 
 > I missed something: VFAT stores filenames as unicode (whereas FAT only 
 > supports byte filenames). Well, VFAT stores filenames twice: as an 8+3 byte 
 > string and as a 255-character unicode (UTF-16-LE) string.

I don't know what it is; I didn't have char-device-level access to the
file system, nor did I have the specs (it was a proprietary phone by a
Japanese OEM).  It *presented* filenames in Shift JIS when mounted on
Linux with the vfat filesystem (either "mount -t vfat /dev/sde1
/mnt/gadget" or "mount -t auto /dev/sde1 /mnt/gadget").  Maybe there
is some unusual layer to translate from Unicode there; I'm not
familiar with Linux kernel drivers and libc facilities.  (Such
special-casing is a common pattern in programming for Japanese;
remember, the Japanese had to deal with these issues before there was
any standard for them.)
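
When a mount presents filenames as Shift JIS bytes, as described above,
Python can ask for the raw bytes names and decode them explicitly.
This is only a sketch: the helper names decode_name and list_names are
hypothetical, and "shift_jis" is an assumption about what the device
uses (the cp932 variant is another plausible candidate).

```python
import os


def decode_name(raw, encoding="shift_jis"):
    """Decode a raw bytes filename with an explicitly chosen encoding.

    Bytes that are invalid in that encoding become U+FFFD instead of
    raising, so a listing never fails on one bad name.
    """
    return raw.decode(encoding, errors="replace")


def list_names(directory, encoding="shift_jis"):
    """List a directory, decoding every name with the given encoding."""
    # Passing a bytes path makes os.listdir() return bytes names,
    # bypassing the locale's filesystem encoding entirely.
    return [decode_name(n, encoding) for n in os.listdir(os.fsencode(directory))]
```

For example, list_names("/mnt/gadget") would return str names decoded
as Shift JIS regardless of the locale encoding.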

 > On which OS do you access this VFAT file system? On Windows, you have two 
 > APIs: bytes (*A) and wide character (*W). If you use the wide character API, 
 > there is no explicit encoding at all. Linux has two mount options to control 
 > unicode on a VFAT filesystem: "codepage" for the byte filenames (use Shift 
 > JIS here) and "iocharset" for the unicode filenames (I don't understand this
 > option). 

I didn't either; in fact, this is the first I've heard of it, so I've
never tried it.

 > I suppose that Shift JIS is used to encode the filename in the 8+3 byte string 
 > form.

Could be, but I'm pretty sure these were long filenames, although
maybe they were just short enough (that is, I don't recall noticing
any truncation when mounted compared to the way they were presented on
the phone itself).  I don't use that phone anymore; it's in a box of
junk equipment somewhere....
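
Whether a name is "just short enough" for the 8+3 form depends on its
encoded byte length, not its character count.  A rough sketch of that
check follows; fits_8_3 and utf16_units are hypothetical helpers, the
"shift_jis" default is an assumption, and the real VFAT short-name
rules (case folding, forbidden characters, numeric tails) are ignored.

```python
import os


def fits_8_3(name, encoding="shift_jis"):
    """Return True if name fits in 8 bytes plus a 3-byte extension."""
    stem, ext = os.path.splitext(name)
    stem_bytes = stem.encode(encoding)
    ext_bytes = ext[1:].encode(encoding)  # drop the leading "."
    return len(stem_bytes) <= 8 and len(ext_bytes) <= 3


def utf16_units(name):
    """Count the UTF-16 code units the long-name form would need."""
    return len(name.encode("utf-16-le")) // 2
```

With two bytes per kanji or kana, a stem of four Japanese characters
already fills the 8-byte field, so anything longer would need the
long-name form.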

