i18n: looking for expertise

klappnase klappnase at web.de
Thu Mar 10 10:58:52 EST 2005


"stewart.midwinter at gmail.com" <stewart.midwinter at gmail.com> wrote in message news:<1110384648.779852.81580 at o13g2000cwo.googlegroups.com>...
> Michael:
> 
> on my box, (winXP SP2), sys.getfilesystemencoding() returns 'mbcs'.

Oh, from reading the docs I had thought XP would use unicode:

* On Windows 9x, the encoding is "mbcs".
* On Mac OS X, the encoding is "utf-8".
* On Unix, the encoding is the user's preference according to the
  result of nl_langinfo(CODESET), or None if the nl_langinfo(CODESET)
  failed.
* On Windows NT+, file names are Unicode natively, so no conversion is
  performed.

Maybe that's for compatibility between different Windows flavors.
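
To see what a particular box actually reports, something like the
following should do. It is only a minimal sketch: it prints the value of
sys.getfilesystemencoding() next to the locale's preferred encoding so
the two can be compared. The variable names are just for illustration.

```python
import locale
import sys

# Encoding Python uses to convert unicode file names for the OS.
# Per the docs quoted above, this can be None on some Unix setups.
fs_encoding = sys.getfilesystemencoding()

# Encoding the user's locale prefers for text data.
locale_encoding = locale.getpreferredencoding()

print('filesystem encoding: %r' % (fs_encoding,))
print('locale encoding:     %r' % (locale_encoding,))
```

If the two values differ, file names and other text need to be
converted with different encodings.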

> 
> If you post your revised solution to this unicode problem, I'd be
> delighted to test it on Windows.  I'm working on a Tkinter front-end
> for Vivian deSmedt's rsync.py and would like to address the issue of
> accented characters in folder names.
> 
> thanks
> Stewart 
> stewart dot midwinter at gmail dot com

I wrote it for use with linux only, and it looks like using the system
encoding as I try to guess it in my UnicodeHandler module (see the
first post) is fine there.

If the filesystem encoding on Windows differs from what I get in
UnicodeHandler.sysencoding, I guess I would have to define separate
convenience methods for decoding/encoding filenames, with sysencoding
replaced by sys.getfilesystemencoding(). (I found the need for these
convenience methods when I discovered that some strings I used were
sometimes unicode and sometimes not; I have a lot of interactions
between several modules, which makes it hard to track which kind I
have at any given point.)
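
A pair of such filename helpers could look roughly like this. This is
only a sketch of the idea, not the actual UnicodeHandler code: the names
FS_ENCODING and text_type, the UTF-8 fallback and the 'replace' error
handling are assumptions made for the example.

```python
import sys

# Assumption: fall back to UTF-8 if the platform reports no encoding.
FS_ENCODING = sys.getfilesystemencoding() or 'utf-8'

try:
    text_type = unicode      # Python 2
except NameError:
    text_type = str          # Python 3 has no separate unicode type

def fsencode(name):
    """Return name as a byte string in the filesystem encoding."""
    if isinstance(name, text_type):
        return name.encode(FS_ENCODING)
    return name              # already a byte string

def fsdecode(name):
    """Return name as a unicode string, for display purposes."""
    if isinstance(name, text_type):
        return name          # already unicode
    # Undecodable bytes are replaced rather than raising an error,
    # so the result is meant for display, not for file operations.
    return name.decode(FS_ENCODING, 'replace')
```

Both functions accept either kind of string, so callers do not need to
know which kind they are holding.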

Tk seems to be pretty smart about handling unicode, so using unicode for
everything that's displayed in Tk widgets should be ok (I hope).

So filling a listbox with the contents of a directory "pathname" looks
like this:

# make sure it's a byte string, for python2.2 compatibility
pathname = fsencode(pathname)
flist = map(fsdecode, os.listdir(pathname))
flist.sort()
for item in flist:
    listbox.insert('end', item)

For file operations I have written a separate module which defines
convenience methods like these:

##########################################

def remove_ok(self, filename, verbose=1):
    # b: byte string for os calls, u: unicode string for display
    b, u = fsencode(filename), fsdecode(filename)
    if not os.path.exists(b):
        if verbose:
            # popup a dialog box, similar to tkMessageBox
            MsgBox.showerror(parent=self.parent,
                             message=_('File not found:\n"%s"') % u)
        return 0
    elif os.path.isdir(b):
        if verbose:
            MsgBox.showerror(parent=self.parent,
                             message=_('Cannot delete "%s":\nis a directory') % u)
        return 0
    # deleting a file requires write permission on its parent directory
    if not os.access(os.path.dirname(b), os.W_OK):
        if verbose:
            MsgBox.showerror(parent=self.parent,
                             message=_('Cannot delete "%s":\npermission denied.') % u)
        return 0
    return 1
    
def remove(self, filename, verbose=1):
    b, u = fsencode(filename), fsdecode(filename)
    if self.remove_ok(filename, verbose=verbose):
        try:
            os.remove(b)
            return 1
        except OSError:
            # catch only errors raised by os.remove itself
            if verbose:
                MsgBox.showerror(parent=self.parent,
                                 message=_('Cannot delete "%s":\npermission denied.') % u)
    return 0

###################################

It looks like you don't need to do any encoding of filenames, however,
if you use python2.3 (at least as long as you don't have to call
os.access()), but I want my code to run with python2.2, too.
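
A quick way to check this behaviour on a given Python version is to
list a directory using a unicode path and look at the type of the names
that come back. The following is a hedged sketch only: the temporary
directory, the file name "example.txt" and the text_type alias are made
up for the example.

```python
import os
import shutil
import sys
import tempfile

try:
    text_type = unicode      # Python 2
except NameError:
    text_type = str          # Python 3

tmpdir = tempfile.mkdtemp()
if not isinstance(tmpdir, text_type):
    # On Python 2 mkdtemp() returns a byte string; make it unicode.
    tmpdir = tmpdir.decode(sys.getfilesystemencoding() or 'utf-8')

try:
    # Create one empty file, then list the directory by its unicode path.
    open(os.path.join(tmpdir, u'example.txt'), 'w').close()
    names = os.listdir(tmpdir)
finally:
    shutil.rmtree(tmpdir)

# One flag per entry: was the name returned as a unicode string?
names_are_text = [isinstance(n, text_type) for n in names]
print(names, names_are_text)
```

If every flag is true, the names can be handed to Tk widgets without a
separate decoding step.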

I hope this answers your question. Unfortunately I cannot post all of
my code here, because it's quite a lot of files, but the basic concept
is still the same as I wrote in the first post.

Best regards

Michael


