unicode filenames

Andrew Dalke adalke at mindspring.com
Tue Feb 4 08:15:42 CET 2003


Paul Boddie wrote:
> I hadn't heard of 'od' before, so this is a useful piece of
> information. When accessing Red Hat Linux 7.3 on Intel with locale as
> en_US.iso885915, I can apparently create filenames with ISO-8859-15
> characters, and in the terminal program I'm using, these characters
> appear as question marks when switching locale to en_US.utf8. However,
> in the former locale, 'od -c' returns the characters as part of the
> "dump", whereas in the latter, 'od -c' returns the octal codes for
> those characters.

It doesn't look like I can handle LANG=en_US.utf8 very well:

 >>> import os
 >>> s = u"1 to \N{INFINITY}"
 >>> s.encode("utf8")
'1 to \xe2\x88\x9e'
 >>> t = s.encode("utf8")
 >>> os.mkdir(t)
 >>> ^D
[dalke@zebulon src]$ ls -ld 1*
drwxr-xr-x    2 dalke    users        4096 Feb  3 23:33 1 to â

(Note that the end of the filename shows two empty boxes on my
screen, which is what my terminal uses when it can't show the
right character.)
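
A minimal sketch of the same experiment, written for Python 3 rather than the interpreter used in the session above. It assumes a filesystem that stores names as uninterpreted bytes (typical on Linux); the helper name make_utf8_dir is made up for illustration. Passing a bytes path to os.listdir returns the names as raw bytes, so nothing depends on the terminal or the locale.

```python
import os
import tempfile


def make_utf8_dir(parent, name):
    """Create a directory under parent named with the UTF-8 bytes of name.

    Hypothetical helper for illustration; returns the encoded name.
    """
    encoded = name.encode("utf-8")
    # Build the whole path as bytes so no implicit re-encoding happens.
    os.mkdir(os.path.join(os.fsencode(parent), encoded))
    return encoded


with tempfile.TemporaryDirectory() as parent:
    encoded = make_utf8_dir(parent, "1 to \N{INFINITY}")
    # A bytes argument makes listdir return bytes names, undecoded.
    entries = os.listdir(os.fsencode(parent))
    print(entries)
```

Printing the list of bytes objects shows escape sequences instead of the characters themselves, which sidesteps the "empty boxes" problem when checking what was really stored.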

[dalke@zebulon src]$ echo 1* | od -cd
0000000 2031 6f74 e220 9e88 000a
           1       t   o     342 210 236  \n  \0
0000011
[dalke@zebulon src]$

(Bleh.  This is a little-endian machine, so the hex words are
byte-swapped; read as characters the bytes are
"31 20 74 6f 20 e2 88 9e 0a 00", where the trailing 00 is just od
padding out the last word.  So you can see the bytes are exactly
those of the original string, which was the UTF-8 encoding of the
filename.  If I use the unicode string directly I get a 'cannot
encode as ASCII' error.)
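
The byte-swapping argument can be checked mechanically. This is a rough sketch in Python 3, not part of the original session: it takes the five 16-bit words from the od dump, packs them little-endian to recover the byte order, and trims to the 9 bytes that od reports (its final offset 0000011 is octal). The function name od_words_to_bytes is an invention for this example. The last few lines also reproduce, under Python 3 semantics, the kind of ASCII encoding failure mentioned above.

```python
import struct


def od_words_to_bytes(words, length):
    """Turn od's 16-bit little-endian words back into the original bytes.

    Hypothetical helper; length drops the padding od adds to fill the
    final word.
    """
    raw = struct.pack("<%dH" % len(words), *words)
    return raw[:length]


words = [0x2031, 0x6F74, 0xE220, 0x9E88, 0x000A]
data = od_words_to_bytes(words, 0o11)  # 0o11 == 9 bytes of real output
name = data.rstrip(b"\n")              # echo appended the newline

# Compare with the UTF-8 encoding of the intended filename.
matches = name == "1 to \N{INFINITY}".encode("utf-8")
print(data.hex(), matches)

# The infinity sign has no ASCII representation, so this encode fails.
try:
    "1 to \N{INFINITY}".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)
```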

When I start nautilus with LANG=en_US.utf8 I get:

[dalke@zebulon src]$ nautilus .

Gdk-WARNING **: locale not supported by Xlib, locale set to C

  ...

When I start Konqueror I get:

[dalke@zebulon src]$ konqueror
Qt: Locales not supported on X server
qstring_to_xtp result code -2


I'm on Red Hat 7.2, so it may be that 7.3 improves Unicode support.

This is harder than I want it to be.  Python!  Make it just
work for me!  :)

					Andrew
					dalke at dalkescientific.com




