[Python-Dev] test_unicode_file failing on Mac OS X
Jack Jansen
Jack.Jansen at cwi.nl
Sun Dec 7 11:32:23 EST 2003
On 6-dec-03, at 18:48, Skip Montanaro wrote:
> Two of the test_unicode_file tests began failing on my Mac today (fresh
> cvs up, OS X 10.2.8, vanilla unix-style build):
>
>
> ======================================================================
> FAIL: test_directories (__main__.TestUnicodeFiles)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "../Lib/test/test_unicode_file.py", line 155, in test_directories
>     self._do_directory(TESTFN_ENCODED+ext, TESTFN_ENCODED+ext, os.getcwd)
>   File "../Lib/test/test_unicode_file.py", line 103, in _do_directory
>     make_name)
> AssertionError: '@test-a\xcc\x80o\xcc\x80.dir' != '@test-\xc3\xa0\xc3\xb2.dir'

This is probably related to the two flavors of Unicode normalization:
one prefers to keep all accents separate from the letters as much as
possible, and one prefers the reverse. I keep forgetting the names of
the two, they're somewhat silly.

But the problem is that the test's filename represents a string like "ä"
as the single character "a with umlaut on it", and Mac OS X prefers to
represent the same string as the two characters "a" and "umlaut on the
previous char". The byte strings in the traceback point that way, at
least; which side prefers which is something else I always forget.

And while there are algorithms to convert the combined form of Unicode
to the uncombined form and vice versa, there are no Python codecs to do
this. The OS X system calls do the right thing (convert both forms to
what the file system prefers), but when you do a readdir() you don't
get back the string you put in.
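
A minimal sketch of that conversion, assuming the standard library's
unicodedata.normalize() (a plain function, not a codec) and Python 3
string syntax. The helper name same_filename is hypothetical; the two
strings are the ones from the traceback, written as code points.

```python
import unicodedata

# The two spellings of the name from the traceback, as code points.
composed = "\u00e0\u00f2"        # precomposed: one code point per letter
decomposed = "a\u0300o\u0300"    # base letter + combining grave accent

def same_filename(a, b, form="NFC"):
    """Compare two names after converting both to one normalization form.

    Hypothetical helper for illustration, not part of any test suite.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# A plain comparison fails, which is what the test ran into.
print(composed == decomposed)

# Comparing after normalizing both sides succeeds.
print(same_filename(composed, decomposed))
```

The same idea would apply to names read back from a directory listing:
normalize both the name you wrote and the name you got back before
comparing them.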
--
Jack Jansen, <Jack.Jansen at cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma
Goldman