[Python-Dev] Vacation and possibly a new bug

Skip Montanaro skip@pobox.com
Tue, 15 Jul 2003 21:42:32 -0500


>>>>> "Martin" == Martin v Löwis <martin@v.loewis.de> writes:

    Martin> Skip Montanaro <skip@pobox.com> writes:
    >> Yeah, me.  I can't do anything this evening, but can look at it tomorrow.
    >> What's the bug id?

    Martin> No bug id: test_pep277 fails.

One problem seems obvious to me.  test_pep277.UnicodeFileTests defines this
method:

    def test_listdir(self):
        f1 = os.listdir(test_support.TESTFN)
        f1.sort()
        f2 = os.listdir(unicode(test_support.TESTFN,"mbcs"))
        f2.sort()
        print f1
        print f2

The unicode() call winds up trying to call codecs.mbcs_encode which is
imported from the _codecs module.  mbcs_encode is only defined on Windows,
and only if there is a usable wchar_t.  It's clear this is going to fail on
Mac OS X.

This seems better:

    def test_listdir(self):
        import sys
        f1 = os.listdir(test_support.TESTFN)
        f1.sort()
        f2 = os.listdir(unicode(test_support.TESTFN,
                                sys.getfilesystemencoding()))
        f2.sort()
        print f1
        print f2
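
To illustrate why the codec name matters, here is a minimal sketch in Python 3
syntax (the test itself is Python 2).  The helper name codec_available is
hypothetical and not part of test_pep277 or test_support.  It only shows that
"mbcs" may fail to resolve on a non-Windows build, while the name returned by
sys.getfilesystemencoding() should resolve on the running platform.

```python
import codecs
import sys


def codec_available(name):
    """Return True if the named codec can be looked up on this platform."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


# The encoding the interpreter reports for file names should resolve here.
fs_encoding = sys.getfilesystemencoding()
print(fs_encoding, codec_available(fs_encoding))

# "mbcs" is expected to resolve only on Windows builds.
print("mbcs", codec_available("mbcs"))
```

The second line of output depends on the platform, which is the point: a test
that hardcodes "mbcs" is tied to Windows.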

If someone with ready access to a Windows machine can try that change
tonight I'll check it in, otherwise it will have to wait until I'm at work
tomorrow morning.

The second error is due to details.filename at line 53 being a plain string
containing non-ASCII characters.  When it is
'not_@test/Gr\xc3\xbc\xc3\x9f-Gott' and is compared with filename which is
u'not_@test/Gr\xc3\xbc\xc3\x9f-Gott' a UnicodeDecodeError is raised trying
to coerce details.filename to unicode.  Simply converting it using
unicode(details.filename, sys.getfilesystemencoding()) before comparison
doesn't seem correct, because there's no guarantee that details.filename is
in the file system encoding at that point.  It certainly fails for me:

    >>> s = 'not_@test/Gr\xfc\xdf-Gott'
    >>> import sys
    >>> unicode(s, sys.getfilesystemencoding())
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeDecodeError: 'utf8' codec can't decode bytes in position 12-17: unsupported Unicode code range
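
As a rough sketch, the same mismatch can be reproduced in Python 3 syntax.
The assumption is that the bytes are Latin-1 encoded while the file system
encoding is UTF-8, which is what the traceback above suggests; the variable
names are illustrative only, and the error message differs from the one
Python 2.3 prints.

```python
# Latin-1 bytes for the test file name, as in the session above.
latin1_name = b'not_@test/Gr\xfc\xdf-Gott'

# Decoding with the wrong codec raises UnicodeDecodeError.
try:
    latin1_name.decode('utf-8')
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
print(decoded_ok)

# Decoding with the codec the bytes were actually written in succeeds.
text = latin1_name.decode('latin-1')

# The UTF-8 form of the same name round-trips through the utf-8 codec.
utf8_name = text.encode('utf-8')
print(utf8_name == b'not_@test/Gr\xc3\xbc\xc3\x9f-Gott')
print(utf8_name.decode('utf-8') == text)
```

So the conversion only works when the byte string really is in the encoding
passed to the decoder, which cannot be assumed for details.filename.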

I suspect details.filename needs to be set to a unicode object, but I don't
know where or how.

Skip