Unicode error in sax parser

Chris Rebert clp2 at rebertia.com
Tue Feb 8 11:41:08 EST 2011


On Tue, Feb 8, 2011 at 7:57 AM, Rickard Lindberg <ricli85 at gmail.com> wrote:
> Hi,
>
> Here is a bash script to reproduce my error:

Please also include the actual error message and traceback; they are
helpful for future reference.

>    #!/bin/sh
>
>    cat > å.timeline <<EOF
<snip>
>    EOF
>
>    python <<EOF
>    # encoding: utf-8
>    from xml.sax import parse
>    from xml.sax.handler import ContentHandler
>    parse(u"å.timeline", ContentHandler())
>    EOF
>
> If I instead do
>
>    parse(u"å.timeline".encode("utf-8"), ContentHandler())
>
> the script runs without errors.
>
> Is this a bug or expected behavior?

It's a bug: open() copes with the same unicode filename (it figures out
the filesystem encoding) just fine, so parse() should too.
Please report it on the bug tracker: http://bugs.python.org/

Workaround:
parse(open(u"å.timeline", 'r'), ContentHandler())
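
A slightly fuller sketch of the same idea, in case it helps: open the
file yourself and hand the parser a stream, so it never has to open the
file by name. This is only an illustration, written for Python 3 so it
runs on a current interpreter. The ElementCounter handler, the
parse_by_file_object helper and the tiny XML document are all made up
for the example (the real file contents were snipped above), and it
assumes the filesystem accepts non-ASCII names.

```python
import os
import tempfile
from xml.sax import parse
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Hypothetical handler: counts start tags so we can see the parse ran."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def parse_by_file_object(path, handler):
    # Open the file ourselves (binary mode, so the XML parser does its own
    # decoding) and pass the stream instead of the filename.
    with open(path, "rb") as stream:
        parse(stream, handler)
    return handler


# Create a throwaway file with a non-ASCII name.
# The content is a placeholder, not the original poster's data.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "å.timeline")
with open(path, "wb") as f:
    f.write(b"<timeline><event/><event/></timeline>")

handler = parse_by_file_object(path, ElementCounter())
print(handler.count)  # number of start tags seen
```

Opening in binary mode is a choice I made for the sketch; it leaves all
decoding to the XML parser.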

Cheers,
Chris


