Unicode error in sax parser

Rickard Lindberg ricli85 at gmail.com
Wed Feb 9 10:27:52 EST 2011


On Tue, Feb 8, 2011 at 5:41 PM, Chris Rebert <clp2 at rebertia.com> wrote:
> On Tue, Feb 8, 2011 at 7:57 AM, Rickard Lindberg <ricli85 at gmail.com> wrote:
>> Hi,
>>
>> Here is a shell script to reproduce my error:
>
> Including the error message and traceback is still helpful, for future
> reference.
>
>>    #!/bin/sh
>>
>>    cat > å.timeline <<EOF
> <snip>
>>    EOF
>>
>>    python <<EOF
>>    # encoding: utf-8
>>    from xml.sax import parse
>>    from xml.sax.handler import ContentHandler
>>    parse(u"å.timeline", ContentHandler())
>>    EOF
>>
>> If I instead do
>>
>>    parse(u"å.timeline".encode("utf-8"), ContentHandler())
>>
>> the script runs without errors.
>>
>> Is this a bug or expected behavior?
>
> Bug; open() accepts the same unicode filename and encodes it with the
> filesystem encoding just fine, so parse() should too.
> Bug tracker to report the issue to: http://bugs.python.org/
>
> Workaround:
> parse(open(u"å.timeline", 'r'), ContentHandler())
>
> Cheers,
> Chris

Bug reported at http://bugs.python.org/issue11159
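
For future reference, here is a minimal sketch of the workaround quoted
above: open the file yourself and pass the stream to parse(), so the SAX
code never has to interpret the unicode filename. The ElementCounter
handler, the parse_unicode_path() helper and the sample XML are all made up
for illustration (the real timeline file was snipped in the thread), and the
sketch assumes the filesystem can store a non-ASCII filename.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile
from xml.sax import parse
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Hypothetical handler (not from the thread) that counts start tags."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def parse_unicode_path(path, handler):
    """Open the file ourselves and hand the stream to the SAX parser."""
    # Binary mode leaves decoding to the XML parser.
    with io.open(path, "rb") as stream:
        parse(stream, handler)
    return handler


# Made-up sample document written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, u"\u00e5.timeline")
with io.open(sample_path, "wb") as f:
    f.write(b"<timeline><event/><event/></timeline>")

counter = parse_unicode_path(sample_path, ElementCounter())
print(counter.count)
```

Only open() sees the unicode path here, which is the call that already
handles the filesystem encoding correctly according to the reply above.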

-- 
Rickard Lindberg


