Unicode error in sax parser

Rickard Lindberg ricli85 at gmail.com
Tue Feb 8 16:57:47 CET 2011


Here is a bash script to reproduce my error:


    cat > å.timeline <<EOF
    <?xml version="1.0" encoding="utf-8"?>
          <start>2011-02-01 00:00:00</start>
          <end>2011-02-03 08:46:00</end>
          <start>2011-01-24 16:38:11</start>
          <end>2011-02-23 16:38:11</end>
    EOF

    python <<EOF
    # encoding: utf-8
    from xml.sax import parse
    from xml.sax.handler import ContentHandler
    parse(u"å.timeline", ContentHandler())
    EOF

If I instead encode the file name to a UTF-8 byte string before passing it to parse:

    parse(u"å.timeline".encode("utf-8"), ContentHandler())

the script runs without errors.
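
Another workaround that may be worth trying is to open the file yourself and hand the open file object to parse, so that the SAX code receives a byte stream rather than a unicode file name. The following is only a sketch under assumptions: the ElementNameCollector handler, the parse_by_file_object helper, and the <event> wrapper element in the sample document are hypothetical names introduced for illustration and are not part of the original script. It has not been confirmed that this avoids the error on every Python version.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile
from xml.sax import parse
from xml.sax.handler import ContentHandler


class ElementNameCollector(ContentHandler):
    """Hypothetical handler that records the name of each element it sees."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_by_file_object(path, handler):
    # Open the file ourselves in binary mode so that xml.sax is given
    # a byte stream instead of a unicode file name.
    with io.open(path, "rb") as stream:
        parse(stream, handler)
    return handler


# Sample document; the <event> wrapper is an assumption for illustration.
SAMPLE = (
    u'<?xml version="1.0" encoding="utf-8"?>\n'
    u"<event>\n"
    u"  <start>2011-02-01 00:00:00</start>\n"
    u"  <end>2011-02-03 08:46:00</end>\n"
    u"</event>\n"
)

# Write the sample into a scratch directory under a non-ASCII file name.
directory = tempfile.mkdtemp()
path = os.path.join(directory, u"\xe5.timeline")  # same as u"å.timeline"
with io.open(path, "w", encoding="utf-8") as f:
    f.write(SAMPLE)

collector = parse_by_file_object(path, ElementNameCollector())
print(collector.names)
```

The file name is only ever passed to open here, never to xml.sax itself, which is the point of the sketch.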

Is this a bug or expected behavior?

Rickard Lindberg
