Unicode error in sax parser

Rickard Lindberg ricli85 at gmail.com
Tue Feb 8 16:57:47 CET 2011


Hi,

Here is a shell script that reproduces my error:

    #!/bin/sh

    cat > å.timeline <<EOF
    <?xml version="1.0" encoding="utf-8"?>
    <timeline>
      <version>0.13.0devb38ace0a572b+</version>
      <categories>
      </categories>
      <events>
        <event>
          <start>2011-02-01 00:00:00</start>
          <end>2011-02-03 08:46:00</end>
          <text>asdsd</text>
        </event>
      </events>
      <view>
        <displayed_period>
          <start>2011-01-24 16:38:11</start>
          <end>2011-02-23 16:38:11</end>
        </displayed_period>
        <hidden_categories>
        </hidden_categories>
      </view>
    </timeline>
    EOF

    python <<EOF
    # encoding: utf-8
    from xml.sax import parse
    from xml.sax.handler import ContentHandler
    parse(u"å.timeline", ContentHandler())
    EOF

If I instead do

    parse(u"å.timeline".encode("utf-8"), ContentHandler())

the script runs without errors.
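
Another possible workaround, sketched here under assumptions: since
xml.sax.parse accepts an open file object as well as a file name, the
file could be opened by the caller and the stream passed in. The names
TextCollector and parse_path are made up for illustration, and this
has not been checked against the Python version used in the script
above, so it may not avoid the error there.

```python
from xml.sax import parse
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler: records the content of each <text> element."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.texts = []
        self._buffer = None

    def startElement(self, name, attrs):
        # Start buffering when a <text> element opens.
        if name == "text":
            self._buffer = []

    def characters(self, content):
        # Character data may arrive in several chunks, so collect them.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        # Join the chunks once the <text> element closes.
        if name == "text" and self._buffer is not None:
            self.texts.append("".join(self._buffer))
            self._buffer = None


def parse_path(path, handler):
    """Hypothetical helper: parse the file at `path` via an open stream."""
    # Open the file ourselves in binary mode, so the parser reads bytes
    # from the stream instead of opening the path itself.
    with open(path, "rb") as stream:
        parse(stream, handler)
    return handler
```

With this in place, the call in the script would become
parse_path(u"å.timeline", TextCollector()), and the returned handler's
texts attribute would hold the event texts.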

Is this a bug or expected behavior?

-- 
Rickard Lindberg


