[New-bugs-announce] [issue11159] Sax parser crashes if given unicode file name

Rickard Lindberg report at bugs.python.org
Wed Feb 9 15:20:03 CET 2011

New submission from Rickard Lindberg <ricli85 at gmail.com>:

The error is the following:

    Traceback (most recent call last):
      File "<stdin>", line 4, in <module>
      File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/__init__.py", line 31, in parse
      File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/expatreader.py", line 109, in parse
        xmlreader.IncrementalParser.parse(self, source)
      File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/xmlreader.py", line 119, in parse
      File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/expatreader.py", line 121, in prepareParser
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 0: ordinal not in range(128)

The following bash script can be used to reproduce the error:

    cat > å.timeline <<EOF
    <?xml version="1.0" encoding="utf-8"?>
          <start>2011-02-01 00:00:00</start>
          <end>2011-02-03 08:46:00</end>
          <start>2011-01-24 16:38:11</start>
          <end>2011-02-23 16:38:11</end>
    EOF

    python <<EOF
    # encoding: utf-8
    from xml.sax import parse
    from xml.sax.handler import ContentHandler
    parse(open(u"å.timeline", 'r'), ContentHandler())
    EOF

If I instead do this, it works fine:

    parse(u"å.timeline".encode("utf-8"), ContentHandler())
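Another possible workaround, sketched here as an assumption rather than something verified against the _xmlplus code in the traceback: the failure is raised in prepareParser, which appears to deal with the source's file name (its system ID), so handing the parser an in-memory byte stream with no name attached may avoid the encoding step entirely. The names ElementNames and parse_without_system_id are hypothetical and only serve the illustration.

```python
import io
from xml.sax import parse
from xml.sax.handler import ContentHandler


class ElementNames(ContentHandler):
    """Hypothetical handler that records element names as they open."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_without_system_id(path, handler):
    """Parse the file at `path` without handing its name to the parser."""
    # Read the document as bytes so the parser can apply the encoding
    # declared in the XML prolog.
    with open(path, "rb") as f:
        data = f.read()
    # io.BytesIO objects have no `name` attribute, so xml.sax has no file
    # name to record as the system ID (assumption: the system ID is what
    # triggers the UnicodeEncodeError above).
    parse(io.BytesIO(data), handler)
```

With this helper the call would look like parse_without_system_id(u"å.timeline", ElementNames()). The trade-off is that the whole file is read into memory first, and relative references inside the document can no longer be resolved against the file's location.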


    >>> sys.getfilesystemencoding()
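
Since the working call above hard-codes "utf-8", a slightly more general variant could take the encoding from sys.getfilesystemencoding() instead. This is only a minimal sketch: the helper name encode_for_filesystem is hypothetical, and the "utf-8" fallback is an assumption for platforms where Python 2 reports no file system encoding.

```python
import sys


def encode_for_filesystem(name):
    """Return `name` as bytes in the encoding Python uses for file names."""
    if isinstance(name, bytes):
        # Already a byte string; nothing to encode.
        return name
    # Python 2 documents that getfilesystemencoding() may return None on
    # some Unix systems, so fall back to UTF-8 (an assumption).
    encoding = sys.getfilesystemencoding() or "utf-8"
    return name.encode(encoding)
```

On Python 2.7 this would be used as parse(encode_for_filesystem(u"å.timeline"), ContentHandler()). It is meant for Python 2 only; on Python 3 the file name would be passed to parse as a str.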

I heard from another user that this was not a problem with Python 3.1.2.

components: XML
messages: 128212
nosy: ricli85
priority: normal
severity: normal
status: open
title: Sax parser crashes if given unicode file name
type: crash
versions: Python 2.7

Python tracker <report at bugs.python.org>
