[issue11159] Sax parser crashes if given unicode file name
Rickard Lindberg
report at bugs.python.org
Wed Feb 9 15:20:03 CET 2011
New submission from Rickard Lindberg <ricli85 at gmail.com>:
xml.sax.parse fails with the following error when it is given a unicode file name that contains a non-ASCII character:
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/__init__.py", line 31, in parse
    parser.parse(filename_or_stream)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/expatreader.py", line 109, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/xmlreader.py", line 119, in parse
    self.prepareParser(source)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/expatreader.py", line 121, in prepareParser
    self._parser.SetBase(source.getSystemId())
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 0: ordinal not in range(128)
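The last frame suggests that the unicode system id is converted to a byte string with the ASCII codec somewhere inside SetBase. That is an assumption drawn from the error message, not something verified in the expat binding. A minimal sketch that triggers the same kind of error without the parser:

```python
# Assumption: SetBase() implicitly encodes its unicode argument as ASCII.
# This only imitates that step; it does not call into expat.
name = u"\xe5.timeline"  # same file name as in the report, with the escape spelled out

try:
    name.encode("ascii")
    message = None
except UnicodeEncodeError as exc:
    # The ASCII codec cannot represent U+00E5, so encoding fails at position 0.
    message = str(exc)

print(message)
```

If this guess is right, any file name containing a character outside ASCII would trigger the failure, not just this one.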
The following shell script can be used to reproduce the error:
#!/bin/sh
cat > å.timeline <<EOF
<?xml version="1.0" encoding="utf-8"?>
<timeline>
<version>0.13.0devb38ace0a572b+</version>
<categories>
</categories>
<events>
<event>
<start>2011-02-01 00:00:00</start>
<end>2011-02-03 08:46:00</end>
<text>asdsd</text>
</event>
</events>
<view>
<displayed_period>
<start>2011-01-24 16:38:11</start>
<end>2011-02-23 16:38:11</end>
</displayed_period>
<hidden_categories>
</hidden_categories>
</view>
</timeline>
EOF
python <<EOF
# encoding: utf-8
from xml.sax import parse
from xml.sax.handler import ContentHandler
parse(u"å.timeline", ContentHandler())
EOF
If I instead encode the file name as UTF-8 before passing it to parse, it works fine:
parse(u"å.timeline".encode("utf-8"), ContentHandler())
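The workaround can be wrapped in a small helper. This is only a sketch: `to_fs_bytes` is a hypothetical name, not part of xml.sax, and it assumes that the encoding reported by sys.getfilesystemencoding() is the right one for the file name.

```python
import sys


def to_fs_bytes(name, encoding=None):
    """Hypothetical helper: return `name` as a byte string.

    Unicode names are encoded with `encoding`, which defaults to the
    file system encoding. Byte strings are returned unchanged.
    """
    if encoding is None:
        # Fall back to UTF-8 if the interpreter reports no encoding.
        encoding = sys.getfilesystemencoding() or "utf-8"
    if isinstance(name, bytes):
        return name
    return name.encode(encoding)


encoded = to_fs_bytes(u"\xe5.timeline", "utf-8")

# Intended use on Python 2.7 (not executed here):
#   from xml.sax import parse
#   from xml.sax.handler import ContentHandler
#   parse(to_fs_bytes(u"\xe5.timeline"), ContentHandler())
```

Encoding explicitly keeps the byte string under the caller's control, so the parser never has to fall back to the ASCII codec.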
Also, the file system encoding is UTF-8:
>>> import sys
>>> sys.getfilesystemencoding()
'UTF-8'
I heard from another user that this was not a problem with Python 3.1.2.
----------
components: XML
messages: 128212
nosy: ricli85
priority: normal
severity: normal
status: open
title: Sax parser crashes if given unicode file name
type: crash
versions: Python 2.7
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue11159>
_______________________________________