[issue9241] SAXParseError on unicode (Japanese) file
Gianfranco
report at bugs.python.org
Tue Jul 13 11:04:36 CEST 2010
New submission from Gianfranco <gianzula at tin.it>:
When parsing a UTF-16 little-endian encoded XML file containing some Japanese characters, the xml.sax.parse function raises a SAXParseException with the message "no element found". The problem occurs with:
Python 2.5.2/Windows XP Pro SP3 32 bit
Python 2.6.4/Windows XP Pro SP3 32 bit
Python 2.5.2/Windows 2008 Server SP2 64 bit
The same file is processed successfully with:
Python 2.4.3/CentOS 5.4
Python 2.6.3/CentOS 5.4
I've attached a minimal XML file containing a single Japanese character, U+FF1A, that triggers the exception. The code used to parse the file follows:
import xml.sax
xml.sax.parse(open("ff1a.xml"), xml.sax.ContentHandler())
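A possible explanation, not confirmed in this report: the call above opens the file in text mode, and in UTF-16 little-endian the character U+FF1A is stored as the bytes 0x1A 0xFF. On Windows, text-mode reads in Python 2 treat byte 0x1A (Ctrl-Z) as end-of-file, which would truncate the document and would fit the Windows-only failures. The sketch below builds a stand-in file (the attached ff1a.xml is not reproduced here, so the document content is an assumption) and parses it in binary mode. The names TextCollector, write_sample, parse_binary and ff1a_sample.xml are hypothetical and used only for illustration.

```python
import codecs
import os
import tempfile
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records character data seen by the parser."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def write_sample(path):
    # Assumed stand-in for the attached file: a UTF-16LE document with a BOM
    # whose only text content is U+FF1A.
    document = u'<?xml version="1.0" encoding="UTF-16"?><root>\uff1a</root>'
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF16_LE + document.encode("utf-16-le"))


def parse_binary(path):
    # "rb" hands the raw bytes to the parser, so byte 0x1A is never
    # interpreted as an end-of-file marker by text-mode reading.
    handler = TextCollector()
    with open(path, "rb") as f:
        xml.sax.parse(f, handler)
    return u"".join(handler.chunks)


tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "ff1a_sample.xml")
write_sample(sample_path)

parsed_text = parse_binary(sample_path)

# The UTF-16LE encoding of U+FF1A; its first byte is 0x1A.
raw_bytes = u"\uff1a".encode("utf-16-le")
```

If this explanation holds, changing the original call to open("ff1a.xml", "rb") should be enough to check it on the affected Windows systems.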
Best regards,
Gianfranco
----------
components: XML
files: ff1a.xml
messages: 110163
nosy: gianzula
priority: normal
severity: normal
status: open
title: SAXParseError on unicode (Japanese) file
type: behavior
versions: Python 2.5
Added file: http://bugs.python.org/file17979/ff1a.xml
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9241>
_______________________________________