[Python-Dev] [PATCH][BUG] Segmentation Fault in xml.dom.minidom.parse
Evan Jones
ejones at uwaterloo.ca
Fri Sep 30 02:28:46 CEST 2005
The following Python script causes Python 2.3, 2.4 and the latest CVS
to crash with a Segmentation Fault:
import xml.dom.minidom
x = (u'<?xml version="1.0"?>\n'
     u'<fran\xe7ais>Comment \xe7a va ? Tr\xe8s bien ?</fran\xe7ais>')
dom = xml.dom.minidom.parseString( x.encode( 'latin_1' ) )
print repr( dom.childNodes[0].localName )
The problem is that this XML document does not specify an encoding, so
minidom assumes that it is encoded in UTF-8 when in fact it is encoded
in Latin-1. My two-line patch, in the SourceForge tracker at the URL
below, causes this to raise a UnicodeDecodeError instead of crashing.
http://sourceforge.net/tracker/index.php?func=detail&aid=1309009&group_id=5470&atid=305470
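To make the encoding mismatch concrete, here is a rough sketch (not
part of the patch) showing that the Latin-1 bytes are not valid UTF-8,
and that the same document parses once the XML declaration names its
encoding. The variable names are only illustrative, and I use
documentElement.tagName rather than localName to keep the check simple.

```python
import xml.dom.minidom

body = u'<fran\xe7ais>Comment \xe7a va ? Tr\xe8s bien ?</fran\xe7ais>'
latin1_bytes = body.encode('latin_1')

# The Latin-1 bytes do not form valid UTF-8: 0xE7 looks like the lead
# byte of a three-byte sequence, but no continuation bytes follow it.
try:
    latin1_bytes.decode('utf-8')
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Naming the encoding in the XML declaration tells the parser how to
# interpret the bytes, so it no longer has to assume UTF-8.
declaration = u'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
declared = declaration.encode('latin_1') + latin1_bytes
dom = xml.dom.minidom.parseString(declared)
tag_name = dom.documentElement.tagName
```

This only covers the well-behaved case where the encoding is declared;
the crash above is about the case where it is not.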
Any chance that someone wants to commit this tiny two-line fix? It
might also be eligible for backporting to Python 2.4. It passes
"make test" on both my Linux system and my Mac. I've also attached a
patch that adds this test case to test_minidom.py.
Thanks,
Evan Jones
--
Evan Jones
http://evanjones.ca/