elementtree and gbk encoding

Steven Bethard steven.bethard at gmail.com
Wed Mar 15 01:27:17 CET 2006


Diez B. Roggisch wrote:
>> Here's what I get with the prepending hack:
>>
>>  >>> et.fromstring('<?xml version="1.0" encoding="gbk"?>\n' + open(filename).read())
>> Traceback (most recent call last):
>>   File "<interactive input>", line 1, in ?
>>   File "C:\Program Files\Python\lib\site-packages\elementtree\ElementTree.py", line 960, in XML
>>     parser.feed(text)
>>   File "C:\Program Files\Python\lib\site-packages\elementtree\ElementTree.py", line 1242, in feed
>>     self._parser.Parse(data, 0)
>> ExpatError: unknown encoding: line 1, column 30
>>
>>
>> Are the XML encoding names different from the Python ones?  The "gbk" 
>> encoding seems to work okay from Python:
> 
> I had similar trouble with cElementTree and cp1252 encodings. But 
> upgrading to a more recent version helped. Did you try parsing with e.g. 
>  sax?

Hmm...  xml.dom.minidom and xml.sax (which, per the tracebacks, resolve 
to the _xmlplus package here) also both fail to find the encoding:


 >>> import xml.dom.minidom as dom
 >>> dom.parseString('<?xml version="1.0" encoding="gbk"?>' + open(filename).read())
Traceback (most recent call last):
   File "<interactive input>", line 1, in ?
   File "C:\Program Files\Python\lib\site-packages\_xmlplus\dom\minidom.py", line 1925, in parseString
     return expatbuilder.parseString(string)
   File "C:\Program Files\Python\lib\site-packages\_xmlplus\dom\expatbuilder.py", line 942, in parseString
     return builder.parseString(string)
   File "C:\Program Files\Python\lib\site-packages\_xmlplus\dom\expatbuilder.py", line 223, in parseString
     parser.Parse(string, True)
ExpatError: unknown encoding: line 1, column 30


 >>> import xml.sax as sax
 >>> sax.parseString('<?xml version="1.0" encoding="gbk"?>' + open(filename).read(), sax.handler.ContentHandler())
Traceback (most recent call last):
   File "<interactive input>", line 1, in ?
   File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\__init__.py", line 47, in parseString
     parser.parse(inpsrc)
   File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\expatreader.py", line 109, in parse
     xmlreader.IncrementalParser.parse(self, source)
   File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\xmlreader.py", line 123, in parse
     self.feed(buffer)
   File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\expatreader.py", line 220, in feed
     self._err_handler.fatalError(exc)
   File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\handler.py", line 38, in fatalError
     raise exception
SAXParseException: <unknown>:1:30: unknown encoding
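
All three tracebacks end in the same expat-based parser, so one possible 
workaround is to keep the parser from having to resolve "gbk" at all: 
decode the bytes with Python's own gbk codec first and hand the parser 
the decoded text. The following is only a sketch, written for a current 
Python 3 with the standard-library xml.etree.ElementTree rather than the 
2006-era elementtree package used above. parse_gbk_xml is a hypothetical 
helper name, and it assumes the data has no XML declaration of its own 
(as with the file above, which needed one prepended).

```python
import xml.etree.ElementTree as ET


def parse_gbk_xml(raw_bytes, encoding="gbk"):
    """Parse XML bytes stored in an encoding the parser does not recognize.

    Hypothetical helper, not part of ElementTree. Assumes raw_bytes
    carries no XML declaration of its own.
    """
    # Decode with Python's codec machinery, which does know GBK.
    text = raw_bytes.decode(encoding)
    # Feeding a str means the parser never has to look up "gbk" itself.
    return ET.fromstring(text)


# Usage with a small in-memory document standing in for the real file.
sample = "<root><item>\u4e2d\u6587</item></root>".encode("gbk")
root = parse_gbk_xml(sample)
print(root.find("item").text)
```

For a real file, the bytes would come from open(filename, "rb").read() 
instead of the in-memory sample.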


