expat error, help to debug?

Andreas Lobinger newsreturns at biszumknie.de
Thu Aug 23 17:06:40 CEST 2007


Aloha,

I'm trying to write an XML filter that extracts some information from
an .xml document (with external entities), especially start elements and
external entity references. The document is DocBook XML; as far as I can
see it is well formed, and it passes our DocBook toolchain (dblatex etc.).

My parser is (very simple):
[115] scylla(scylla)> more pbxml.py

class xmlhandle:
     def __init__(self):
         self.parser_stack = []
         self.parser = None

     def se(self,name,attr):
         print "s", self.parser.CurrentLineNumber, name, attr

     def ex(self,context,baseid,n1,n2):
         print "x",context,n1,n2

def fromxml(fname):
     import xml.parsers.expat
     p = xml.parsers.expat.ParserCreate()
     xl = xmlhandle()
     p.StartElementHandler = xl.se
     p.ExternalEntityRefHandler = xl.ex
     xl.parser = p
     p.ParseFile(file(fname))
     return

if __name__ == "__main__":
    import sys
    fromxml(sys.argv[1])

My document (in two parts):

[116] scylla(scylla)> more s3.xml
<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "/usr/share/xml/docbook/xml/4.2/docbookx.dtd"
[
<!ENTITY bookinfo SYSTEM "bookinfo.xml">
]>
<book>
&bookinfo;
<chapter id="technicalDescription"><title>technical description</title>
         <para>
         This chapter includes specification of the main simulation loop.
         </para>
</chapter>
</book>

[118] scylla(scylla)> more bookinfo.xml
<bookinfo>
   <title>BookTitle</title>
   <authorgroup>
     <author>
         <firstname>A</firstname>
         <surname>B</surname>
     </author>
   </authorgroup>
</bookinfo>

The run produces:

[120] scylla(scylla)> python pbxml.py s3.xml
s 7 book {}
x bookinfo bookinfo.xml None
s 9 chapter {u'id': u'technicalDescription'}
s 9 title {}
s 10 para {}
Traceback (most recent call last):
   File "pbxml.py", line 25, in ?
     fromxml(sys.argv[1])
   File "pbxml.py", line 20, in fromxml
     p.ParseFile(file(fname))
TypeError: an integer is required

Does anyone have an idea where the error is produced?
And does anyone have an idea how to debug this (if it is really a bug,
or a misunderstanding of expat on my part)?
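
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: the xml.parsers.expat documentation says an ExternalEntityRefHandler should return an integer (0 makes the parser report an external entity handling error, anything else lets parsing continue), and the `ex` method above returns None. Below is a minimal sketch of a handler that returns 1 and also parses the entity with ExternalEntityParserCreate. It is written for Python 3, not the Python 2 used above, and it assumes the entity text comes from a hypothetical `entities` dict (system id to text) instead of being read from disk; `XmlHandle`, `scan`, `DOC` and `ENTITIES` are illustrative names only.

```python
import xml.parsers.expat


class XmlHandle:
    """Collects start-element and external-entity events in a list."""

    def __init__(self, parser, entities):
        self.parser = parser
        self.entities = entities  # hypothetical: system id -> entity text
        self.events = []

    def se(self, name, attr):
        self.events.append(("s", name))

    def ex(self, context, base, system_id, public_id):
        self.events.append(("x", context, system_id))
        # Parse the external entity with a sub-parser tied to this context.
        sub = self.parser.ExternalEntityParserCreate(context)
        sub.StartElementHandler = self.se
        sub.Parse(self.entities[system_id], True)
        # The handler should return an integer; non-zero means "continue".
        return 1


def scan(text, entities):
    p = xml.parsers.expat.ParserCreate()
    handler = XmlHandle(p, entities)
    p.StartElementHandler = handler.se
    p.ExternalEntityRefHandler = handler.ex
    p.Parse(text, True)
    return handler.events


# Reduced stand-in for s3.xml and bookinfo.xml (assumption: no external DTD).
DOC = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE book [\n'
    '<!ENTITY bookinfo SYSTEM "bookinfo.xml">\n'
    ']>\n'
    '<book>\n'
    '&bookinfo;\n'
    '<chapter id="c1"><title>t</title></chapter>\n'
    '</book>\n'
)
ENTITIES = {"bookinfo.xml": "<bookinfo><title>BookTitle</title></bookinfo>"}

if __name__ == "__main__":
    for event in scan(DOC, ENTITIES):
        print(event)
```

If the return value is indeed the cause, adding `return 1` to `ex` in pbxml.py should be enough to get rid of the TypeError; the sub-parser is only needed if the elements inside bookinfo.xml should be reported as well.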

Hoping for an answer and wishing a happy day,
		LOBI


