[XML-SIG] Bug in exception handling?

Lars Marius Garshol larsga@ifi.uio.no
24 Jun 1999 14:31:53 +0200


* Rob Hooft
| 
| What I (by accident) did find is that it has something to do with
| Refcounting: The current code (drv_pyexpat) looks like:
| 
|         if not self.parser.Parse(fileobj.read(),1):
|             self.__report_error()
| 
| If I replace that by 
|         buf=fileobj.read()
|         if not self.parser.Parse(buf,1):
|             self.__report_error()
| 
| The exception does not dump core.

Aha! Thanks for this observation. I've checked your patch into my
driver source now, so it will be in the next release.

Once I finish my thesis I'll get this SAX work back on the rails
again. A JPython-compatible version, easySAX and SAX2 are all hovering
over me, but there's no time to do them properly now, and I think it's
better not to do them at all than to not do them properly.

Hopefully it's only a matter of weeks. Hopefully.
 
| The "by accident" I'm talking about is that I tried to eliminate the
| "sax" layer from the code, because in the profile listing of a test
| parse, the top routines were all in drv_pyexpat:

This isn't as surprising as it might be. I think the best solution
would be to have the drivers for expat and sgmlop be written entirely
in C.

| I think especially that:
| 
|     def startElement(self,name,attrs):
|         at = {}
|         for i in range(0, len(attrs), 2):
|             at[attrs[i]] = attrs[i+1]
|             
|         self.doc_handler.startElement(name,saxutils.AttributeMap(at))
| 
| is very expensive, as I'm not normally using the attributes on most of
| the elements. For me, a lazy version of AttributeMap would help a bit.

I had some spare time while waiting for my advisor just now, so I wrote
one up for you. It has been lightly tested, but not thoroughly. It's at:

<URL: http://birk105.studby.uio.no/tmp/drv_pyexpat.py>

If you want an even lazier driver you can use this one:

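class LazyExpatDriver(SAX_expat):

    def __init__(self):
        SAX_expat.__init__(self)
        # One map object, created once and reused for every element.
        self.map = LazyAttributeMap([])

    def startElement(self, name, attrs):
        # Swap in expat's flat [name, value, ...] list; build nothing here.
        self.map.list = attrs
        self.doc_handler.startElement(name, self.map)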
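
LazyAttributeMap itself isn't shown in this post, so here is a minimal
sketch of the idea rather than the actual class from the driver on the
web. It assumes only what the code above relies on: a constructor that
takes the flat [name, value, ...] list and a writable "list" attribute.
Everything else (the cached dictionary, the method set) is my guess at a
reasonable shape, written for a current Python:

```python
class LazyAttributeMap:
    """Hypothetical sketch: wrap a flat [name1, value1, name2, value2, ...]
    list and build a dictionary only on the first lookup."""

    def __init__(self, flat_list):
        self.list = flat_list          # goes through the setter below

    @property
    def list(self):
        return self._list

    @list.setter
    def list(self, flat_list):
        self._list = flat_list
        self._dict = None              # drop any dictionary cached earlier

    def _as_dict(self):
        # Pay the conversion cost only when an attribute is actually read.
        if self._dict is None:
            flat = self._list
            self._dict = {flat[i]: flat[i + 1]
                          for i in range(0, len(flat), 2)}
        return self._dict

    def __getitem__(self, name):
        return self._as_dict()[name]

    def get(self, name, default=None):
        return self._as_dict().get(name, default)

    def __len__(self):
        # The length is known without building the dictionary.
        return len(self._list) // 2
```

One consequence of reusing a single map in LazyExpatDriver: a handler
that stores the attrs object and reads it after startElement returns
will see the attributes of a later element, so it should copy whatever
it needs during the callback.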

Feedback on speed differences between these three drivers (original,
the one on the web and the one in this post) would be interesting.
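
If you just want a first impression of the attribute-handling cost
before timing whole parses, something like the following would do. It is
only a rough sketch: eager_map copies the loop from the original driver,
LazyLookup is a hypothetical stand-in for a lazy map (not the class from
my driver), and the sample attribute list is made up. It measures the
per-element conversion only, not a real parse:

```python
import timeit


def eager_map(attrs):
    # Same loop as the original driver: always build the dictionary.
    at = {}
    for i in range(0, len(attrs), 2):
        at[attrs[i]] = attrs[i + 1]
    return at


class LazyLookup:
    """Hypothetical stand-in for a lazy map: keeps the flat list and
    searches it only when an attribute is requested."""

    def __init__(self, attrs):
        self.list = attrs

    def __getitem__(self, name):
        flat = self.list
        for i in range(0, len(flat), 2):
            if flat[i] == name:
                return flat[i + 1]
        raise KeyError(name)


def time_conversion(convert, attrs, number=100000):
    # Total seconds for `number` conversions of the same attribute list.
    return timeit.timeit(lambda: convert(attrs), number=number)


SAMPLE = ["id", "n1", "class", "note", "lang", "en"]  # made-up attributes

eager_seconds = time_conversion(eager_map, SAMPLE)
lazy_seconds = time_conversion(LazyLookup, SAMPLE)
print("eager: %.4f s   lazy: %.4f s" % (eager_seconds, lazy_seconds))
```

Numbers from a full parse of your real documents will tell you more than
this, since the handler's own work and the parser itself are left out
here.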

--Lars M.