[XML-SIG] lxml iterparse and comments
stefan_ml at behnel.de
Tue Mar 25 22:04:02 CET 2008
Stuart McGraw wrote:
>> Stuart McGraw wrote:
>> > I am probably missing something elementary (I am new
>> > to both xml and lxml), but I am having problems figuring
>> > out how to get comments when using lxml's iterparse().
>> > When I parse xml with parse() and iterate through the
>> > result, I get the comments. But when I try to do the
>> > same thing (approximately I think) with iterparse,
>> > I don't see any comments.
>> While the comments end up in the tree that iterparse generates, they
>> do not show up in the events. Now that you mention it, I
>> actually think that should change. There should be events
>> "comment" and "pi" that yield them if requested.
> That would be ideal, from my perspective. It also seems
> more consistent with the other interfaces (parse, parse target,
Implemented on the trunk, will be in lxml 2.1.
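
For illustration, here is a minimal sketch of how the "comment" event might be requested from iterparse(). It assumes the events are named in the events tuple like "start" and "end", as described above. The collect_comments helper and the shortened sample data are made up for this example, and the fallback to the standard library's ElementTree (which accepts the same event names in Python 3.8 and later) is only there so the sketch runs without lxml. It uses Python 3 syntax, unlike the test code quoted below.

```python
import io

try:
    import lxml.etree as ET  # assumes lxml 2.1 or later for the "comment" event
except ImportError:
    import xml.etree.ElementTree as ET  # stdlib fallback, Python 3.8 or later

# Shortened sample data; both comments sit inside the root element.
XML = b"""<Test>
<!-- Chronosynclastic Infindibulum Listing -->
<entry>text 1</entry>
<!-- Deleted: A1500477 -->
<entry>text 2</entry>
</Test>"""


def collect_comments(data):
    """Return the stripped text of every comment that iterparse reports."""
    comments = []
    # "comment" has to be requested explicitly; by default only "end" is reported.
    for event, node in ET.iterparse(io.BytesIO(data), events=("end", "comment")):
        if event == "comment":
            comments.append(node.text.strip())
    return comments


comments = collect_comments(XML)
print(comments)
```

Adding "pi" to the events tuple should report processing instructions in the same way.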
>> Have you tried the parser target interface?
> I am having trouble getting it to work. Specifically, the test
> code below produces the output I expected when run with
> cElementTree, but with lxml, it is missing "end" callbacks,
> the second "start(entry) " callback, and the resolved entity
> text. Am I doing something wrong?
> Test code:
> #import xml.etree.cElementTree as ET
> import lxml.etree as ET
> from cStringIO import StringIO
> # XML data...
> xmltxt = \
> '''<?xml version="1.0" encoding="UTF-8"?>
> <!-- Rev 1.06 -->
> <!DOCTYPE Test [
> <!ELEMENT Test (entry*)>
> <!ELEMENT entry (#PCDATA)>
> <!-- Description of <entry> element. -->
> <!ENTITY ex "an existential entity">
> ]>
> <!-- File created: 2008-02-27 -->
> <Test>
> <!-- Chronosynclastic Infindibulum Listing -->
> <entry>text 1 is &ex;</entry>
> <!-- Deleted: A1500477 -->
> <entry>text 2</entry>
> </Test>
> '''
> print '\nTargetParser:\n-------------'
> try: XMLParser = ET.XMLParser
> except AttributeError: XMLParser = ET.XMLTreeBuilder
> class EchoTarget:
>     def comment(self, tag):
>         print "comment", tag
>     def start(self, tag, attrib):
>         print "start", tag, attrib
>     def end(self, tag):
>         print "end", tag
>     def data(self, data):
>         print "data", repr(data)
>     def close(self):
>         print "close"
>         return "closed!"
> parser = XMLParser( target = EchoTarget())
> result = ET.parse( StringIO (xmltxt), parser)
I can reproduce that. Seems to require an entity reference in the data,
though. I'll look into it.
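
To compare what two parsers report, it can help to record the target callbacks instead of printing them. The following is a rough sketch under a few assumptions: RecordingTarget is a hypothetical class written for this example, the data is a cut-down version of the test document that keeps the entity reference, and the standard library's ElementTree stands in when lxml is not installed. It uses Python 3 syntax and the feed()/close() parser interface rather than ET.parse().

```python
try:
    import lxml.etree as ET
except ImportError:
    import xml.etree.ElementTree as ET  # stdlib fallback with the same target interface

# Cut-down test document that keeps the internal entity and its reference.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Test [
<!ELEMENT Test (entry*)>
<!ELEMENT entry (#PCDATA)>
<!ENTITY ex "an existential entity">
]>
<Test>
<entry>text 1 is &ex;</entry>
<entry>text 2</entry>
</Test>"""


class RecordingTarget:
    """Hypothetical parser target that records callbacks instead of printing."""

    def __init__(self):
        self.events = []  # ("start" | "end", tag) tuples in call order
        self.text = []    # raw data() chunks; a parser may split text freely

    def start(self, tag, attrib):
        self.events.append(("start", tag))

    def end(self, tag):
        self.events.append(("end", tag))

    def data(self, data):
        self.text.append(data)

    def close(self):
        # Whatever close() returns is handed back by parser.close().
        return self


parser = ET.XMLParser(target=RecordingTarget())
parser.feed(XML)
result = parser.close()

print(result.events)
print(repr("".join(result.text)))
```

On a parser that does not show the problem described above, the recorded events should contain a start/end pair for each element, and the joined text should include the resolved entity. The text chunks are joined before inspection because a parser is free to deliver character data in several data() calls.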