[XML-SIG] lxml iterparse and comments
smcg4191 at frii.com
Mon Mar 24 04:56:59 CET 2008
I am probably missing something elementary (I am new
to both xml and lxml), but I am having problems figuring
out how to get comments when using lxml's iterparse().

When I parse xml with parse() and iterate through the
result, I get the comments. But when I try to do the
same thing (approximately, I think) with iterparse(),
I don't see any comments. See example code below.

(I was using the standard Python ElementTree, but my
understanding is that it doesn't save comments at all.
If that's wrong, I would go back to using it.)

The real file is ~50MB and has about 1M nodes under the
root, so I have to use iterparse(), and I also have to process
comments, so I would really appreciate a clue about how
to do it. Thanks.

import lxml.etree as ET
from cStringIO import StringIO

# XML data...
xmltxt = \
'''<?xml version="1.0" encoding="UTF-8"?>
<!-- Rev 1.06 -->
<!DOCTYPE Test [
<!ELEMENT Test (entry*)>
<!ELEMENT entry ANY>
<!-- Description of <entry> element. -->
]>
<!-- File created: 2008-02-27 -->
<Test>
<!-- Chronosynclastic Infindibulum Listing -->
<!-- Deleted: A1500477 -->
</Test>
'''

# Full parse: iterating over the tree yields the comments.
et = ET.parse( StringIO (xmltxt))
for elem in et.iter():
    print elem

# Incremental parse: no comments show up here.
xx = ET.iterparse( StringIO (xmltxt), ("start","end"))
for event, elem in iter(xx):
    print event, elem
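
For comparison, here is a minimal sketch of one way to see comments during an incremental parse. It assumes an iterparse() that accepts a "comment" entry in its events argument. The standard-library xml.etree.ElementTree.iterparse() in Python 3.8 and later documents that event, and current lxml documentation lists the same event name, but nothing in this post establishes that the lxml version used above supports it. The sample data, the function name iter_comments_and_entries, and the choice to clear each entry after use are all illustrative assumptions, not part of the original script.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data, loosely modelled on the post's file.
xml_bytes = b"""<?xml version="1.0" encoding="UTF-8"?>
<!-- File created: 2008-02-27 -->
<Test>
<!-- Deleted: A1500477 -->
<entry>one</entry>
<entry>two</entry>
</Test>
"""


def iter_comments_and_entries(source):
    """Yield ("comment", text) and ("entry", text) pairs in document order."""
    # Requesting the "comment" event is what makes comments visible here;
    # with only ("start", "end") they are not reported.
    for event, elem in ET.iterparse(source, events=("end", "comment")):
        if event == "comment":
            yield ("comment", elem.text.strip())
        elif elem.tag == "entry":
            yield ("entry", elem.text)
            # Drop the entry's content once handled, to limit memory
            # use on a large file.
            elem.clear()


results = list(iter_comments_and_entries(io.BytesIO(xml_bytes)))
for kind, text in results:
    print(kind, text)
```

Clearing each entry after its "end" event is one common way to keep memory bounded on a file of the size described above. Whether that is enough depends on how much the root element itself accumulates.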