[lxml-dev] comment processing (bug?)
I am having a problem with embedded comments not being ignored by the parser.

From the examples on iterparse and iterwalk at
http://codespeak.net/lxml/parsing.html
I tried the examples and they seem to work as advertised (comments ignored unless 'comment' is in events). However, I changed the input slightly to use an embedded comment:

    from lxml import etree

    commented_xml = '''
    <root>
    <element key='value'>text</element>
    <!-- a comment -->
    </root>'''

    xml = etree.XML(commented_xml)
    context = etree.iterwalk(xml, events=('start', 'end'))
    for action, elem in context:
        print("%s: -%s-" % (action, elem.tag))

    start: -root-
    start: -element-
    end: -element-
    start: -<built-in function Comment>-
    end: -<built-in function Comment>-
    end: -root-

Since events does not contain 'comment', I wasn't expecting the comment, and I seem to be unable to filter it out. Any pointers appreciated.

Kris
kristian kvilekval, 08.06.2010 21:26:
> I am having a problem with embedded comments not being ignored by the parser.
Note that iterwalk() is not related to the parser, it just walks the tree.
> From the examples on iterparse and iterwalk at
> http://codespeak.net/lxml/parsing.html
> I tried the examples and they seem to work as advertised (comments ignored unless 'comment' is in events).
> However, I changed the input slightly to use an embedded comment:
>
>     from lxml import etree
>
>     commented_xml = '''
>     <root>
>     <element key='value'>text</element>
>     <!-- a comment -->
>     </root>'''
>
>     xml = etree.XML(commented_xml)
>     context = etree.iterwalk(xml, events=('start', 'end'))
>     for action, elem in context:
>         print("%s: -%s-" % (action, elem.tag))
>
>     start: -root-
>     start: -element-
>     end: -element-
>     start: -<built-in function Comment>-
>     end: -<built-in function Comment>-
>     end: -root-
>
> Since events does not contain 'comment', I wasn't expecting the comment, and I seem to be unable to filter it out. Any pointers appreciated.
Yes, that's a bug. Comments and PIs shouldn't be returned unless explicitly requested. That's the behaviour of iterparse(), and iterwalk() should behave the same. I'll see if I can fix that for 2.3.

Could you please file a bug report in the Launchpad tracker? Thanks!

Stefan
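
For anyone hitting the same behaviour before a fix is available, one possible workaround is to skip anything whose tag is not a string. Comments and PIs report a factory function (etree.Comment, etree.ProcessingInstruction) as their .tag, as the output above shows for the comment. This is only a sketch, assuming Python 3 and an installed lxml; the helper name walk_elements_only is made up for illustration and is not part of lxml.

```python
from lxml import etree

commented_xml = """<root>
  <element key='value'>text</element>
  <!-- a comment -->
</root>"""


def walk_elements_only(root, events=("start", "end")):
    """Hypothetical helper: iterwalk() over root, skipping comments and PIs."""
    for action, elem in etree.iterwalk(root, events=events):
        # Real elements have a string tag; comments and processing
        # instructions carry a factory function in .tag instead.
        if not isinstance(elem.tag, str):
            continue
        yield action, elem


root = etree.XML(commented_xml)
seen = [(action, elem.tag) for action, elem in walk_elements_only(root)]
for action, tag in seen:
    print("%s: -%s-" % (action, tag))
```

The filter is harmless on versions where iterwalk() already leaves comments out, since it then never sees a non-string tag.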
participants (2)
- kristian kvilekval
- Stefan Behnel