[issue17902] Document that _elementtree C API cannot use custom TreeBuilder for iterparse or IncrementalParser
New submission from Aaron Oakley:

It would really help to document that the C API can only use the default xml.etree.ElementTree.TreeBuilder for targets with iterparse (and by extension, IncrementalParser). I got a nice surprise about that when I went from 3.2 to 3.3 and started getting "TypeError: event handling only supported for ElementTree.TreeBuilder targets".

I included a patch to add notes to iterparse and IncrementalParser, but I'm not sure what to refer to the C module as since xml.etree.cElementTree is deprecated.

----------
assignee: docs@python
components: Documentation, XML
files: elementtree.rst-340a0.patch
keywords: patch
messages: 188329
nosy: Aaron.Oakley, docs@python
priority: normal
severity: normal
status: open
title: Document that _elementtree C API cannot use custom TreeBuilder for iterparse or IncrementalParser
type: behavior
versions: Python 3.4
Added file: http://bugs.python.org/file30119/elementtree.rst-340a0.patch

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue17902>
_______________________________________
Changes by Antoine Pitrou <pitrou@free.fr>:

----------
nosy: +eli.bendersky
stage: -> patch review
versions: +Python 3.3
Eli Bendersky added the comment:

Aaron, could you please sign the PSF CLA (http://www.python.org/psf/contrib/contrib-form/)? This will make accepting patches from you easier.

Other than that, I agree it's a legit patch. The alternative would be to fix _elementtree to actually allow arbitrary TreeBuilders there, although I'm not sure it's worth the effort.

----------
Aaron Oakley added the comment:

So sorry, I just found the emails from the bug tracker in my spam folder. Anyhow, I've now signed the CLA.

----------
Roundup Robot added the comment:

New changeset a5a5ba4f71ad by Eli Bendersky in branch '3.3':
Issue #17902: Clarify doc of ElementTree.iterparse
http://hg.python.org/cpython/rev/a5a5ba4f71ad

New changeset 96f45011957e by Eli Bendersky in branch 'default':
Issue #17902: Clarify doc of ElementTree.iterparse and IncrementalParser
http://hg.python.org/cpython/rev/96f45011957e

----------
nosy: +python-dev
Changes by Eli Bendersky <eliben@gmail.com>:

----------
resolution: -> fixed
stage: patch review -> committed/rejected
status: open -> closed
Eli Bendersky added the comment:

Aaron - could you describe your use case of passing a custom parser into iterparse? We're currently considering deprecating the feature of passing a parser into iterparse in a future release (this is being discussed in issue 17741).

----------
Aaron Oakley added the comment:
From memory, the use case at the time was using a custom TreeBuilder sub-class fed into a builtin XMLParser object. The code would construct a builder separately and keep a reference to it around. The builder would delegate calls to start(), data(), end(), and close() to super and save the completed tree when its close() was called.
my_builder = CustomTreeBuilder()
et_parser = ET.XMLParser(target=my_builder)

for (evt, elem) in ET.iterparse("...", events, parser=et_parser):
    pass # Do first processing

tree = my_builder.root # Saved tree

It was done like this initially so that some data (I can't recall exactly what) from the XML input could be processed first very conveniently using the parse events from iterparse while allowing the whole tree to be retrieved afterwards.

That said, the project later moved to using lxml for various features not contained in xml.etree.ElementTree, and I don't think the process I described is still being used.

----------
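For readers trying to picture the builder described in the message above, here is a minimal sketch of the pattern. It is only an illustration under stated assumptions: the names RecordingTreeBuilder, events, and parse_with_builder are hypothetical and do not come from the original project, and the sketch feeds XMLParser directly instead of going through iterparse, because passing a custom target through iterparse is exactly what this issue reports as unsupported.

```python
import xml.etree.ElementTree as ET


class RecordingTreeBuilder(ET.TreeBuilder):
    """Hypothetical TreeBuilder subclass: delegates to the base class,
    records start/end events, and saves the finished tree on close()."""

    def __init__(self):
        super().__init__()
        self.events = []   # (event, tag) pairs seen so far
        self.root = None   # filled in when close() is called

    def start(self, tag, attrs):
        elem = super().start(tag, attrs)
        self.events.append(("start", tag))
        return elem

    def end(self, tag):
        elem = super().end(tag)
        self.events.append(("end", tag))
        return elem

    def close(self):
        # Keep a reference to the completed tree, as in the described use case.
        self.root = super().close()
        return self.root


def parse_with_builder(xml_text):
    """Parse xml_text with a custom target and return the builder."""
    builder = RecordingTreeBuilder()
    parser = ET.XMLParser(target=builder)
    parser.feed(xml_text)
    parser.close()  # calls builder.close(), which stores builder.root
    return builder


if __name__ == "__main__":
    b = parse_with_builder("<a><b>x</b><c/></a>")
    print(b.events)
    print(b.root.tag)
```

The builder keeps both the event log and the finished tree, so the first pass over the events and the later access to the whole tree do not need a second parse.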
Eli Bendersky added the comment:

On Mon, Aug 26, 2013 at 7:11 PM, Aaron Oakley <report@bugs.python.org> wrote:

> Aaron Oakley added the comment:
>
> From memory, the use case at the time was using a custom TreeBuilder sub-class fed into a builtin XMLParser object. The code would construct a builder separately and keep a reference to it around. The builder would delegate calls to start(), data(), end(), and close() to super and save the completed tree when its close() was called.
>
> my_builder = CustomTreeBuilder()
> et_parser = ET.XMLParser(target=my_builder)
>
> for (evt, elem) in ET.iterparse("...", events, parser=et_parser):
>     pass # Do first processing
>
> tree = my_builder.root # Saved tree
>
> It was done like this initially so that some data (I can't recall exactly what) from the XML input could be processed first very conveniently using the parse events from iterparse while allowing the whole tree to be retrieved afterwards.
>
> That said, the project later moved to using lxml for various features not contained in xml.etree.ElementTree, and I don't think the process I described is still being used.

Thanks for the information, Aaron; much appreciated.

----------
participants (4)
- Aaron Oakley
- Antoine Pitrou
- Eli Bendersky
- Roundup Robot