[Python-Dev] xml.etree.ElementTree.IncrementalParser (was: ElementTree iterparse string)

Antoine Pitrou solipsis at pitrou.net
Thu Aug 8 10:20:39 CEST 2013


Hi,

On Thu, 08 Aug 2013 06:33:42 +0200,
Stefan Behnel <stefan_ml at behnel.de> wrote:
> [from python-ideas]
> 
> Antoine Pitrou, 07.08.2013 08:04:
> > Take a look at IncrementalParser:
> > http://docs.python.org/dev/library/xml.etree.elementtree.html#incremental-parsing
> 
> Hmm, that seems to be a somewhat recent addition (April 2013). I
> would have preferred hearing about it before it got added.
> 
> I don't like the fact that it adds a second interface to iterparse()
> that allows injecting arbitrary content into the parser. You can now
> run iterparse() to read from a file, and at an arbitrary iteration
> position, send it a byte string to parse from, before it goes reading
> more data from the file. Or take out some events before iteration
> continues.
> 
> I think the implementation should be changed to make iterparse()
> return something that wraps an IncrementalParser, not something that
> is an IncrementalParser.

That sounds reasonable. Do you want to post a patch? :-)
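
For illustration, here is a minimal sketch of the "wraps, not is" idea.
It is not the actual iterparse() implementation. It assumes the
XMLPullParser class found in released versions of
xml.etree.ElementTree (with feed(), close() and read_events()) as a
stand-in for the IncrementalParser discussed here, and the class name
WrappedIterParse and its chunk_size parameter are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


class WrappedIterParse:
    """Hypothetical wrapper: holds a pull parser privately and exposes
    only iteration, so callers cannot inject data or drain events."""

    def __init__(self, source, events=("end",), chunk_size=16384):
        self._source = source              # file-like object, read in chunks
        self._parser = ET.XMLPullParser(events=events)
        self._chunk_size = chunk_size      # assumed default, not from the thread

    def __iter__(self):
        while True:
            data = self._source.read(self._chunk_size)
            if not data:
                break
            self._parser.feed(data)
            # Hand out whatever events the parser has produced so far.
            yield from self._parser.read_events()
        # Signal end of input, then hand out the remaining events.
        self._parser.close()
        yield from self._parser.read_events()


wrapped = WrappedIterParse(io.BytesIO(b"<root><a/><b/></root>"), chunk_size=4)
tags = [elem.tag for _event, elem in wrapped]
```

Because the parser is only reachable through a private attribute, the
object has no public method for feeding data, which is the separation
asked for above.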

> Also, IMO it should mimic the interface of the TreeBuilder, which
> calls the data reception method "data()" and the termination method
> "close()". There is no reason to add yet another set of method names
> just to do what others do already.

Well, the difference here is that after calling eof_received() you can
still (and should) call events() once to get the last events. I think
it would be weird if you could still do something useful with the object
after calling close().

Also, the method names are not invented, they mimic the PEP 3156
stream protocols:
http://www.python.org/dev/peps/pep-3156/#stream-protocols
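
The call order described above (feed data, signal end of input, then
collect the remaining events) can be sketched as follows. This is only
an illustration under one assumption: it uses the XMLPullParser class
from released versions of xml.etree.ElementTree, so the method names
are feed(), close() and read_events() rather than data_received(),
eof_received() and events().

```python
import xml.etree.ElementTree as ET

parser = ET.XMLPullParser(events=("start", "end"))
collected = []

# Feed the document in two pieces, draining events after each one.
# How many events are available after a partial feed may depend on
# the underlying expat version, so only the total is relied on here.
parser.feed("<root><a>1</a>")
collected += [(event, elem.tag) for event, elem in parser.read_events()]

parser.feed("<b>2</b></root>")
# Signal that no more data will arrive ...
parser.close()
# ... and then read events one more time to get the last ones.
collected += [(event, elem.tag) for event, elem in parser.read_events()]
```

After the final read, collected holds the start and end events for
root, a and b in document order.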

Regards

Antoine.



