[Python-checkins] cpython: whatsnew: XMLPullParser, plus some doc updates.

Nick Coghlan ncoghlan at gmail.com
Mon Jan 6 16:22:21 CET 2014


On 5 Jan 2014 12:54, "r.david.murray" <python-checkins at python.org> wrote:
>
> http://hg.python.org/cpython/rev/069f88f4935f
> changeset:   88308:069f88f4935f
> user:        R David Murray <rdmurray at bitdance.com>
> date:        Sat Jan 04 23:52:50 2014 -0500
> summary:
>   whatsnew: XMLPullParser, plus some doc updates.
>
> I was confused by the text saying that read_events "iterated", since it
> actually returns an iterator (that's what a generator does) that the
> caller must then iterate.  So I tidied up the language.  I'm not sure
> what the sentence "Events provided in a previous call to read_events()
> will not be yielded again." is trying to convey, so I didn't try to fix
> that.

It's a mutating API - once the events have been retrieved, that's it,
they're gone from the internal state. Suggestions for wording improvements
welcome :)
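
For illustration, here is a minimal sketch of what "mutating" means in practice. It uses only the XMLPullParser calls shown in the diff; the variable names (first, second, third) are just made up for this example:

```python
import xml.etree.ElementTree as ET

parser = ET.XMLPullParser(['start', 'end'])
parser.feed('<mytag>sometext')

# The first call drains the events queued so far.
first = [(event, elem.tag) for event, elem in parser.read_events()]
print(first)

# A second call without feeding more data has nothing left to yield:
# the 'start' event was consumed above and is not replayed.
second = list(parser.read_events())
print(second)

# Only events generated by newly fed data show up afterwards.
parser.feed(' more text</mytag>')
third = [(event, elem.tag) for event, elem in parser.read_events()]
print(third)
```

So each event is handed out exactly once, to whichever iterator happens to pull it off the queue first.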

Cheers,
Nick.

>
> Also fixed a couple more news items.
>
> files:
>   Doc/library/xml.etree.elementtree.rst |  23 +++++++++-----
>   Doc/whatsnew/3.4.rst                  |   7 ++-
>   Lib/xml/etree/ElementTree.py          |   2 +-
>   Misc/NEWS                             |  12 +++---
>   4 files changed, 25 insertions(+), 19 deletions(-)
>
>
> diff --git a/Doc/library/xml.etree.elementtree.rst b/Doc/library/xml.etree.elementtree.rst
> --- a/Doc/library/xml.etree.elementtree.rst
> +++ b/Doc/library/xml.etree.elementtree.rst
> @@ -105,12 +105,15 @@
>     >>> root[0][1].text
>     '2008'
>
> +
> +.. _elementtree-pull-parsing:
> +
>  Pull API for non-blocking parsing
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> -Most parsing functions provided by this module require to read the whole
> -document at once before returning any result.  It is possible to use a
> -:class:`XMLParser` and feed data into it incrementally, but it's a push API that
> +Most parsing functions provided by this module require the whole document
> +to be read at once before returning any result.  It is possible to use an
> +:class:`XMLParser` and feed data into it incrementally, but it is a push API that
>  calls methods on a callback target, which is too low-level and inconvenient for
>  most needs.  Sometimes what the user really wants is to be able to parse XML
>  incrementally, without blocking operations, while enjoying the convenience of
> @@ -119,7 +122,7 @@
>  The most powerful tool for doing this is :class:`XMLPullParser`.  It does not
>  require a blocking read to obtain the XML data, and is instead fed with data
>  incrementally with :meth:`XMLPullParser.feed` calls.  To get the parsed XML
> -elements, call :meth:`XMLPullParser.read_events`.  Here's an example::
> +elements, call :meth:`XMLPullParser.read_events`.  Here is an example::
>
>     >>> parser = ET.XMLPullParser(['start', 'end'])
>     >>> parser.feed('<mytag>sometext')
> @@ -1038,15 +1041,17 @@
>
>     .. method:: read_events()
>
> -      Iterate over the events which have been encountered in the data fed to the
> -      parser.  This method yields ``(event, elem)`` pairs, where *event* is a
> +      Return an iterator over the events which have been encountered in the
> +      data fed to the
> +      parser.  The iterator yields ``(event, elem)`` pairs, where *event* is a
>        string representing the type of event (e.g. ``"end"``) and *elem* is the
>        encountered :class:`Element` object.
>
>        Events provided in a previous call to :meth:`read_events` will not be
> -      yielded again. As events are consumed from the internal queue only as
> -      they are retrieved from the iterator, multiple readers calling
> -      :meth:`read_events` in parallel will have unpredictable results.
> +      yielded again.  Events are consumed from the internal queue only when
> +      they are retrieved from the iterator, so multiple readers iterating in
> +      parallel over iterators obtained from :meth:`read_events` will have
> +      unpredictable results.
>
>     .. note::
>
> diff --git a/Doc/whatsnew/3.4.rst b/Doc/whatsnew/3.4.rst
> --- a/Doc/whatsnew/3.4.rst
> +++ b/Doc/whatsnew/3.4.rst
> @@ -1088,9 +1088,10 @@
>  xml.etree
>  ---------
>
> -Add an event-driven parser for non-blocking applications,
> -:class:`~xml.etree.ElementTree.XMLPullParser`.
> -(Contributed by Antoine Pitrou in :issue:`17741`.)
> +A new parser, :class:`~xml.etree.ElementTree.XMLPullParser`, allows a
> +non-blocking applications to parse XML documents.  An example can be
> +seen at :ref:`elementtree-pull-parsing`.  (Contributed by Antoine
> +Pitrou in :issue:`17741`.)
>
>  The :mod:`xml.etree.ElementTree` :func:`~xml.etree.ElementTree.tostring` and
>  :func:`~xml.etree.ElementTree.tostringlist` functions, and the
> diff --git a/Lib/xml/etree/ElementTree.py b/Lib/xml/etree/ElementTree.py
> --- a/Lib/xml/etree/ElementTree.py
> +++ b/Lib/xml/etree/ElementTree.py
> @@ -1251,7 +1251,7 @@
>          self._close_and_return_root()
>
>      def read_events(self):
> -        """Iterate over currently available (event, elem) pairs.
> +        """Return an iterator over currently available (event, elem) pairs.
>
>          Events are consumed from the internal event queue as they are
>          retrieved from the iterator.
> diff --git a/Misc/NEWS b/Misc/NEWS
> --- a/Misc/NEWS
> +++ b/Misc/NEWS
> @@ -2193,14 +2193,14 @@
>  - Issue #17555: Fix ForkAwareThreadLock so that size of after fork
>    registry does not grow exponentially with generation of process.
>
> -- Issue #17707: multiprocessing.Queue's get() method does not block for short
> -  timeouts.
> -
> -- Isuse #17720: Fix the Python implementation of pickle.Unpickler to correctly
> +- Issue #17707: fix regression in multiprocessing.Queue's get() method where
> +  it did not block for short timeouts.
> +
> +- Issue #17720: Fix the Python implementation of pickle.Unpickler to correctly
>    process the APPENDS opcode when it is used on non-list objects.
>
> -- Issue #17012: shutil.which() no longer fallbacks to the PATH environment
> -  variable if empty path argument is specified.  Patch by Serhiy Storchaka.
> +- Issue #17012: shutil.which() no longer falls back to the PATH environment
> +  variable if an empty path argument is specified.  Patch by Serhiy Storchaka.
>
>  - Issue #17710: Fix pickle raising a SystemError on bogus input.
>
>
> --
> Repository URL: http://hg.python.org/cpython
>
> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> https://mail.python.org/mailman/listinfo/python-checkins
>