[Python-checkins] cpython (merge 3.2 -> default): Issue #14202: Greatly enhance the documentation of xml.dom.pulldom.

eli.bendersky python-checkins at python.org
Fri Mar 16 13:42:03 CET 2012


http://hg.python.org/cpython/rev/5d118a154ba3
changeset:   75732:5d118a154ba3
parent:      75730:870c0ef7e8a2
parent:      75731:172630a3e6d8
user:        Eli Bendersky <eliben at gmail.com>
date:        Fri Mar 16 14:40:13 2012 +0200
summary:
  Issue #14202: Greatly enhance the documentation of xml.dom.pulldom.
Patch by Florian Mladitsch

files:
  Doc/library/xml.dom.pulldom.rst |  84 +++++++++++++++++----
  1 files changed, 68 insertions(+), 16 deletions(-)


diff --git a/Doc/library/xml.dom.pulldom.rst b/Doc/library/xml.dom.pulldom.rst
--- a/Doc/library/xml.dom.pulldom.rst
+++ b/Doc/library/xml.dom.pulldom.rst
@@ -9,33 +9,72 @@
 
 --------------
 
-:mod:`xml.dom.pulldom` allows building only selected portions of a Document
-Object Model representation of a document from SAX events.
+The :mod:`xml.dom.pulldom` module provides a "pull parser" which can also be
+asked to produce DOM-accessible fragments of the document where necessary. The
+basic concept involves pulling "events" from a stream of incoming XML and
+processing them. In contrast to SAX, which also employs an event-driven
+processing model together with callbacks, the user of a pull parser is
+responsible for explicitly pulling events from the stream, looping over those
+events until either processing is finished or an error condition occurs.
 
+Example::
 
-.. class:: PullDOM(documentFactory=None)
+   from xml.dom import pulldom
 
-   :class:`xml.sax.handler.ContentHandler` implementation that ...
+   doc = pulldom.parse('sales_items.xml')
+   for event, node in doc:
+       if event == pulldom.START_ELEMENT and node.tagName == 'item':
+           if int(node.getAttribute('price')) > 50:
+               doc.expandNode(node)
+               print(node.toxml())
 
+``event`` is a constant and can be one of:
 
-.. class:: DOMEventStream(stream, parser, bufsize)
+* :data:`START_ELEMENT`
+* :data:`END_ELEMENT`
+* :data:`COMMENT`
+* :data:`START_DOCUMENT`
+* :data:`END_DOCUMENT`
+* :data:`CHARACTERS`
+* :data:`PROCESSING_INSTRUCTION`
+* :data:`IGNORABLE_WHITESPACE`
 
-   ...
+``node`` is an object of type :class:`xml.dom.minidom.Document`,
+:class:`xml.dom.minidom.Element` or :class:`xml.dom.minidom.Text`.
+
+Since the document is treated as a "flat" stream of events, the document "tree"
+is implicitly traversed and the desired elements are found regardless of their
+depth in the tree. In other words, one does not need to consider hierarchical
+issues such as recursive searching of the document nodes. If the context of
+elements matters, one must either maintain some context-related state (i.e.
+remember where one is in the document at any given point) or use the
+:meth:`DOMEventStream.expandNode` method and switch to DOM-related processing.
+
+
+.. class:: PullDOM(documentFactory=None)
+
+   Subclass of :class:`xml.sax.handler.ContentHandler`.
 
 
 .. class:: SAX2DOM(documentFactory=None)
 
-   :class:`xml.sax.handler.ContentHandler` implementation that ...
+   Subclass of :class:`xml.sax.handler.ContentHandler`.
 
 
 .. function:: parse(stream_or_string, parser=None, bufsize=None)
 
-   ...
+   Return a :class:`DOMEventStream` from the given input. *stream_or_string* may
+   be either a file name or a file-like object. *parser*, if given, must be an
+   :class:`~xml.sax.xmlreader.XMLReader` object. This function will change the
+   document handler of the parser and activate namespace support; other parser
+   configuration (like setting an entity resolver) must have been done in advance.
+
+If you have XML in a string, you can use the :func:`parseString` function instead:
 
 
 .. function:: parseString(string, parser=None)
 
-   ...
+   Return a :class:`DOMEventStream` that represents the (unicode) *string*.
 
 
 .. data:: default_bufsize
@@ -51,18 +90,31 @@
 DOMEventStream Objects
 ----------------------
 
+.. class:: DOMEventStream(stream, parser, bufsize)
 
-.. method:: DOMEventStream.getEvent()
 
-   ...
+   .. method:: DOMEventStream.getEvent()
 
+      Return a tuple containing *event* and the current *node* as
+      :class:`xml.dom.minidom.Document` if event equals START_DOCUMENT,
+      :class:`xml.dom.minidom.Element` if event equals START_ELEMENT or
+      END_ELEMENT or :class:`xml.dom.minidom.Text` if event equals CHARACTERS.
+      The current node does not contain information about its children, unless
+      :meth:`expandNode` is called.
 
-.. method:: DOMEventStream.expandNode(node)
+   .. method:: DOMEventStream.expandNode(node)
 
-   ...
+      Expands all children of *node* into *node*. Example::
 
+          xml = '<html><title>Foo</title> <p>Some text <div>and more</div></p> </html>'
+          doc = pulldom.parseString(xml)
+          for event, node in doc:
+              if event == pulldom.START_ELEMENT and node.tagName == 'p':
+                  # Following statement only prints '<p/>'
+                  print(node.toxml())
+                  doc.expandNode(node)
+                  # Following statement prints node with all its children '<p>Some text <div>and more</div></p>'
+                  print(node.toxml())
 
-.. method:: DOMEventStream.reset()
+   .. method:: DOMEventStream.reset()
 
-   ...
-
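
The two strategies the patched text describes (expanding a node into a DOM
fragment, and keeping context state while pulling events) can be sketched
together as follows. This is only an illustrative sketch, not part of the
patch: the sample XML, the helper names ``expensive_items`` and
``element_paths``, and the ``threshold`` parameter are assumptions made up for
the example; only ``pulldom.parseString``, the event constants and
``expandNode`` come from the module itself.

```python
from xml.dom import pulldom

# Hypothetical sample data, made up for this sketch.
SALES_XML = (
    '<sales>'
    '<item price="30"><name>Pen</name></item>'
    '<item price="75"><name>Lamp</name></item>'
    '<group><item price="120"><name>Desk</name></item></group>'
    '</sales>'
)


def expensive_items(xml_text, threshold=50):
    """Return the serialized <item> elements whose price exceeds threshold."""
    doc = pulldom.parseString(xml_text)
    found = []
    for event, node in doc:
        if event == pulldom.START_ELEMENT and node.tagName == 'item':
            if int(node.getAttribute('price')) > threshold:
                # Pull the children of this element into the node so that
                # it can be handled as an ordinary DOM fragment.
                doc.expandNode(node)
                found.append(node.toxml())
    return found


def element_paths(xml_text):
    """Return a slash-separated path for every element, in document order."""
    stack = []   # context state: names of the currently open elements
    paths = []
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT:
            stack.append(node.tagName)
            paths.append('/'.join(stack))
        elif event == pulldom.END_ELEMENT:
            stack.pop()
    return paths


items = expensive_items(SALES_XML)
paths = element_paths(SALES_XML)

if __name__ == '__main__':
    for item in items:
        print(item)
    for path in paths:
        print(path)
```

Note that ``expandNode`` consumes the events up to the end of the expanded
element, so the two techniques are kept in separate loops here; combining them
in one loop would require popping the stack manually after each expansion.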

-- 
Repository URL: http://hg.python.org/cpython

