On Tue, 2004-08-31 at 10:49, Brian Quinlan wrote:
> I'm trying to remove the whitespace-only text nodes in my XML DOM. I've 
> tried two approaches:
> 1. StripXml - generates an exception:
>    File "mac.py", line 25, in __init__
>      StripXml(self.document)
>    File "/usr/lib/python2.3/site-packages/_xmlplus/dom/ext/__init__.py", line 153, in StripXml
>      snit = owner_doc.createNodeIterator(startNode, NodeFilter.SHOW_TEXT,
> AttributeError: Document instance has no attribute 'createNodeIterator'

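StripXml only works on 4DOM nodes :-(  DOMBuilder hands you a minidom
document, and minidom does not implement createNodeIterator, hence the
AttributeError.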

> 2. setFeature('whitespace_in_element_content', False) seems to do
>     nothing

What SAX parser?

> My code is here:
> from xml import xpath, dom
> from xml.dom.ext import StripXml
> from xml.dom.xmlbuilder import DOMInputSource, DOMBuilder
> from optparse import OptionParser
> from pprint import pprint
> import os
> b = DOMBuilder()
> b.setFeature('whitespace_in_element_content', False)
> self.document = b.parse(...)
> StripXml(self.document)
> My XML does not include a DTD or any declarations regarding whitespace.
> Can anyone offer any advice?

I usually use simple generator code for this sort of thing.  See


Using domtools from that article, or a more recent version of the


You could do something like (untested):

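from xml.dom import Node
# list() collects the matches first, so the tree is not modified mid-walk
ws_only_nodes = list(domtools.doc_order_iter_filter(
  node, lambda n: n.nodeType == Node.TEXT_NODE and not n.data.strip()))
for ws_node in ws_only_nodes:
  ws_node.parentNode.removeChild(ws_node)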
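
If you don't have domtools to hand, here is a self-contained sketch of the
same idea using only minidom from the standard library. Note the
assumptions: doc_order_iter_filter below is a stand-in I wrote for this
example, not the domtools implementation, and strip_ws_text_nodes and the
sample document are made up for illustration.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def doc_order_iter_filter(node, filter_func):
    """Yield node and its descendants, in document order, where filter_func is true.

    Stand-in for the domtools helper of the same name (an assumption,
    written for this example only).
    """
    if filter_func(node):
        yield node
    for child in node.childNodes:
        for match in doc_order_iter_filter(child, filter_func):
            yield match


def strip_ws_text_nodes(root):
    """Remove whitespace-only text nodes under root; return how many were removed."""
    # Build the full list before removing anything, so the generator is
    # never walking a child list that is being modified underneath it.
    ws_only = list(doc_order_iter_filter(
        root,
        lambda n: n.nodeType == Node.TEXT_NODE and not n.data.strip()))
    for text_node in ws_only:
        text_node.parentNode.removeChild(text_node)
    return len(ws_only)


# Hypothetical sample input, just to exercise the function.
doc = parseString("<a>\n  <b>keep me</b>\n  <c> </c>\n</a>")
removed = strip_ws_text_nodes(doc)
print(removed, doc.documentElement.toxml())
```

Be aware that this drops every whitespace-only text node, including the
single space inside <c>, so it is only appropriate when no such whitespace
is significant in your document.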

