help with recursive whitespace filter in

MRAB google at
Sun May 10 19:35:59 CEST 2009

rustom wrote:
> On May 10, 9:49 pm, Steve Howell <showel... at> wrote:
>> On May 10, 9:10 am, Rustom Mody <rustompm... at> wrote:
>>> I am trying to write a recursive filter to remove whitespace-only
>>> nodes for minidom.
>>> The code is below.
>>> Strangely it deletes some whitespace nodes and leaves some.
>>> If I keep calling it -- like so: fws(fws(fws(doc)))  then at some
>>> stage all the ws nodes disappear
>>> Does anybody have a clue?
>>> from xml.dom.minidom import parse
>>> #The input to fws is the output of parse("something.xml")
>>> def fws(ele):
>>>     """ filter white space (recursive)"""
>>>     for c in ele.childNodes:
>>>         if isWsNode(c):
>>>             ele.removeChild(c)
>>>             #c.unlink() Makes no diff whether this is there or not
>>>         elif c.nodeType == ele.ELEMENT_NODE:
>>>             fws(c)
>>> def isWsNode(ele):
>>>     return (ele.nodeType == ele.TEXT_NODE and not
>>>             ele.data.strip())
>> I would avoid doing things like delete/remove in a loop.  Instead
>> build a list of things to delete.
> Yeah I know. I would write the whole damn thing functionally if I knew
> how.  But I can't figure out the API.
> I actually started out to write a (haskell-style) copy out the whole
> tree minus the unwanted nodes but could not figure it out
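def fws(ele):
     """ filter white space (recursive)"""
     # Collect the whitespace-only children first instead of removing
     # them while iterating over ele.childNodes.
     empty_nodes = []
     for c in ele.childNodes:
         if isWsNode(c):
             empty_nodes.append(c)
         elif c.nodeType == ele.ELEMENT_NODE:
             fws(c)
     # Now it is safe to modify the child list.
     for c in empty_nodes:
         ele.removeChild(c)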
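
The reason the original version leaves some nodes behind is that
ele.childNodes is a live list: removing a child while looping over it
shifts the remaining children down by one, so the loop steps over the
next sibling (and never recurses into it if it is an element). Below is
a small, self-contained sketch that shows both behaviours side by side.
The names is_ws_node, fws_buggy, fws_fixed, count_ws_nodes and SAMPLE
are made up for this illustration, and is_ws_node assumes that "a
whitespace-only node" means a text node whose data is empty after
strip().

```python
from xml.dom.minidom import parseString


def is_ws_node(node):
    """Assumed test: a text node containing nothing but whitespace."""
    return node.nodeType == node.TEXT_NODE and not node.data.strip()


def fws_buggy(ele):
    """Removes children while iterating over the live childNodes list."""
    for c in ele.childNodes:
        if is_ws_node(c):
            # This shrinks the list being iterated, so the next
            # sibling is skipped.
            ele.removeChild(c)
        elif c.nodeType == c.ELEMENT_NODE:
            fws_buggy(c)


def fws_fixed(ele):
    """Collects whitespace-only children first, removes them afterwards."""
    empty_nodes = []
    for c in ele.childNodes:
        if is_ws_node(c):
            empty_nodes.append(c)
        elif c.nodeType == c.ELEMENT_NODE:
            fws_fixed(c)
    for c in empty_nodes:
        ele.removeChild(c)


def count_ws_nodes(node):
    """Counts whitespace-only text nodes anywhere below node."""
    return sum(
        (1 if is_ws_node(c) else 0) + count_ws_nodes(c)
        for c in node.childNodes
    )


# Hypothetical input document, indented so that it contains
# whitespace-only text nodes at two levels.
SAMPLE = "<root>\n  <a>\n    <b>text</b>\n  </a>\n  <c/>\n</root>"

if __name__ == "__main__":
    doc = parseString(SAMPLE)
    fws_buggy(doc)
    print("left over after fws_buggy:", count_ws_nodes(doc))

    doc = parseString(SAMPLE)
    fws_fixed(doc)
    print("left over after fws_fixed:", count_ws_nodes(doc))
```

Iterating over a copy, "for c in list(ele.childNodes):", should work
just as well as building a separate list of nodes to delete.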
