help with recursive whitespace filter in

Steve Howell showell30 at yahoo.com
Sun May 10 20:17:23 EDT 2009


On May 10, 10:23 am, rustom <rustompm... at gmail.com> wrote:
> On May 10, 9:49 pm, Steve Howell <showel... at yahoo.com> wrote:
>
>
>
> > On May 10, 9:10 am, Rustom Mody <rustompm... at gmail.com> wrote:
>
> > > I am trying to write a recursive filter to remove whitespace-only
> > > nodes for minidom.
> > > The code is below.
>
> > > Strangely it deletes some whitespace nodes and leaves some.
> > > If I keep calling it -- like so: fws(fws(fws(doc)))  then at some
> > > stage all the ws nodes disappear
>
> > > Does anybody have a clue?
>
> > > from xml.dom.minidom import parse
>
> > > #The input to fws is the output of parse("something.xml")
>
> > > def fws(ele):
> > >     """ filter white space (recursive)"""
>
> > >     for c in ele.childNodes:
> > >         if isWsNode(c):
> > >             ele.removeChild(c)
> > >             #c.unlink() Makes no diff whether this is there or not
> > >         elif c.nodeType == ele.ELEMENT_NODE:
> > >             fws(c)
>
> > > def isWsNode(ele):
> > >     return (ele.nodeType == ele.TEXT_NODE and not ele.data.strip())
>
> > I would avoid doing things like delete/remove in a loop.  Instead
> > build a list of things to delete.
>
> Yeah I know. I would write the whole damn thing functionally if I knew
> how.  But I can't figure out the API.
> I actually started out to write a (haskell-style) copy of the whole
> tree minus the unwanted nodes but could not figure it out

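You can use list comprehensions for a more functional style.

Some nodes survive because removeChild() shrinks ele.childNodes while
the for loop is still iterating over it, so the node right after each
removed one is never visited.  Instead of deleting the unwanted nodes
in place, try to create new lists of just the wanted results.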
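
Something along these lines might work.  It is only a sketch: the
name is_ws_node, the sample XML string, and the choice to return ele
are my own assumptions, not part of the original code.  The list
comprehension takes a snapshot of the nodes to drop before anything
is removed, so the loop never iterates over a list that is changing
underneath it.

```python
from xml.dom.minidom import parseString


def is_ws_node(node):
    """Return True for text nodes that hold only whitespace."""
    return node.nodeType == node.TEXT_NODE and not node.data.strip()


def fws(ele):
    """Remove whitespace-only text nodes below ele (recursive)."""
    # Build the list of unwanted children first; nothing is removed yet.
    unwanted = [c for c in ele.childNodes if is_ws_node(c)]
    for c in unwanted:
        ele.removeChild(c)
    # Only wanted children remain; recurse into the element nodes.
    for c in ele.childNodes:
        if c.nodeType == c.ELEMENT_NODE:
            fws(c)
    # Returning ele is an addition here so that calls can be chained.
    return ele


# Hypothetical sample input, standing in for parse("something.xml").
doc = parseString("<a>\n  <b> x </b>\n  <c>\n  </c>\n</a>")
fws(doc)
print(doc.documentElement.toxml())
```

One pass should be enough with this version, so there is no need to
call fws repeatedly.  Text nodes that contain anything besides
whitespace, such as " x " above, are left untouched.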


