[Tutor] pyXML DOM 2.0 Traversal and filters
Danny Yoo
dyoo@hkn.eecs.berkeley.edu
Tue Apr 29 15:11:15 2003
On Tue, 29 Apr 2003, Levy Lazarre wrote:
> Given the following sample file ('appliances.xml'), I
> am trying to write a TreeWalker that would print out
> the element
> nodes, while a filter would prevent the nodes with a
> status of "broken" from displaying.
>
> <?xml version="1.0"?>
> <appliances>
> <clock status = "working">cuckoo</clock>
> <television status = "broken">black and
> white</television>
> </appliances>
>
> I am getting some exceptions, making me think that I
> am calling the filter the wrong way or I am missing
> something.
> Can somebody please point me to the right direction?
> Here is the sample code:
Hi Levy,
I haven't played with the filtering stuff yet (I'm more into
xml.dom.pulldom), but let's give it a shot! *grin*
Let's take a look at the code:
> from xml.dom.ext.reader import Sax2
> from xml.dom.NodeFilter import NodeFilter
>
> def filterbroken(thisNode):
>     if (thisNode.nodeType == thisNode.ELEMENT_NODE and
>         thisNode.getAttribute("status") == "broken"):
>         return NodeFilter.FILTER_REJECT
>     return NodeFilter.FILTER_ACCEPT
>
> reader = Sax2.Reader()
>
> input_file = file("appliances.xml")
> doc = reader.fromStream(input_file)
> walker = doc.createTreeWalker(doc.documentElement,
>     NodeFilter.SHOW_ALL, filterbroken, 0)
Ok, let's stop at this point.
The error message we're getting:
> AttributeError: 'function' object has no attribute
> 'acceptNode'
implies that the walker expects filterbroken to be some kind of class
instance, not a plain function: it's probably trying to do something like:
filterbroken.acceptNode()
to call the filter. But let's double check the documentation on
createTreeWalker() and see what it expects:
http://pyxml.sourceforge.net/topics/howto/node22.html
Odd! According to the docs, it expects a function. But according to the
error message,
> "C:\Python22\Lib\site-packages\_xmlplus\dom\TreeWalker.py",
> line 168, in __checkFilter
> return self.__dict__['__filter'].acceptNode(node)
> AttributeError: 'function' object has no attribute
> 'acceptNode'
... it's trying to call an acceptNode() method. So something here is
definitely wrong. Either the code is wrong, or the documentation is wrong.
*grin*
And I expect it's the documentation. Published code that uses
createTreeWalker() does appear to pass in NodeFilter instances, and not
functions:
###
# (part of:
# http://aspn.activestate.com/ASPN/Mail/Message/XML-checkins/954448)
def checkWalkerOnlyTextNodesParentNodeFirstChildFilterSkipB(self):
    class SkipBFilter(NodeFilter):
        def acceptNode(self, node):
            if node.nodeValue == 'B':
                return self.FILTER_SKIP
            else:
                return self.FILTER_ACCEPT
    walker = self.document.createTreeWalker(self.document,
        NodeFilter.SHOW_TEXT, SkipBFilter(), 0)
###
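The mismatch can be reproduced without pyXML at all. Here's a minimal
sketch in plain Python. Everything in it is a stand-in: check_filter() is a
made-up function that imitates what the traceback shows __checkFilter
doing, the dict plays the role of a DOM element, and FunctionFilter is a
hypothetical adapter (not part of pyXML) that gives a plain function an
acceptNode() method. The values 1, 2, 3 are the ones the DOM Level 2
Traversal spec assigns to the filter constants.

```python
# Stand-in constants; DOM Level 2 Traversal numbers them 1, 2, 3.
FILTER_ACCEPT, FILTER_REJECT, FILTER_SKIP = 1, 2, 3


def filterbroken(node):
    """Plain-function filter, in the style of the original post."""
    if node.get("status") == "broken":
        return FILTER_REJECT
    return FILTER_ACCEPT


class FunctionFilter:
    """Hypothetical adapter: wraps a function in an object with acceptNode()."""

    def __init__(self, func):
        self.func = func

    def acceptNode(self, node):
        return self.func(node)


def check_filter(node_filter, node):
    """Imitates the call seen in the traceback: filter.acceptNode(node)."""
    return node_filter.acceptNode(node)


node = {"status": "broken"}  # a dict standing in for a DOM element

# Passing the bare function fails the same way the walker did.
try:
    check_filter(filterbroken, node)
except AttributeError as err:
    error_message = str(err)

# Passing an object with an acceptNode() method works.
wrapped_result = check_filter(FunctionFilter(filterbroken), node)
```

The first call raises the same kind of AttributeError as in the traceback;
the second goes through because the object passed in has the method the
caller is looking for.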
So you may find that this will work:
###
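from xml.dom.ext.reader import Sax2
from xml.dom.NodeFilter import NodeFilter

class FilterBroken(NodeFilter):
    # The walker calls acceptNode() on the filter object for each node.
    def acceptNode(self, thisNode):
        if (thisNode.nodeType == thisNode.ELEMENT_NODE and
            thisNode.getAttribute("status") == "broken"):
            return NodeFilter.FILTER_REJECT
        return NodeFilter.FILTER_ACCEPT

reader = Sax2.Reader()
input_file = file("appliances.xml")
doc = reader.fromStream(input_file)
# Pass an instance of the filter class, not a bare function.
walker = doc.createTreeWalker(doc.documentElement,
                              NodeFilter.SHOW_ALL,
                              FilterBroken(), 0)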
###
If this does do the trick, let's send a holler to the pyxml documentation
maintainers and get them to fix their documentation. *grin*
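If you want to check the filter logic on a machine without pyXML, here's a
rough sketch that uses only the standard library's xml.dom.minidom. minidom
has no createTreeWalker(), so walk_elements() below is a hand-written
stand-in and not pyXML's TreeWalker; the names walk_elements and
APPLIANCES_XML are made up for this example. It treats FILTER_REJECT as
"drop the node and everything under it", which is how DOM Level 2 Traversal
defines it for a TreeWalker.

```python
from xml.dom import minidom, Node

# Stand-in constants; DOM Level 2 Traversal numbers them 1, 2, 3.
FILTER_ACCEPT, FILTER_REJECT, FILTER_SKIP = 1, 2, 3

# The sample document from the original question.
APPLIANCES_XML = """<?xml version="1.0"?>
<appliances>
<clock status = "working">cuckoo</clock>
<television status = "broken">black and
white</television>
</appliances>"""


class FilterBroken:
    """Filter object exposing acceptNode(), like a NodeFilter subclass."""

    def acceptNode(self, node):
        if (node.nodeType == Node.ELEMENT_NODE and
                node.getAttribute("status") == "broken"):
            return FILTER_REJECT
        return FILTER_ACCEPT


def walk_elements(node, node_filter):
    """Yield descendant elements in document order, honoring the filter."""
    for child in node.childNodes:
        verdict = node_filter.acceptNode(child)
        if verdict == FILTER_REJECT:
            continue  # drop this node and its whole subtree
        if verdict == FILTER_ACCEPT and child.nodeType == Node.ELEMENT_NODE:
            yield child
        # Accepted and skipped nodes both have their children visited.
        for descendant in walk_elements(child, node_filter):
            yield descendant


doc = minidom.parseString(APPLIANCES_XML)
names = [el.tagName for el in walk_elements(doc.documentElement, FilterBroken())]
print(names)  # → ['clock']
```

With the sample file, the television element is rejected, so only the clock
shows up.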
Hope this helps!