Extract CDATA Node

Tue Feb 24 08:38:37 EST 2009

On Tue, 24 Feb 2009 05:29:21 -0800 (PST), Girish <girish.cfc at gmail.com> wrote:
>How do I extract a CDATA node in Python? I'm using xml.dom.minidom as
>follows:
>
>import os
>from xml.dom.minidom import Document, parseString
>
>class XMLDocument():
>
>    def __init__(self):
>        self.doc  = Document()
>
>    def parseString(self, d):
>        self.doc = parseString(_encode(d))
>        return self
>
>#Create instance of XMLDocument
>doc = XMLDocument()
>doc.parseString(open(os.curdir + '\\XML\\1.xml', 'r').read())
>.....
>
>Please help me out.

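Here's one approach: walk the tree from the document element and collect
the data of each CDATA section node.

    from xml.dom.minidom import parse

    doc = parse('XML/1.xml')
    cdata = []
    elements = [doc.documentElement]
    while elements:
        # Take the next node in document order.
        e = elements.pop(0)
        if e.nodeType == doc.CDATA_SECTION_NODE:
            cdata.append(e.data)
        elif e.nodeType == doc.ELEMENT_NODE:
            # Queue the children ahead of the remaining siblings.
            elements[:0] = e.childNodes
    print(cdata)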
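
One thing to watch: minidom gives a <![CDATA[...]]> section the node type
CDATA_SECTION_NODE and ordinary character data the node type TEXT_NODE, so
the two can be collected separately. Below is a small self-contained sketch
of that, using the same traversal. The sample document and the name
collect_by_type are made up for illustration, since the contents of
XML/1.xml aren't shown.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document; stands in for the real XML/1.xml.
SAMPLE = "<root><a>plain text</a><b><![CDATA[x < y && y > z]]></b></root>"


def collect_by_type(xml_text):
    """Return (text_nodes, cdata_sections) found under the root element."""
    doc = parseString(xml_text)
    text, cdata = [], []
    elements = [doc.documentElement]
    while elements:
        e = elements.pop(0)
        if e.nodeType == e.CDATA_SECTION_NODE:
            # Content that was wrapped in <![CDATA[ ... ]]>.
            cdata.append(e.data)
        elif e.nodeType == e.TEXT_NODE:
            # Ordinary character data between tags.
            text.append(e.data)
        elif e.nodeType == e.ELEMENT_NODE:
            # Visit children before the remaining siblings.
            elements[:0] = e.childNodes
    return text, cdata


text, cdata = collect_by_type(SAMPLE)
print(text)
print(cdata)
```

If you don't care about the difference, test for both node types in the
same branch.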

I bet there are simpler ways, though, based on other XML libraries.
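
For instance, here is a rough sketch with xml.etree.ElementTree from the
standard library. This is my guess at what "simpler" could look like, not a
drop-in replacement: ElementTree folds CDATA sections into ordinary element
text, so it only helps if you want the character data and don't need to know
which parts were CDATA. The sample document is again made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; stands in for the real XML/1.xml.
SAMPLE = "<root><a>plain text</a><b><![CDATA[x < y && y > z]]></b></root>"

root = ET.fromstring(SAMPLE)
# itertext() yields the text inside the root and its descendants in
# document order; CDATA content shows up as plain text.
chunks = list(root.itertext())
print(chunks)
```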

Jean-Paul