[Tutor] cdata/aml question..
Peter Otten
__peter__ at web.de
Sun Apr 13 10:56:29 CEST 2014
bruce wrote:
> The following text contains sample data. I'm simply trying to parse it
> using libxml2dom as the lib to extract data.
>
> As an example, to get the name/desc
>
> test data
> <class_meta_data><departments>
> <department><name><![CDATA[A HTG]]></name><desc><![CDATA[American Heritage]]></desc></department>
> <department><name><![CDATA[ACC]]></name><desc><![CDATA[Accounting]]></desc></department>
>
> d = libxml2dom.parseString(s, html=1)
>
> p1="//department/name"
> p2="//department/desc"
>
> pcount_ = d.xpath(p1)
> p2_ = d.xpath(p2)
> print str(len(pcount_))
> nba=0
>
> for a in pcount_:
>     abbrv=a.nodeValue
>     print abbrv
>     abbrv=a.toString()
>     print abbrv
>     abbrv=a.textContent
>     print abbrv
>
> neither of the above generates any of the XML name/desc data..
>
> any pointers on what I'm missing???
Your example seems to work here when I omit the html=1 argument:

    d = libxml2dom.parseString(s)
...
> I can/have created a quick parse/split process to get the data, but I
> thought there'd be a straight forward process to extract the data
> using one of the py/libs..
One way using the stdlib:

    from xml.etree import ElementTree as ET

    #root = ET.parse(filename).getroot()
    root = ET.fromstring(data)
    for department in root.findall(".//department"):
        name = department.find("name").text
        desc = department.find("desc").text
        print("{}: {}".format(name, desc))
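
For reference, here is a self-contained sketch of the same ElementTree approach that can be run as-is. The `data` string is an assumption: it is the sample from the question with the closing `</departments></class_meta_data>` tags added, because the fragment as posted is not well-formed and `ET.fromstring` would reject it. The helper name `extract_departments` is made up for illustration. ElementTree exposes the content of a CDATA section as ordinary element text, so no special handling is needed.

```python
from xml.etree import ElementTree as ET

# Assumed input: the sample from the question, with the missing
# closing tags appended so that the document is well-formed.
data = (
    "<class_meta_data><departments>"
    "<department><name><![CDATA[A HTG]]></name>"
    "<desc><![CDATA[American Heritage]]></desc></department>"
    "<department><name><![CDATA[ACC]]></name>"
    "<desc><![CDATA[Accounting]]></desc></department>"
    "</departments></class_meta_data>"
)


def extract_departments(xml_text):
    """Return a list of (name, desc) tuples, one per <department>."""
    root = ET.fromstring(xml_text)
    # findtext() returns None instead of raising if a child is missing.
    return [
        (dept.findtext("name"), dept.findtext("desc"))
        for dept in root.findall(".//department")
    ]


for name, desc in extract_departments(data):
    print("{}: {}".format(name, desc))
```

Using `findtext()` instead of `find(...).text` is only a convenience here: a department without a `<name>` or `<desc>` child then yields None rather than an AttributeError.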