[XML-SIG] python, xml, html tags
Uche Ogbuji
Uche.Ogbuji at fourthought.com
Tue Mar 29 16:45:59 CEST 2005
On Mon, 2005-03-28 at 21:01 +0300, Necati DEMiR wrote:
> Hi,
> I can't do something with Python and XML.
>
> i have the following file;
>
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <test>
> <content> Hello </content>
> <content> <b> Hello </b> </content>
> </test>
>
> Ok. it is simple :)
>
> And i have the following python codes;
>
> #!/usr/bin/python
> from xml.dom import minidom
>
> file = open("test.xml","r")
> xml = minidom.parse(file)
> print xml.childNodes[0].getElementsByTagName("content")[0].firstChild.data
> print xml.childNodes[0].getElementsByTagName("content")[1].firstChild.data
>
> Again simple one :)
>
> But when i run these codes, i have the following output;
> Hello
>
> How can i access the second one.
DOM is not very good for this sort of thing. You could do:
print xml.getElementsByTagName("content")[0].firstChild.data
print xml.getElementsByTagName("content")[1].getElementsByTagName("b")[0].firstChild.data
But that's silly :-)
More useful thoughts below...
> Yes, i know it contains html tags so it
> doesn't give me the result.
Your b element happens to have the same name as one used in HTML, but
that doesn't really make it an HTML tag. In this case, it's clearly an
XML tag.
> I wanna get whole of the content as data.
> How can i do this?
Use something like the string_value function, listing 5 of the following
article:
http://www.xml.com/pub/a/2003/01/08/py-xml.html
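A helper in that spirit can be sketched with plain minidom. This is only a minimal sketch, not the article's listing: the name string_value is borrowed from the article, but the body is my assumption of how such a function could look. It is written for Python 3 (print as a function), and the sample document is inlined as bytes rather than read from test.xml.

```python
from xml.dom import minidom

# The sample document from the question, inlined so the sketch is self-contained.
XML = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test>
<content> Hello </content>
<content> <b> Hello </b> </content>
</test>"""


def string_value(node):
    """Concatenate every text node at or below `node`, in document order."""
    if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
        return node.data
    # Elements (and the document itself) contribute the text of their children.
    return "".join(string_value(child) for child in node.childNodes)


doc = minidom.parseString(XML)
contents = doc.getElementsByTagName("content")
first = string_value(contents[0])
second = string_value(contents[1])

# strip() drops the whitespace that surrounds the text in the source file.
print(first.strip())
print(second.strip())
```

Because the function recurses into child elements, the second content element yields its text even though that text sits inside b.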
Or use something with XPath support, which makes this easy. Using Amara
( http://www.xml.com/pub/a/2005/01/19/amara.html ), your code would be:
from amara import binderytools
doc = binderytools.bind_file("test.xml")
print doc.xml_xpath(u'string(//content[1])')
print doc.xml_xpath(u'string(//content[2])')
Which prints
Hello
Hello
The XPath string() function returns the concatenated text of all text
nodes inside the element, including those within nested elements.
Notice that XPath uses 1 as the first index, while Python uses 0.
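Both points, the string value and the 1-based positions, can also be tried without Amara. The following is a sketch using the standard library's xml.etree.ElementTree, which I am substituting purely for illustration: it supports only a subset of XPath, so itertext() stands in for string() here. Python 3 syntax is assumed, and the sample document is inlined as bytes.

```python
import xml.etree.ElementTree as ET

# The sample document from the question, inlined so the sketch is self-contained.
XML = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test>
<content> Hello </content>
<content> <b> Hello </b> </content>
</test>"""

root = ET.fromstring(XML)  # root is the <test> element

# Positional predicates in the path language count from 1 ...
second_by_path = root.find("content[2]")
# ... while Python list indexing counts from 0.
second_by_index = root.findall("content")[1]

# itertext() yields every text fragment inside the element, nested ones included.
text = "".join(second_by_path.itertext())
print(text.strip())
```

Here content[2] in the path and [1] on the Python list select the same element.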
--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Use CSS to display XML, part 2 - http://www-128.ibm.com/developerworks/edu/x-dw-x-xmlcss2-i.html
Writing and Reading XML with XIST - http://www.xml.com/pub/a/2005/03/16/py-xml.html
Use XSLT to prepare XML for import into OpenOffice Calc - http://www.ibm.com/developerworks/xml/library/x-oocalc/
Be humble, not imperial (in design) - http://www.adtmag.com/article.asp?id=10286
State of the art in XML modeling - http://www.ibm.com/developerworks/xml/library/x-think30.html