<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 11 (filtered medium)">
<style>
<!--
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman";}
a:link, span.MsoHyperlink
{color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:Arial;
color:windowtext;}
@page Section1
{size:8.5in 11.0in;
margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
{page:Section1;}
-->
</style>
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>I’m new to XML parsing and am using elementtree to parse
a very simple xml file that is similar to the one below:<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'><groceries><o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'> <category name=”fruit”><o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'> <item name=”apple” number=”8”/><o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'> <item name=”banana” number=”12”/><o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'> </cagetory><o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'> <category name=”frozen”><o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'> <item name=”icecream” flavor=”chocolate”
number=”1”/><o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'> <item name=”pizza” make=”tombstone”
type=”cheese” number=”3”/><o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'> </category><o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'></groceries><o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'>Basically I am just trying to parse out the values
and do something with them in my program. I’d like to define what the “correct”
syntax for this file is and have it checked automatically. For example, in
this simple example I’d like to check that only a single type of
subelement is allowed under <groceries>, namely <category> And
that a only a single type of sublement is allowed under <cagetory>,
namely <item>. Further, I’d like to check that <item> can
have the attributes ‘name’ and ‘number’, but no
others. Etc, etc.<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'>Right now I am using elementtree to check this
within my program. IE, I traverse the tree and make sure the xml meets the
specifications. Based on my reading, it seems that I can use a DTD or scheme
to define the “grammar” of my xml, but I can’t figure out how
to actually set this up. It seems like I should just be able to specify my DTD
or Schema and then call ElementTree’s parse method and Elementtre should
just give me errors if the xml doesn’t conform to the rules. Maybe
Elementtree just doesn’t support this? I see some doc that says lxml
supports schema’s, will lxml do this for me?<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'>What are other folks doing to deal with this? It
seems like such an obvious need, yet when I google I don’t find anything
pointing me to a simple solution.<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'>Thanks!<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'>Margie<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Courier New"><span style='font-size:10.0pt;
font-family:"Courier New"'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
</div>
</body>
</html>