[Tutor] ElementTree, iterable container, depth of elements

Peter Otten __peter__ at web.de
Sat Mar 29 18:00:27 CET 2014


street.sweeper at mailworks.org wrote:

> I'm trying to sort the order of elements in an xml file, mostly
> to make visual inspection/comparison easier.  The example xml and
> code on http://effbot.org/zone/element-sort.htm get me almost
> what I need, but the xml I'm working with has the element I'm
> trying to sort on one level deeper.
> 
> 
> That page's example xml:
> 
> <phonebook>
>   <entries>
>     <entry>
>       <name>Ned</name>
>       <number>555-8904</number>
>     </entry>
>     <entry>
>       <name>John</name>
>       <number>555-5782</number>
>     </entry>
>     <entry>
>       <name>Julius</name>
>       <number>555-3642</number>
>     </entry>
>   </entries>
> </phonebook>
> 
> 
> And that page's last example of code:
> 
>   import xml.etree.ElementTree as ET
>   tree = ET.parse("data.xml")
>   def getkey(elem):
>     return elem.findtext("number")
>   container = tree.find("entries")
>   container[:] = sorted(container,key=getkey)
>   tree.write("new-data.xml")
> 
> I used the interactive shell to experiment a bit with that,
> and I can see that 'container' in
> 
>   container = tree.find("entries")
> 
> is iterable, using
> 
>   for a in container:
>     print(a)
> 
> However, the xml I'm working with looks something like this:
> 
> <root>
>   <main>
>     <diary>
>       <entry>
>         <Date>20140325</Date>
>         <appointment>dentist</appointment>
>       </entry>
>       <entry>
>         <Date>20140324</Date>
>         <appointment>barber</appointment>
>       </entry>
>     </diary>
>   </main>
> </root>
> 
> 
> What I'd like to do is rearrange the <entry> elements within
> <diary> based on the <Date> element.  If I remove the <root>
> level, this will work, but I'm interested in getting the code to
> work without editing the file.
> 
> I look for "Date" and "diary" rather than "number" and "entries"
> but when I try to process the file as-is, I get an error like
> 
> 
> Traceback (most recent call last):
>   File "./xmlSort.py", line 16, in <module>
>     container[:] = sorted(container, key=getkey)
> TypeError: 'NoneType' object is not iterable
> 
> 
> "container[:] = sorted(container, key=getkey)" confuses me,
> particularly because I don't see how the elem parameter is passed
> to the getkey function.

In the original example container is the "entries" element, and sorted() 
iterates over the items of its first argument. Iterating over an element 
yields its children, i.e. the first "entry" element, then the second 
"entry", and so on. sorted() calls getkey() once for each of those 
children, so elem is each "entry" element in turn.
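
To see this for yourself, here is a small sketch using the phonebook data 
from the effbot page. It reads the XML from a string rather than from 
data.xml, and the names PHONEBOOK, children, keys, ordered and names are 
only mine, for illustration:

```python
import xml.etree.ElementTree as ET

# The phonebook example, inlined so the snippet runs without data.xml.
PHONEBOOK = """\
<phonebook>
  <entries>
    <entry><name>Ned</name><number>555-8904</number></entry>
    <entry><name>John</name><number>555-5782</number></entry>
    <entry><name>Julius</name><number>555-3642</number></entry>
  </entries>
</phonebook>"""


def getkey(elem):
    # elem is one <entry> element; return the text of its <number> child.
    return elem.findtext("number")


root = ET.fromstring(PHONEBOOK)
container = root.find("entries")

# Iterating over an element yields its direct children.
children = list(container)

# This is what sorted() does internally: one getkey() call per child.
keys = [getkey(child) for child in children]

# sorted() returns a new list of the same children, ordered by those keys.
ordered = sorted(container, key=getkey)
names = [entry.findtext("name") for entry in ordered]

print(keys)
print(names)
```

The keys are compared as strings, which is fine here because all the 
numbers have the same length.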
 
> I know if I do
> 
>   root = tree.getroot()
> 
> (from the python.org ElementTree docs) it is possible to step
> down through the levels of root with root[0], root[0][0], etc,
> and it seems to be possible to iterate with
> 
>   for i in root[0][0]:
>     print(i)
> 
> but trying to work root[0][0] into the code has not worked,
> and tree[0] is not possible.
> 
> How can I get this code to do its work one level down in the xml?

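find("diary") looks only at the direct children of the root element, and 
<diary> is one level further down, so it returns None. Try

container = tree.find("main").find("diary")

or

container = tree.find("main/diary")

or even

container = tree.find(".//diary") # we don't care about the parent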
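
Put together, a minimal sketch with your diary data might look like the 
following. It builds the tree from a string instead of calling ET.parse() 
on your file, and the names DIARY, missing, by_steps, by_path, anywhere and 
dates are just placeholders I made up:

```python
import xml.etree.ElementTree as ET

# The diary example, inlined so the snippet runs without a file on disk.
DIARY = """\
<root>
  <main>
    <diary>
      <entry>
        <Date>20140325</Date>
        <appointment>dentist</appointment>
      </entry>
      <entry>
        <Date>20140324</Date>
        <appointment>barber</appointment>
      </entry>
    </diary>
  </main>
</root>"""

tree = ET.ElementTree(ET.fromstring(DIARY))

# A plain tag name is matched against the root's direct children only,
# and <root> has just one child, <main>, so this finds nothing.
missing = tree.find("diary")

# Three ways to reach <diary>; all return the same element here.
by_steps = tree.find("main").find("diary")
by_path = tree.find("main/diary")
anywhere = tree.find(".//diary")  # first <diary> at any depth


def getkey(elem):
    # elem is one <entry>; sort on the text of its <Date> child.
    return elem.findtext("Date")


container = by_path
# Slice assignment replaces the children of <diary> in place.
container[:] = sorted(container, key=getkey)

dates = [entry.findtext("Date") for entry in container]
print(missing)
print(dates)
```

With a real file you would keep your ET.parse(...) and tree.write(...) 
lines and only change the find() call.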



