[XML-SIG] Recipe 534109: XML to Python data structure
Stefan Behnel
stefan_ml at behnel.de
Wed Jan 7 13:42:21 CET 2009
David Shi wrote:
> What I am trying to do is to have a generic script to turn xml to Python
> dataset. Then I can manipulate it as required. Then I can save
> processed data into a .dbf file.
I'd use iterparse() for the parsing, since it lets you construct the .dbf
content on the fly while the document is still being read.
http://codespeak.net/lxml/parsing.html#iterparse-and-iterwalk
Working with the elements returned by the iterparse iterator is quite
easy: the .tag and .text properties, plus the .find() method for locating
subelements, are all you should need.
http://codespeak.net/lxml/tutorial.html#the-element-class
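To make that a bit more concrete, here is a rough sketch of the iterparse
approach. The sample document, the element names (record, city, population)
and the iter_rows() helper are made up for illustration and are not taken from
your data. The stdlib fallback import is only there so the snippet also runs
without lxml installed.

```python
import io

try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Fallback assumption: the stdlib module offers the same basic calls.
    import xml.etree.ElementTree as etree

# Hypothetical sample document; element names are invented for this sketch.
SAMPLE_XML = b"""<records>
  <record><city>Berlin</city><population>3400000</population></record>
  <record><city>Paris</city><population>2100000</population></record>
</records>"""


def iter_rows(source, record_tag="record", fields=("city", "population")):
    """Yield one dict per record element as soon as it is fully parsed."""
    for event, elem in etree.iterparse(source, events=("end",)):
        if elem.tag != record_tag:
            continue  # skip child elements and the root
        row = {}
        for field in fields:
            child = elem.find(field)  # None if the subelement is missing
            row[field] = child.text if child is not None else None
        yield row
        elem.clear()  # drop the processed children to keep memory use low


rows = list(iter_rows(io.BytesIO(SAMPLE_XML)))
print(rows)
```

Instead of collecting the rows in a list, each dict could be handed straight
to whatever .dbf writer you end up using, so that only one record is held in
memory at a time.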
If you can afford to load the entire XML tree into memory, you can also
try lxml.objectify, which will give you a Python-like interface to the
data.
http://codespeak.net/lxml/objectify.html
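A minimal sketch of what the objectify interface looks like, again with an
invented sample document (the record, city and population names are
assumptions, not your schema). It requires lxml to be installed.

```python
from lxml import objectify

# Hypothetical sample document; element names are invented for this sketch.
SAMPLE_XML = b"""<records>
  <record><city>Berlin</city><population>3400000</population></record>
  <record><city>Paris</city><population>2100000</population></record>
</records>"""

# Parses the whole document into memory and returns the root element.
root = objectify.fromstring(SAMPLE_XML)

# Children are reachable as attributes; iterating over root.record walks
# all sibling elements that share the "record" tag.
cities = [(str(rec.city), int(rec.population)) for rec in root.record]
print(cities)
```

The str() and int() calls convert the objectify leaf elements to plain Python
values, which is probably what you want before writing them anywhere else.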
Note that the lxml.objectify in-memory tree is most likely a lot more
memory-friendly than what the recipe gives you, and the parsing is
definitely faster.
Stefan