[Tutor] parsing XML into a python dictionary
Stefan Behnel
stefan_ml at behnel.de
Sun Nov 15 14:45:55 CET 2009
Christopher Spears, 14.11.2009 19:47:
> Thanks! I have a lot of XML files at work that users search through. I
> want to parse the XML into a python dictionary and then read the dictionary
> into a database that users can use to search through the thousands of files.
I think "database" is the right keyword here. Depending on how large your
"thousands of files" are and what the actual content of each file is, a
full-text search engine (e.g. PyLucene) or an XML database might be the
right tool, instead of trying to write something up yourself.
If you want to use something that's in Python's standard library, consider
parsing the XML files as a stream instead of a document tree (look for the
iterparse() function in the xml.etree.ElementTree package, or in the
third-party lxml.etree), and save the extracted data into a sqlite3 database.
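
As a rough sketch of that approach, assuming each file holds a series of
<record> elements with <title> and <body> children (those element names, the
table layout and the index_records() helper are made up for illustration and
will differ for real files), the streaming parse and the database insert
could look something like this:

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def index_records(source, conn, record_tag="record"):
    """Stream-parse `source` and store one row per record element.

    The element names ("record", "title", "body") and the table layout
    are assumptions made for this sketch, not a fixed format.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS records (title TEXT, body TEXT)")
    count = 0
    # iterparse() hands over each element once its end tag has been read,
    # so the parser works through the file incrementally.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            conn.execute(
                "INSERT INTO records (title, body) VALUES (?, ?)",
                (elem.findtext("title", ""), elem.findtext("body", "")),
            )
            count += 1
            # Drop the record's children and text now that they are stored.
            elem.clear()
    conn.commit()
    return count


# A small in-memory document stands in for a real file on disk.
sample = io.BytesIO(
    b"<records>"
    b"<record><title>First</title><body>alpha</body></record>"
    b"<record><title>Second</title><body>beta</body></record>"
    b"</records>"
)
conn = sqlite3.connect(":memory:")
n_indexed = index_records(sample, conn)
rows = conn.execute("SELECT title, body FROM records ORDER BY title").fetchall()
```

For real use, pass a file name or an open file instead of the BytesIO
object, and a path to a database file instead of ":memory:".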
You can also use such a database as a kind of cache that stores relevant
information for each file, and update that information whenever you notice
that a file has been modified.
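
One possible way to notice modified files, sketched here under the assumption
that comparing modification times is good enough, is to keep each file's
mtime in the same database. The "files" table and the file_changed() helper
are hypothetical names for this example:

```python
import os
import sqlite3
import tempfile


def file_changed(conn, path):
    """Return True if `path` is new or was modified since the last call.

    The current modification time is recorded as a side effect, so the
    caller is expected to re-parse the file whenever this returns True.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, mtime_ns INTEGER)"
    )
    mtime_ns = os.stat(path).st_mtime_ns
    row = conn.execute(
        "SELECT mtime_ns FROM files WHERE path = ?", (path,)
    ).fetchone()
    if row is not None and row[0] == mtime_ns:
        return False
    conn.execute(
        "INSERT OR REPLACE INTO files (path, mtime_ns) VALUES (?, ?)",
        (path, mtime_ns),
    )
    conn.commit()
    return True


# Demonstration with a temporary file and an in-memory database.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.xml")
    with open(path, "w") as f:
        f.write("<records/>")
    conn = sqlite3.connect(":memory:")
    first = file_changed(conn, path)   # file not seen before
    second = file_changed(conn, path)  # unchanged since the first call
    # Simulate a later modification by moving the mtime forward 10 seconds.
    st = os.stat(path)
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 10_000_000_000))
    third = file_changed(conn, path)
    conn.close()
```

A checksum of the file contents would be a stricter test than the
modification time, at the cost of reading every file on each run.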
Stefan