[XML-SIG] Parse MULTIPLE XML files in a directory

Stefan Behnel stefan_ml at behnel.de
Fri Aug 10 09:53:13 CEST 2007


Hi,

First of all: don't use expat directly. Use (c)ElementTree's iterparse instead.
It ships with Python 2.5 and is also available as an external package for older
Python versions. There's also lxml (which is mostly compatible with
ElementTree), in case you ever need features like XPath or XSLT.
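
To give a rough idea of what iterparse hands you, here is a minimal sketch. It
assumes a current Python 3, where xml.etree.ElementTree picks up the C
implementation on its own; the sample document and the tag_texts() helper are
made up for illustration, only the tag names come from your code.

```python
import io
from xml.etree.ElementTree import iterparse

# Hypothetical sample document, standing in for a file on disk.
SAMPLE = b"<order><orrfnbr>42</orrfnbr><afidlog>abc</afidlog></order>"


def tag_texts(source):
    """Return (tag, text) pairs in the order the elements are completed."""
    pairs = []
    # Without an explicit events argument, iterparse reports only "end"
    # events, so each element arrives once it has been read completely.
    for event, element in iterparse(source):
        pairs.append((element.tag, element.text))
    return pairs


pairs = tag_texts(io.BytesIO(SAMPLE))
```

Note that children are reported before their parent, so the enclosing
element always comes last.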


amitesh kumar wrote:
> Please review the following code and help me.
> 
> Here I'm trying to :
> 1. Read each XML file in a folder.
> 2. Parse file.
> 3. Store some of the tags values as key-value pair in a map
> 4. Similarly maintain another collection that'll store one list per file.
> ------------------------------------------------------------------------
>
> ordtags = set()
> shptags = set()
> omptags = set()
> 
> ordtags.add('orrfnbr')
> ordtags.add('afidlog')
[...]

Better:

    ordtags = set(['orrfnbr', 'afidlog', ...])

    from xml.etree.cElementTree import iterparse

    for onefile in allfiles:
        values = {}  # tag -> text, collected for this file
        for event, element in iterparse(onefile):
            if element.tag in ordtags:
                # do something like
                values[element.tag] = element.text
            elif element.tag in shptags:
                pass  # do something else
            else:
                pass  # don't do anything?
            element.clear()  # free the memory held by the processed element
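
Put together for a whole directory, it could look roughly like the sketch
below. The assumptions are mine: it imports xml.etree.ElementTree (the
cElementTree module no longer exists as of Python 3.9), it only picks up
files ending in ".xml", and collect_values() and the per-file dictionary are
hypothetical names, just one way to keep a separate result per file.

```python
import glob
import os
from xml.etree.ElementTree import iterparse

# Only the two tag names visible in the original post; extend as needed.
ORDTAGS = {"orrfnbr", "afidlog"}


def collect_values(directory, tags):
    """Map each XML file name in `directory` to a {tag: text} dict."""
    per_file = {}
    for path in sorted(glob.glob(os.path.join(directory, "*.xml"))):
        values = {}
        for event, element in iterparse(path):
            if element.tag in tags:
                values[element.tag] = element.text
            # The text has been copied out above, so the element can go.
            element.clear()
        per_file[os.path.basename(path)] = values
    return per_file
```

Calling collect_values("some/folder", ORDTAGS) then gives you one dictionary
per file, keyed by file name. If a tag occurs more than once in a file, the
last occurrence wins; collect into a list instead if you need all of them.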

Stefan

