python and parsing an xml file

Tue Feb 22 04:03:58 EST 2011

On 21.02.11 18.30, Matt Funk wrote:
> Hi,
> I was wondering if someone had some advice:
> I want to create a set of xml input files to my code that look as follows:
> <?xml version="1.0" encoding="UTF-8"?>
>
> <!-- Settings for the algorithm to be performed -->
> <Algorithm>
>
>      <!-- The algorithm type. -->
>      <!-- The supported options are: -->
>      <!--     - Alg0 -->
>      <!--     - Alg1 -->
>      <Type>Alg1</Type>
>
>      <!-- the location/path of the input file for this algorithm -->
>      <path>./Alg1.in</path>
>
> </Algorithm>
>
>
> <!-- Relevant information during the processing will be written to a logfile -->
> <Logfile>
>
>      <!-- the location/path of the logfile (i.e. where to put the logfile) -->
>      <path>c:\tmp</path>
>
>      <!-- verbosity level (i.e. how much to print) -->
>      <!-- The supported options are: -->
>      <!--     - 0    (nothing printed) -->
>      <!--     - 1    (print on error) -->
>      <verbosity>1</verbosity>
>
> </Logfile>
>
>
> So there are comments, whitespace, etc. in it.
> I would like to be able to put everything into some sort of structure
> such that I can access it as:
> structure['Algorithm']['Type'] == 'Alg1'
> I was wondering if there is something out there that does this.
> I found and tried a few things:
> 1) http://code.activestate.com/recipes/534109-xml-to-python-data-structure/
> It simply doesn't work. I get the following error:
> raise exception
> xml.sax._exceptions.SAXParseException: <unknown>:1:2: not well-formed (invalid token)
> But I removed everything from the file except
> <?xml version="1.0" encoding="UTF-8"?>
> and I still got the error.
>
> Anyway, I looked at ElementTree, but that errors out with:
> xml.parsers.expat.ExpatError: junk after document element: line 19, column 0
>
>
> Anyway, if anyone can give me advice or point me somewhere, I'd greatly
> appreciate it.
>
> thanks
> matt
>
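On the ElementTree error: an XML document may have only one root element, and the file above has two (Algorithm and Logfile), so the parser reports the second one as junk after the document element. If the XML format has to stay, one option is to wrap the content in a single root before parsing. A minimal sketch, assuming Python 2.7 or later; the function name xml_to_dict, the wrapper tag <root>, and the shortened sample file are made up for illustration:

```python
import xml.etree.ElementTree as ET


def xml_to_dict(text):
    """Parse settings XML with several top-level elements into nested dicts."""
    # The XML declaration may only appear at the very start of a document,
    # so drop it before wrapping the content in an artificial root element.
    text = text.lstrip()
    if text.startswith('<?xml'):
        text = text[text.index('?>') + 2:]
    root = ET.fromstring('<root>' + text + '</root>')
    # The default parser leaves comments out of the tree; surrounding
    # whitespace in element text is stripped here.
    return {
        section.tag: {child.tag: (child.text or '').strip() for child in section}
        for section in root
    }


# Shortened version of the file from the question (raw string because of c:\tmp).
sample = r"""<?xml version="1.0" encoding="UTF-8"?>
<!-- Settings for the algorithm to be performed -->
<Algorithm>
    <Type>Alg1</Type>
    <path>./Alg1.in</path>
</Algorithm>
<Logfile>
    <path>c:\tmp</path>
    <verbosity>1</verbosity>
</Logfile>
"""

structure = xml_to_dict(sample)
print(structure['Algorithm']['Type'])  # prints Alg1
```

All values come back as strings, so verbosity would still need an int() on the caller's side. Putting a single root element into the file itself would avoid the wrapping step.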

How about skipping the whole XML thing? You can dynamically import any
Python module, even if its filename does not end in .py. Below is an
example from my own code, slightly modified. I just hand the function a
filename, and it tries to import the file. If the input file
contains variables like
algorithm = 'fast'
I can access them with
params = getinput('f.txt')
print(params.algorithm)

Good luck,
Paul

+++++++++++++

import imp
import sys

# Exit status for a missing input file (value assumed; not given in this post).
IO_ERROR = 1

def getinput(inputfilename):
    """Parse inputs to program from the given file."""
    try:
        # http://docs.python.org/library/imp.html#imp.load_source
        parameters = imp.load_source("parameters", inputfilename)
    except IOError as err:
        sys.stderr.write('%s: %s\n' % (str(err), inputfilename))
        sys.stderr.write('The specified input file was not found - exiting\n')
        sys.exit(IO_ERROR)
    # Verify presence of all required input parameters, see input.py example file.
    required_parameter_names = ['setting1', 'setting2', 'setting3']
    msg = 'Required parameter name not found in input file: {0}'
    for parameter_name in required_parameter_names:
        assert hasattr(parameters, parameter_name), msg.format(parameter_name)
    return parameters
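
The imp module was later deprecated and then removed in Python 3.12, so on a recent interpreter the same idea has to go through importlib. A rough sketch of an equivalent loader, assuming Python 3.5 or later; load_parameters and the throwaway f.txt written in the demo are illustrative names, not part of my original code:

```python
import importlib.machinery
import importlib.util
import os
import tempfile


def load_parameters(path, module_name="parameters"):
    """Execute the file at `path` as Python source and return it as a module."""
    # SourceFileLoader accepts any filename, so the file need not end in .py.
    loader = importlib.machinery.SourceFileLoader(module_name, path)
    spec = importlib.util.spec_from_loader(module_name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module


# Demo: write a small settings file, then read a value back from it.
with tempfile.TemporaryDirectory() as tmpdir:
    settings_path = os.path.join(tmpdir, "f.txt")
    with open(settings_path, "w") as handle:
        handle.write("algorithm = 'fast'\n")
    params = load_parameters(settings_path)

print(params.algorithm)  # prints fast
```

The check for required parameter names from getinput can be applied to the returned module unchanged.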