[Tutor] xml parsing

Alfred Milgrom fredm@smartypantsco.com
Wed Nov 20 20:35:02 2002


Hi Ike:

I don't know anything about XML, and I don't know whether the example you 
gave covers everything you want to do (for example, are there more levels 
of nesting or different keywords).

However, using Python's string methods it is easy to build the dictionaries 
you want, as follows:

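def stringToDict(data, key):
     # 'key' is the start of a closing tag, e.g. '</block' or '</item'.
     result = {}
     for word in data.split(key):
         # The last piece is only the tail of a closing tag: stop there.
         if '<' not in word:
             break
         # Drop the tail of the previous closing tag, e.g. '1>'.
         if word.index('>') < word.index('<'):
             word = word[word.index('>')+1:]
         split = word.index('>')
         name = word[:split].replace('<', "")
         value = word[split+1:]
         result[name] = value
     return result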


XMLstring = ("<block1><item1>1.0</item1><item2>1.234</item2></block1>"
             "<block2><item1>6.4</item1><item2>4</item2></block2>")

XMLdictionary = stringToDict(XMLstring, '</block')
print(XMLdictionary)    # outer level only: block name -> inner XML text
for key in XMLdictionary:
     XMLdictionary[key] = stringToDict(XMLdictionary[key], '</item')
print(XMLdictionary)

The second print shows the finished dictionary:

{'block2': {'item2': '4', 'item1': '6.4'}, 'block1': {'item2': '1.234', 
'item1': '1.0'}}
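
Note that the values above are strings, whereas the dictionary Ike asked for 
holds numbers. For comparison, here is a rough sketch of the same job done 
with a real parser. It assumes a Python version that ships 
xml.etree.ElementTree in the standard library, and the names to_number, 
element_to_value and xml_to_dict, as well as the wrapping <root> tag, are 
made up for this illustration. It also assumes tag names are not repeated 
within a block (a repeated tag would overwrite the earlier entry).

```python
import xml.etree.ElementTree as ET


def to_number(text):
    """Return an int or float if the text looks numeric, else the text itself."""
    for convert in (int, float):
        try:
            return convert(text)
        except (TypeError, ValueError):
            pass
    return text


def element_to_value(element):
    """Turn a leaf element into a number or string, any other into a dict."""
    children = list(element)
    if not children:
        return to_number(element.text)
    return {child.tag: element_to_value(child) for child in children}


def xml_to_dict(xml_string):
    # The incoming string has several top-level elements, so wrap it in a
    # hypothetical <root> element to make it a well-formed document.
    root = ET.fromstring("<root>" + xml_string + "</root>")
    return element_to_value(root)


xml_string = ("<block1><item1>1.0</item1><item2>1.234</item2></block1>"
              "<block2><item1>6.4</item1><item2>4</item2></block2>")
print(xml_to_dict(xml_string))
```

Because element_to_value calls itself, this should also cope with deeper 
nesting, which the string-splitting version does not.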

HTH

Fred Milgrom



At 02:24 PM 20/11/02 -0600, you wrote:
>Hi all,
>I'm having a little trouble reading the documentation to some of the xml
>parsing modules in order to get them to do what I want them to.
>
>Here is what I need to do:
>I receive an XML string of the form:
>
>'<block1>
>      <item1>1.0</item1>
>      <item2>1.234</item2>
></block1>
><block2>
>      <item1>6.4</item1>
>      <item2>4</item2>
></block2>'
>
>where I have placed indentations for clarity.  The string I receive has no
>whitespace or linebreaks.  What I want to do is to turn this string into a
>python dictionary that would look like this for that string:
>
>{'block1':{'item1':1.0,'item2':1.234},'block2':{'item1':6.4,'item2':4}}
>
>I have not been able to see clearly how to use the XML modules xmllib or
>xml.sax in order to do this, and I was wondering how this may be accomplished
>with a minimum of writing.
>I know I could just write a function to parse this string using only builtin
>python commands, but I do not think that this is the easiest solution.
>
>Thanks
>Ike