[Tutor] xml parsing

Ike Hall hall@nhn.ou.edu
Thu Nov 21 11:07:10 2002


Thank you Alfred,
This is very close to what I wish to do.  (I also know nothing about XML,
just that it is the format in which the data I need to process will come
to me.)  In my case, though, there are arbitrary levels of nesting, along
with arbitrary keywords.  However, after looking at both this example and a
little of the documentation for the re module, I think I may be able to
figure out how to do this for the general situation once I understand the
re module properly.
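
For the general case, a regular-expression approach might look roughly like
the sketch below.  It is only a sketch under some assumptions: the tags carry
no attributes, a tag never contains another tag of the same name, and sibling
tags have distinct names (a repeated name would overwrite the earlier entry).
The names ELEMENT and xml_to_dict are made up for illustration, and the code
uses current Python 3 syntax.

```python
import re

# Matches <tag>body</tag>.  The backreference \1 requires the closing tag
# to carry the same name as the opening tag; .*? keeps the body as short
# as possible.
ELEMENT = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def xml_to_dict(text):
    """Turn nested <tag>...</tag> text into nested dictionaries.

    Leaf values are returned as plain strings.
    """
    pairs = ELEMENT.findall(text)
    if not pairs:
        # No tags left, so this is a leaf value.
        return text
    # Each body may itself contain tags, so recurse into it.
    return {tag: xml_to_dict(body) for tag, body in pairs}


if __name__ == "__main__":
    sample = ("<block1><item1>1.0</item1><item2>1.234</item2></block1>"
              "<block2><item1>6.4</item1><item2>4</item2></block2>")
    print(xml_to_dict(sample))
```

The values stay as strings here; converting them with float() or int() would
be a separate step.  For anything beyond simple input like this, a real XML
parser is probably the safer choice.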

Thank you
Ike

On Wednesday 20 November 2002 08:34 pm, you wrote:
> Hi Ike:
>
> I don't know anything about XML, and I don't know whether the example you
> gave covers everything you want to do (for example, are there more levels
> of nesting or different keywords).
>
> However, using the string functions it is easy to develop the dictionaries
> you want as follows:
>
> def stringToDict(data, separator):
>      # separator is a closing-tag prefix such as '</block' or '</item'.
>      result = {}
>      for word in data.split(separator):
>          if '<' not in word:
>              break  # only the tail of the last closing tag is left
>          # Drop the remainder of the previous closing tag, e.g. '1>'.
>          if word.index('>') < word.index('<'):
>              word = word[word.index('>')+1:]
>          split = word.index('>')
>          name = word[:split].replace('<', "")
>          result[name] = word[split+1:]
>      return result
>
>
> XMLstring = ("<block1><item1>1.0</item1><item2>1.234</item2></block1>"
>              "<block2><item1>6.4</item1><item2>4</item2></block2>")
>
> XMLdictionary = stringToDict(XMLstring, '</block')
> print(XMLdictionary)
> for key in XMLdictionary :
>      XMLdictionary[key] = stringToDict(XMLdictionary[key], '</item')
> print(XMLdictionary)
>
> The second print shows (key order may differ):
>
> {'block2': {'item2': '4', 'item1': '6.4'}, 'block1': {'item2': '1.234',
> 'item1': '1.0'}}
>
> HTH
>
> Fred Milgrom
>
> At 02:24 PM 20/11/02 -0600, you wrote:
> >Hi all,
> >I'm having a little trouble reading the documentation for some of the xml
> >parsing modules in order to get them to do what I want.
> >
> >Here is what I need to do:
> >I receive an XML string of the form:
> >
> >'<block1>
> >      <item1>1.0</item1>
> >      <item2>1.234</item2>
> ></block1>
> ><block2>
> >      <item1>6.4</item1>
> >      <item2>4</item2>
> ></block2>'
> >
> >where I have added indentation for clarity.  The string I receive has no
> >whitespace or line breaks.  What I want to do is turn this string into
> > a Python dictionary that would look like this for that string:
> >
> >{'block1':{'item1':1.0,'item2':1.234},'block2':{'item1':6.4,'item2':4}}
> >
> >I have not been able to see clearly how to use the XML modules xmllib or
> >xml.sax in order to do this, and I was wondering how this may be
> > accomplished with a minimum of writing.
> >I know I could just write a function to parse this string using only
> > builtin python commands, but I do not think that this is the easiest
> > solution.
> >
> >Thanks
> >Ike
> >
> >_______________________________________________
> >Tutor maillist  -  Tutor@python.org
> >http://mail.python.org/mailman/listinfo/tutor