[XML-SIG] Help Needed (Will pay if someone is interested)

Stefan Behnel stefan_ml at behnel.de
Mon Jul 2 17:12:31 CEST 2007



Robert Rawlins - Think Blue wrote:
> I’m looking for some help with XML parsing, I’ve been playing around
> with this over the past few days and the only solution I can come up
> with seems to be a little slow and also leaves what I think is a memory
> leak in my application, which causes all kinds of problems.
>
> I have a very simple XML file whose elements I need to loop over to
> extract the attribute information, but the loop must be conditional
> as the attributes must meet certain criteria.
>
> My current solution is using minidom,

That's not the solution, that's the problem. Use cElementTree.


> which I’ve read isn’t one of the
> better parsers, if anyone knows of any that are better for the task I
> would love to hear it, the content is extracted regularly so whatever we
> choose needs to be quick, and validation isn’t so important. Take a look
> at this brief example of the XML we’re dealing with:
>
> <schedules name="Default event" location="this is the location of the
> event">
> 
>                 <event name="This is an event" location="At my house"
> type="1" start="2007-01-01 00:00:00" end="2007-01-01 00:00:00" />
> 
>                 <event name="And Another" location="At work" type="2"
> start="2007-01-01 00:00:00" end="2007-01-01 00:00:00" />
> 
>                 <event name="This is some more" location="At the cafe"
> type="1" start="2007-01-01 00:00:00" end="2007-01-01 00:00:00" />
> 
>                 <event name="And one last one" location="At my house"
> type="3" start="2007-01-01 00:00:00" end="2007-01-01 00:00:00" />
> 
> </schedules>
>
> Now this file details events which are possibly going to occur over the
> next couple of weeks. Now what I need to do is have a function which is
> called ‘getCurrentEvent()’ which will return any events that should be
> occurring at this point in time, or now().

  from xml.etree import cElementTree as et # Python 2.5

  # untested
  search_date = "2007-02-03 00:00:00"
  for _, element in et.iterparse("event-file.xml"):
      if element.tag == "event":
          start = element.get("start")
          end   = element.get("end")
          # "YYYY-MM-DD HH:MM:SS" timestamps compare correctly as strings
          if start > search_date:
               continue
          # start == end is treated as "no end date given"
          if end != start and end < search_date:
               continue
          print et.tostring(element)

or something like that. You'll love the performance.
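
For anyone reading this on a current Python 3, here is a minimal
self-contained sketch of the same filter. It assumes
xml.etree.ElementTree (the separate cElementTree module was removed in
Python 3.9, and ElementTree uses the C implementation automatically).
The function name current_events and the SAMPLE data are made up for
illustration; the matching rule is the one from the loop above.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data, shaped like the XML in the question.
SAMPLE = b"""<schedules name="Default event" location="somewhere">
  <event name="This is an event" location="At my house" type="1"
         start="2007-01-01 00:00:00" end="2007-01-05 00:00:00" />
  <event name="And Another" location="At work" type="2"
         start="2007-02-01 00:00:00" end="2007-02-10 00:00:00" />
</schedules>"""


def current_events(source, search_date):
    """Yield the attributes of each <event> that covers search_date.

    source may be a file name or a binary file object.
    """
    # By default iterparse reports each element once it is fully parsed.
    for _, element in ET.iterparse(source):
        if element.tag != "event":
            continue
        start = element.get("start")
        end = element.get("end")
        # "YYYY-MM-DD HH:MM:SS" strings sort chronologically as text.
        # As above, start == end is treated as "no end date given".
        if start <= search_date and (end == start or end >= search_date):
            yield dict(element.attrib)
        # Release the attribute data; the emptied element itself stays
        # attached to the root until the whole tree is dropped.
        element.clear()


if __name__ == "__main__":
    for attrs in current_events(io.BytesIO(SAMPLE), "2007-01-03 00:00:00"):
        print(attrs["name"], "/", attrs["location"])
```

Because the function is a generator, the caller can stop at the first
match without parsing the rest of the file.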


> The ‘Type’ attribute details
> how often the event is likely to reoccur, 1 being daily, 2 being weekly
> and so on, if no elements are found which are occurring in this time and
> date then I would like it to return the default event which is defined
> in the attributes of the ‘schedules’ tag.

That's much harder, as it requires real date calculation in general. Are you
sure you want an XML tree as a database? Why not read the file into a more
suitable in-memory data structure and search from there?
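
A rough sketch of that idea, written for Python 3 and not tuned in any
way: parse the file once into a list of named tuples with real datetime
values, then search the list. Everything named here (Event, PERIODS,
load_schedule, is_active, get_current_events, the SAMPLE data) is
hypothetical. It assumes type "1" means "repeats daily" and type "2"
means "repeats weekly", as described in the question; other type codes
are not specified in this thread, so the sketch treats them as one-off
events.

```python
import io
import xml.etree.ElementTree as ET
from collections import namedtuple
from datetime import datetime, timedelta

Event = namedtuple("Event", "name location type start end")
FMT = "%Y-%m-%d %H:%M:%S"

# Assumption: 1 = daily, 2 = weekly. Any other code is handled as a
# one-off event because its meaning is not given.
PERIODS = {"1": timedelta(days=1), "2": timedelta(weeks=1)}

# Hypothetical sample data, shaped like the XML in the question.
SAMPLE = b"""<schedules name="Default event" location="Nowhere">
  <event name="Morning" location="At my house" type="1"
         start="2007-01-01 09:00:00" end="2007-01-01 10:00:00" />
  <event name="Weekly" location="At work" type="2"
         start="2007-01-02 12:00:00" end="2007-01-02 13:00:00" />
</schedules>"""


def load_schedule(source):
    """Parse the file once; return (default_event, list_of_events)."""
    root = ET.parse(source).getroot()
    default = Event(root.get("name"), root.get("location"), None, None, None)
    events = [
        Event(e.get("name"), e.get("location"), e.get("type"),
              datetime.strptime(e.get("start"), FMT),
              datetime.strptime(e.get("end"), FMT))
        for e in root.iter("event")
    ]
    return default, events


def is_active(event, now):
    """True if 'now' falls inside some occurrence of the event."""
    if now < event.start:
        return False
    period = PERIODS.get(event.type)
    if period is None:
        return now <= event.end
    # Fold 'now' back onto the first occurrence of the repeating event.
    offset = (now - event.start) % period
    return offset <= event.end - event.start


def get_current_events(default, events, now):
    """Return the active events, or the default event if there are none."""
    active = [e for e in events if is_active(e, now)]
    return active or [default]


if __name__ == "__main__":
    default, events = load_schedule(io.BytesIO(SAMPLE))
    for event in get_current_events(default, events, datetime.now()):
        print(event.name, "/", event.location)
```

The file is parsed only when it changes; each lookup is then a plain
list scan with datetime arithmetic and no XML involved.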

Stefan

