[Tutor] Simple XML parsing

Kent Johnson kent37 at tds.net
Fri Dec 16 12:26:02 CET 2005


Tim Wilson wrote:
> Hi everyone,
> 
> I've got a little project that requires me to parse a simple XML  
> file. The file comes from the browser history of Apple's Safari and  
> includes the URL that was visited, the title of the Web page, the  
> date and time it was last visited, and the total number of times it  
> was visited. Here's a snippet:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN"
>   "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
> <plist version="1.0">
<snip>
> </plist>
> 
> Let's say that instead of one <dict>, the xml file had 100 of them. I  
> want to generate a simple table of URLs and the dates they were last  
> visited. I can handle everything except parsing the XML file and  
> extracting the information.

These look promising:
http://online.effbot.org/2005_03_01_archive.htm#elementplist
http://www.shearersoftware.com/software/developers/plist/

though the empty <key></key> for the URL might be a problem. The "dict" entry in the 
effbot version could be changed to
     "dict": lambda x:
         dict((x[i].text or 'url', x[i+1].text) for i in range(0, len(x), 2)),

to change empty keys to 'url'. That will work as long as there is only one empty key per 
<dict> (and it is actually the URL).
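
To show what that expression does on its own, here is a minimal sketch using the standard 
library's ElementTree. The sample <dict> and the helper name dict_from_element are made up 
for illustration; they are not taken from your file or from the effbot code.

```python
import xml.etree.ElementTree as ET

# Hypothetical <dict> with an empty <key></key> for the URL; values are made up.
SAMPLE = """<dict>
    <key></key>
    <string>http://www.example.com/</string>
    <key>title</key>
    <string>Example page</string>
</dict>"""

def dict_from_element(x):
    # Children alternate key, value, key, value, ...
    # An empty <key></key> has text None, so "or 'url'" gives it a usable name.
    return dict((x[i].text or 'url', x[i+1].text) for i in range(0, len(x), 2))

entry = dict_from_element(ET.fromstring(SAMPLE))
print(entry['url'], entry['title'])
```

If a <dict> had two empty keys, the second would silently overwrite the first, which is 
why the one-per-<dict> condition matters.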

If these don't work you can use ElementTree to parse the file and walk the resulting tree 
yourself to pull out the data.
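
A rough sketch of the do-it-yourself approach, in case it helps. Since the real data was 
snipped, everything about the layout here is an assumption: I'm guessing each history 
entry is a <dict> whose URL sits under the empty key and whose date sits under a key I've 
called 'lastVisitedDate'. The sample values and the function name history_rows are 
invented too, so adjust them to match your file.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the snipped file; key names and values are assumptions.
SAMPLE = """<plist version="1.0">
<array>
    <dict>
        <key></key>
        <string>http://www.example.com/</string>
        <key>lastVisitedDate</key>
        <string>2005-12-15 10:00</string>
    </dict>
    <dict>
        <key></key>
        <string>http://www.example.org/</string>
        <key>lastVisitedDate</key>
        <string>2005-12-16 09:30</string>
    </dict>
</array>
</plist>"""

def history_rows(root, date_key='lastVisitedDate'):
    """Return (url, date) tuples for every <dict> that has both values."""
    rows = []
    # iter() visits matching elements at any depth, in document order.
    for d in root.iter('dict'):
        children = list(d)
        # Pair each <key> with the element that follows it.
        pairs = dict((k.text or 'url', v.text)
                     for k, v in zip(children[::2], children[1::2]))
        if 'url' in pairs and date_key in pairs:
            rows.append((pairs['url'], pairs[date_key]))
    return rows

rows = history_rows(ET.fromstring(SAMPLE))
for url, date in rows:
    print('%-30s %s' % (url, date))
```

For a real file you would use ET.parse(filename).getroot() instead of ET.fromstring(). 
Dicts that lack either value are skipped, so an enclosing <dict> that only holds the 
array of entries would not end up in the table.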

Kent


