history xml file parse help

a64bs4$1oo$1@newsreader.mailgate.org eugene1977 at hotmail.com
Wed Jul 17 02:47:34 EDT 2002


Hi,

I'm a complete newbie to Python and XML
(I'm just reading my first book on them).

I'd like to parse a history file (from the Galeon browser)
and feed it into a database,
then crawl each URL and save the pages in the database as well,
so that I can search them.

The history file looks very easy to parse,
but I couldn't find any working sample code.

Can anyone help me, please? A sample of the file follows.
Thank you.

<?xml version="1.0"?>
<history>
  <host name="www.jspinsider.com" zoom="110"/>
  <host name="www.linuxquestions.org" zoom="60"/>
  <item title="LinuxQuestions.org Forums - where Linux newbies come for help - Reply to Topic" url="http://www.linuxquestions.org/questions/newreply.php?s=&amp;action=newreply&amp;threadid=25299" first_time="1026115910" last_time="1026177238" visits="2"/>
  <item title="2.3.2 Configuring Apache" url="http://www.modpython.org/live/mod_python-2.7.8/doc-html/inst-apacheconfig.html" first_time="1026704942" last_time="1026707281" visits="2"/>
  <item title="Yahoo! Media Helper" url="http://mediaframe.yahoo.com/detect/lite/firewall.html?.os=Unix&amp;.osv=Linux&amp;.br=Netscape&amp;.bv=5&amp;.java=true&amp;.tz=38&amp;.sh=1024&amp;.sv=768&amp;.yp=false&amp;.wm=false&amp;.rn=false&amp;.qt=false&amp;.fl=2.0&amp;.rate=64&amp;.ct=false&amp;.intl=en&amp;.done=/launch%3flid=wmv-100-s.2715070--137369,wmv-300-s.2715072--137370,rnv-100-s.2715073--137371,rnv-300-s.2715075--137372,wmv-56-s.2715100--137373,rnv-56-s.2715101--137374%26p=fifa%26c=onepane%26f=90353079%26.intl=en%26.small=1%26.close=1%26.ti=http%3a//us.i1.yimg.com/us.yimg.com/i/fifa/gen/pf/gotc_banner1.jpg%26.bi=http%3a//us.i1.yimg.com/us.yimg.com/i/fifa/gen/pf/gotc_banner2.jpg" first_time="1026672324" last_time="1026672324" visits="1"/>
  <item title="404 Missing File" url="http://www.ucc.ie/doc/emacs.html" first_time="1026145090" last_time="1026145090" visits="1"/>
  <item title="Click here to find out more!" url="http://ad.doubleclick.net/adi/devshed.dart/spython;sz=468x60;tile=1;ord=315679626?" first_time="1026785660" last_time="1026785660" visits="1"/>
  <item title="DevShed - Re: Where does PHP come in?" url="http://www.devshed.com/Server_Side/Python/CGI/comments/936159334/936207230/937094105/938110165/950034596" first_time="1026785775" last_time="1026785775" visits="1"/>
  <item title="DevShed - Playing It Again...And Again...And Again..." url="http://www.devshed.com/Server_Side/Python/Python101/Python101_2/page5.html" first_time="1026768945" last_time="1026768945" visits="1"/>
  <item title="Untitled" url="http://media.fastclick.net/w/get.media?sid=2478&amp;m=1&amp;d=s&amp;v=1.0d&amp;pageid=66088" first_time="1026476388" last_time="1026476388" visits="1"/>
</history>
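
For what it's worth, here is a rough sketch of the kind of thing being asked for: read every <item> element and load it into a database. It is only one possible approach, not anything Galeon provides. It assumes the file on disk is well-formed XML (in particular, that `&` inside URLs is stored as `&amp;`), and it uses Python's standard `xml.etree.ElementTree` and `sqlite3` modules. The function names `parse_history` and `store`, the table name `history`, and its column layout are all made up for illustration. The crawling step is left out.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Two entries shaped like the ones in the history file above.
SAMPLE = """<?xml version="1.0"?>
<history>
  <host name="www.linuxquestions.org" zoom="60"/>
  <item title="2.3.2 Configuring Apache" url="http://www.modpython.org/live/mod_python-2.7.8/doc-html/inst-apacheconfig.html" first_time="1026704942" last_time="1026707281" visits="2"/>
  <item title="404 Missing File" url="http://www.ucc.ie/doc/emacs.html" first_time="1026145090" last_time="1026145090" visits="1"/>
</history>
"""


def parse_history(xml_text):
    """Return one dict per <item> element, with numeric fields as ints."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):  # <host> elements are skipped
        rows.append({
            "title": item.get("title", ""),
            "url": item.get("url"),
            "first_time": int(item.get("first_time", 0)),
            "last_time": int(item.get("last_time", 0)),
            "visits": int(item.get("visits", 0)),
        })
    return rows


def store(rows, conn):
    """Insert parsed rows into a hypothetical `history` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS history ("
        "url TEXT PRIMARY KEY, title TEXT, "
        "first_time INTEGER, last_time INTEGER, visits INTEGER)"
    )
    # Named placeholders are filled from each row dict.
    conn.executemany(
        "INSERT OR REPLACE INTO history "
        "VALUES (:url, :title, :first_time, :last_time, :visits)",
        rows,
    )
    conn.commit()


rows = parse_history(SAMPLE)
conn = sqlite3.connect(":memory:")  # use a file path to keep the data
store(rows, conn)
for url, visits in conn.execute("SELECT url, visits FROM history ORDER BY url"):
    print(visits, url)
```

For the real file, `ET.parse(path).getroot()` could replace `ET.fromstring(SAMPLE)`, where `path` is wherever Galeon keeps its history file. The parser turns `&amp;` back into `&`, so the URLs stored in the table are ready to hand to a crawler.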





