[XML-SIG] minidom w/ HTML

jennyw jennyw at colorfulexpressions.com
Mon Jun 21 15:25:59 EDT 2004


I have a project where I need to parse HTML files that are table-heavy
(a calendar, actually), and I thought minidom would be perfect for my
needs. The problem is that the HTML I'm trying to parse isn't quite
well-formed XML -- mostly minor things, but enough that minidom won't
parse it. Is there something that would convert an HTML file into XML
that minidom can handle? Or is there something better, more geared
towards HTML, that I should be looking at?
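
To show the kind of failure I mean, here is a minimal sketch. The
fragment and the try_parse helper are made up for illustration and are
not taken from the real calendar page; the only assumption is that the
page contains HTML-style tags such as an unclosed <br>.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical fragment: the unclosed <br> is fine in HTML but not in XML.
html_fragment = "<table><tr><td>9:00<br>Staff meeting</td></tr></table>"


def try_parse(markup):
    """Return the parsed document, or None if the markup is not well-formed XML."""
    try:
        return minidom.parseString(markup)
    except ExpatError as err:
        # minidom uses the expat parser, which stops at the first error.
        print("minidom rejected the markup:", err)
        return None


result = try_parse(html_fragment)
```

Closing the tag as <br/> is enough for this particular fragment to
parse, which is why a converter that tidies HTML into XML seems like
the missing piece.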

The reason I thought of minidom is that I want to be able to navigate
easily through table cells. Basically, it's a weekly calendar, and
there's a table that has cells for each day. Inside each day cell, there
are cells for time and for the name of the event. There are other ways
to do this, but I'd like to learn more about parsing XML documents and
thought this would be a good way to accomplish my immediate needs and
learn something new.
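
Roughly, the navigation I have in mind looks like the sketch below. It
assumes the markup has already been made well-formed, and the class
names ("day", "time", "event"), the sample data, and the helper
functions are all hypothetical stand-ins for whatever the real page
uses.

```python
from xml.dom import minidom

# Hypothetical, already well-formed markup for two days of a weekly calendar.
calendar_xml = """
<table>
  <tr>
    <td class="day">
      <table>
        <tr><td class="time">9:00</td><td class="event">Staff meeting</td></tr>
        <tr><td class="time">13:00</td><td class="event">Lunch</td></tr>
      </table>
    </td>
    <td class="day">
      <table>
        <tr><td class="time">10:30</td><td class="event">Dentist</td></tr>
      </table>
    </td>
  </tr>
</table>
""".strip()


def cell_text(cell):
    """Join the text nodes directly inside a cell and trim whitespace."""
    return "".join(
        node.data for node in cell.childNodes if node.nodeType == node.TEXT_NODE
    ).strip()


def events_by_day(markup):
    """Return one list of (time, event) tuples per day cell, in document order."""
    doc = minidom.parseString(markup)
    days = []
    for cell in doc.getElementsByTagName("td"):
        # getElementsByTagName is recursive, so skip the nested time/event cells.
        if cell.getAttribute("class") != "day":
            continue
        events = []
        for row in cell.getElementsByTagName("tr"):
            cells = row.getElementsByTagName("td")
            events.append((cell_text(cells[0]), cell_text(cells[1])))
        days.append(events)
    return days


for number, events in enumerate(events_by_day(calendar_xml), start=1):
    print("day", number, events)
```

If the real page has no attributes to tell day cells apart from the
inner cells, the filter would have to rely on nesting depth or position
instead.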

Thanks!

Jen



