Looking for a specific html parser

Davor Cengija dcengija_remove_ at inet.hr
Tue Mar 18 03:07:47 EST 2003


Rene Pijlman wrote:

> Davor Cengija:
>>I need to pull out some HTML elements, together with their subelements,
>>from an HTML document. Is there something already available?
> 
> http://www.python.org/doc/current/lib/module-HTMLParser.html
> 

Yes, I am aware of HTMLParser (I thought I had mentioned it in my post, but
I must have deleted that sentence). I was hoping that a parser like the one
I described is already available on the net. Basically, I need a DOM-like
parser for HTML with XPath capabilities. xml.dom might help, but it expects
well-formed XML, so before that I obviously need some kind of HTML tidying
step. Has the famous Tidy from the W3C been ported to Python? I found
mxTidy, but it is basically a wrapper around Tidy, not a native
implementation.
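
To make the request concrete, here is a minimal sketch of the kind of thing
I mean, assuming a current Python 3 standard library (html.parser and
xml.etree.ElementTree). The class TreeHTMLParser, the helper parse_html and
the synthetic "document" root are hypothetical names made up for this
illustration, not an existing package. It is not a replacement for Tidy: it
only closes unclosed tags naively, and ElementTree supports just a limited
subset of XPath.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML elements that never take a closing tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TreeHTMLParser(HTMLParser):
    """Hypothetical helper: turns HTMLParser events into an ElementTree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._builder = ET.TreeBuilder()
        self._open = []  # names of the currently open (unclosed) tags
        # Synthetic root (an assumption of this sketch) so that stray
        # top-level content always has a parent element.
        self._builder.start("document", {})

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None; store them as empty strings.
        self._builder.start(tag, {k: (v or "") for k, v in attrs})
        if tag in VOID:
            self._builder.end(tag)
        else:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag not in self._open:
            return  # stray closing tag: ignore it
        # Close any children that were left open, then the tag itself.
        while self._open:
            name = self._open.pop()
            self._builder.end(name)
            if name == tag:
                break

    def handle_data(self, data):
        self._builder.data(data)

    def close(self):
        super().close()
        while self._open:
            self._builder.end(self._open.pop())
        self._builder.end("document")
        return self._builder.close()


def parse_html(text):
    """Parse sloppy HTML and return the synthetic root element."""
    parser = TreeHTMLParser()
    parser.feed(text)
    return parser.close()


html = """<html><body>
<div class="item"><p>first<br>line</p></div>
<div class="other"><p>skip</p></div>
<div class="item"><p>second</div>
</body></html>"""

root = parse_html(html)
# Pull out whole elements, subelements included, with an XPath-like query.
items = root.findall(".//div[@class='item']")
for div in items:
    print(div.find("p").text)
```

Each match in items is a full subtree, so the subelements come along with
the element. Anything beyond this (real error recovery, full XPath) is
exactly what I was hoping an existing library already does.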

Thanks
-- 
Davor Cengija, dcengija_remove_ at inet.hr



