Looking for code which allows easy extraction of text from HTML

Grzegorz Adam Hankiewicz gradha at titanium.sabren.com
Wed Mar 5 05:52:38 EST 2003


Hello.

I need to extract information from a few HTML pages. These
pages were generated from a database and thus share a common HTML
structure. Is there a package which extracts text matching a given
condition? I need something like the re module, but for HTML. I have thought of
transforming the HTML to XML with HTMLParser and then using minidom
to pull out the text with a few recursive text-node extraction
functions. Is there a better way?
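
For illustration, here is a minimal sketch of condition-based text
extraction built directly on HTMLParser, skipping the XML/minidom step.
It assumes Python 3, where the module is html.parser, and pages whose
non-void tags are properly closed. The names ConditionalTextExtractor
and extract_text, the predicate signature, and the sample table markup
are all hypothetical, not part of any existing package.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ConditionalTextExtractor(HTMLParser):
    """Collect the text inside every element for which
    predicate(tag, attrs_dict) returns true (hypothetical helper)."""

    def __init__(self, predicate):
        super().__init__(convert_charrefs=True)
        self.predicate = predicate
        self.depth = 0      # nesting depth inside the element being captured
        self.chunks = []    # text pieces of the element being captured
        self.results = []   # one whitespace-normalized string per match

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            # Already capturing: just track nesting.
            self.depth += 1
        elif self.predicate(tag, dict(attrs)):
            # Start capturing a new matching element.
            self.depth = 1
            self.chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            # The matching element just closed: store its text.
            self.results.append(" ".join("".join(self.chunks).split()))

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_text(html, predicate):
    """Return the text of every element accepted by predicate."""
    parser = ConditionalTextExtractor(predicate)
    parser.feed(html)
    parser.close()
    return parser.results


# Made-up sample resembling a database-generated page.
SAMPLE = (
    '<table>'
    '<tr><td class="name">Alice <b>Smith</b></td><td class="age">30</td></tr>'
    '<tr><td class="name">Bob</td><td class="age">25</td></tr>'
    '</table>'
)

names = extract_text(
    SAMPLE, lambda tag, attrs: tag == "td" and attrs.get("class") == "name"
)
print(names)  # ['Alice Smith', 'Bob']
```

The predicate plays the role of the "condition": any test on the tag
name and its attributes can be passed in. Because the sketch counts
nesting depth, it relies on balanced tags and would need extra care for
pages that omit closing tags such as </p> or </td>.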





More information about the Python-list mailing list