HTML parsing confusion

Jerry Hill malaclypse2 at
Wed Jan 23 16:33:43 CET 2008

On Jan 23, 2008 7:40 AM, Alnilam <alnilam at> wrote:
> Skipping past html validation, and html to xhtml 'cleaning', and
> instead starting with the assumption that I have files that are valid
> XHTML, can anyone give me a good example of how I would use _ htmllib,
> HTMLParser, or ElementTree _ to parse out the text of one specific
> childNode, similar to the examples that I provided above using regex?

Have you looked at any of the tutorials or sample code for these
libraries?  If you have a specific question, you will probably get more
specific help.  I started writing up some sample code, but realized I
was mostly reprising the long tutorial on SGMLLib here:

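For the ElementTree option in the question, here is a minimal sketch of
pulling the text out of one specific node. It assumes the input really is
well-formed XHTML, as the question stipulates. The sample document, the
id values, and the helper name text_of are made up for illustration, and
the code is written for the standard library's xml.etree.ElementTree in
current Python rather than the version available when this thread was
written.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace, so lookups must be qualified.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical sample document, standing in for one of the valid XHTML files.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body>
    <div id="intro"><p>First paragraph.</p></div>
    <div id="main"><p>Second <em>emphasised</em> paragraph.</p></div>
  </body>
</html>"""


def text_of(xhtml, element_id):
    """Return the text inside the <div> with the given id, or None if absent."""
    root = ET.fromstring(xhtml)
    # ElementTree supports a limited XPath subset, including [@attr='value'].
    node = root.find(".//x:div[@id='%s']" % element_id, XHTML_NS)
    if node is None:
        return None
    # itertext() walks the node and its descendants, so text inside nested
    # tags such as <em> is included in document order.
    return "".join(node.itertext()).strip()


print(text_of(SAMPLE, "main"))
```

One caveat: ElementTree is an XML parser, so named HTML entities such as
&nbsp; that are not predefined in XML will make it raise a parse error.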

More information about the Python-list mailing list