An example using htmllib?

Dfenestr8 chrisdewinN0SPAM at yahoo.com.au
Sat Nov 8 08:50:28 EST 2003


Hi.

I want a routine that strips all the tags from a line of HTML, e.g. I want
it to turn ....

"<p><b>This is an <h1><blink>IRRITATING</blink></h1> line of </b>text</p>"

... into ......

"This is an IRRITATING line of text"

I've been told I should use htmllib. I've tried reading the htmllib docs
in the Library Reference, but I have to say, they just confuse me.

Does anyone know of a page that shows some simple examples of the sort of
thing I want to do?

Or is it possible to adapt the example provided in the docs to achieve
this? Here's that example ...


from HTMLParser import HTMLParser

class MyHTMLParser(HTMLParser):

    def handle_starttag(self, tag, attrs):
        print "Encountered the beginning of a %s tag" % tag

    def handle_endtag(self, tag):
        print "Encountered the end of a %s tag" % tag
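
For what it's worth, here is a rough sketch of how the docs example might be adapted to do the stripping: instead of printing a message for each tag, it ignores the tags and collects only the text passed to handle_data. The names TagStripper and strip_tags are made up for illustration and are not part of the library, and the try/except import is an assumption added so the same code loads on newer Python versions, where the module is called html.parser.

```python
try:
    from HTMLParser import HTMLParser    # module name used in the example above
except ImportError:
    from html.parser import HTMLParser   # same class under its newer module name


class TagStripper(HTMLParser):
    """Hypothetical helper: keeps the text between tags, drops the tags."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.pieces = []

    def handle_data(self, data):
        # Called with each run of plain text found between tags.
        self.pieces.append(data)

    # handle_starttag and handle_endtag are deliberately not overridden,
    # so the base class silently ignores every tag.


def strip_tags(html):
    """Return the text content of html with all tags removed."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.pieces)


line = "<p><b>This is an <h1><blink>IRRITATING</blink></h1> line of </b>text</p>"
stripped = strip_tags(line)
```

With the sample line from above, stripped should come out as "This is an IRRITATING line of text". This sketch does nothing special with entities such as &amp;amp;. Whether they are decoded, passed to separate handler methods, or dropped depends on the Python version, so treat it as a starting point only.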




