Trouble with htmllib.HTMLParser

Jeremy Fincher jfincher at Tweed.kitenet.net
Sun Nov 12 05:27:06 EST 2000


Pardon me if I'm overlooking some obvious resource (and please point me
to it :))

I've used HTML parsing libraries in other languages (read: Perl), and
I've always simply subclassed an HTML parsing class and overridden the
methods that interest me.  I'm not having as easy a time in Python; one
thing I've had particular trouble with in reading the documentation for
htmllib.HTMLParser is finding out how character data (i.e., the text
between the start and end tags) is passed to my class.
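
To make the pattern concrete, here is a minimal sketch of the
subclass-and-override approach described above. It is written against
html.parser.HTMLParser from the Python 3 standard library rather than
htmllib.HTMLParser, since that is what runs on a current interpreter;
the class name HeadingCollector and the choice of the h1 tag are
hypothetical, picked only for illustration. In this sketch the text
between tags arrives through handle_data, and no formatter is involved.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Hypothetical example class: collects the text inside each <h1>."""

    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        # Tag names are passed in lowercase.
        if tag == "h1":
            self.in_heading = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_heading = False

    def handle_data(self, data):
        # Text between tags arrives here, possibly in several chunks,
        # so append to the current entry instead of replacing it.
        if self.in_heading:
            self.headings[-1] += data


parser = HeadingCollector()
parser.feed("<html><body><h1>Hello, world</h1><p>ignored</p></body></html>")
parser.close()
print(parser.headings)
```

The collected strings could then be handed to whatever database code is
in use; nothing in the sketch writes any output of its own beyond the
final print.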

Do I have to use a formatter with HTMLParser?  I'm not planning on
actually outputting anything; the goal is mostly to extract information
and enter it into a database.

Are there any resources/example code other than the Library Reference?
I haven't been able to find any.

Thanks.

Jeremy


