Question: processing HTML, re-write default processing action of many tags
aleaxit at yahoo.com
Fri Sep 17 10:56:34 CEST 2004
Hubert Hung-Hsien Chang <hubert at cs.nyu.edu> wrote:
> I know you could use the
> def start_a
> def end_a
> to process the <a href=...> anchor </a> tags, but is there a
> default method for processing ALL tags? If I just want to change
> some parts of the hyperlink and want to keep other parts of the HTML
> could I just print them out? There should be such a method.
> Can't find it...
You could subclass HTMLParser.HTMLParser and override handle_starttag
and handle_endtag; those two methods are called for every tag, whatever
its name. You only talk about processing _tags_, but to keep the rest
of the HTML you will probably also want to override handle_charref,
handle_entityref and, last but not least, handle_data, which cover
references and text nodes (possibly handle_comment, too, btw).
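
A minimal sketch of that approach follows. It assumes Python 3, where
the same class is available as html.parser.HTMLParser rather than in the
HTMLParser module. The names LinkRewriter, rewrite_href and
rewrite_links are hypothetical, chosen only for illustration. Every
handler echoes what it receives, and only the href of <a> tags goes
through the caller's function.

```python
from html import escape
from html.parser import HTMLParser


class LinkRewriter(HTMLParser):
    """Echo the input HTML, changing only the href of <a> tags."""

    def __init__(self, rewrite_href):
        # convert_charrefs=False makes the parser report references such
        # as &amp; through handle_entityref/handle_charref instead of
        # folding them into the text, so they can be echoed as written.
        super().__init__(convert_charrefs=False)
        self.rewrite_href = rewrite_href  # hypothetical callback: old URL -> new URL
        self.out = []

    def _format_tag(self, tag, attrs, closing=">"):
        parts = [tag]
        for name, value in attrs:
            if tag == "a" and name == "href" and value is not None:
                value = self.rewrite_href(value)
            if value is None:
                parts.append(name)  # attribute without a value
            else:
                # The parser hands over unescaped values; escape them again.
                parts.append('%s="%s"' % (name, escape(value, quote=True)))
        return "<" + " ".join(parts) + closing

    # Called for ALL start and end tags, not one method per tag name.
    def handle_starttag(self, tag, attrs):
        self.out.append(self._format_tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._format_tag(tag, attrs, " />"))

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    # Text and references are passed through unchanged.
    def handle_data(self, data):
        self.out.append(data)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def rewrite_links(html_text, rewrite_href):
    """Return html_text with every <a href> passed through rewrite_href."""
    parser = LinkRewriter(rewrite_href)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    page = '<p>See <a href="http://example.com/">this &amp; that</a></p>'
    print(rewrite_links(page, lambda url: url.replace("http://", "https://")))
```

The output is a normalized copy, not a byte-for-byte one: the parser
lowercases tag and attribute names, and this sketch always writes
attribute values in double quotes.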