Turning HTMLParser into an iterator

samwyse samwyse at gmail.com
Mon Jun 1 03:50:03 CEST 2009


I'm processing some potentially large datasets stored as HTML.  I've
subclassed HTMLParser so that handle_endtag() accumulates data into a
list, which I can then fetch when everything's done.  I'd prefer,
however, to have handle_endtag() somehow yield values while the input
data is still streaming in.  I'm sure someone's done something like
this before, but I can't figure it out.  Can anyone help?  Thanks.
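
For illustration, here is a minimal sketch of one way this could work. It
is not a definitive answer. It assumes Python 3's html.parser module
(the standalone HTMLParser module was renamed to that) and input shaped as
table rows. The names RowParser, iter_rows and _pending are hypothetical.
The idea is that handle_endtag() cannot yield by itself, since feed()
calls it as an ordinary callback, so it queues each finished value
instead, and a generator method drains the queue after every feed() call.

```python
from collections import deque
from html.parser import HTMLParser


class RowParser(HTMLParser):
    """Hypothetical subclass: emits one list of <td> texts per <tr>."""

    def __init__(self):
        super().__init__()
        self._pending = deque()  # finished rows not yet handed to the caller
        self._row = []           # cells of the row currently being parsed
        self._cell = None        # text pieces of the open <td>, else None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._cell = []

    def handle_data(self, data):
        # Text may arrive in several pieces when a chunk boundary splits it.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            # Queue the finished row instead of trying to yield it here.
            self._pending.append(self._row)
            self._row = []

    def iter_rows(self, chunks):
        """Feed chunks one at a time, yielding rows as soon as they finish."""
        for chunk in chunks:
            self.feed(chunk)
            while self._pending:
                yield self._pending.popleft()
        self.close()  # flush anything the parser was still buffering
        while self._pending:
            yield self._pending.popleft()
```

With an open text file f, something like
RowParser().iter_rows(iter(lambda: f.read(8192), "")) would read the file
in blocks and keep only the unfinished row plus the queue in memory. The
parser buffers incomplete tags between feed() calls, so chunk boundaries
can fall anywhere in the input.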


