html parsing? Or just simple regex'ing?
Diez B. Roggisch
deetsNOSPAM at web.de
Wed Nov 10 17:58:13 EST 2004
> But if I use an XML parser to parse HTML instead of a dedicated HTML
> parser, will I still get smart handling of unpaired tags? I'm not sure we
> can count on getting 100% properly formed HTML...
There should be HTML-to-DOM parsers - after all, extending HTMLParser to
generate a DOM shouldn't be too hard.
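
To give a rough idea of what such an extension might look like, here is a minimal sketch, not a finished parser. It assumes the current standard-library names (html.parser.HTMLParser and xml.dom.minidom); DomBuilder, html_to_dom and VOID_TAGS are made-up names for illustration. It only closes unpaired tags when an enclosing element ends or the input runs out, and it knows nothing about HTML's implied end tags (a second <p> ends up nested inside the first).

```python
from html.parser import HTMLParser
from xml.dom.minidom import Document

# Elements that never take a closing tag in HTML (partial, illustrative list).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class DomBuilder(HTMLParser):
    """Hypothetical HTMLParser subclass that builds a minidom tree."""

    def __init__(self):
        super().__init__()
        self.doc = Document()
        self.stack = [self.doc]  # currently open nodes, innermost last

    def handle_starttag(self, tag, attrs):
        # minidom allows only one root element, so skip any extra ones.
        if self.stack[-1] is self.doc and self.doc.documentElement is not None:
            return
        node = self.doc.createElement(tag)
        for name, value in attrs:
            # Attributes without a value (e.g. "disabled") arrive as None.
            node.setAttribute(name, value or "")
        self.stack[-1].appendChild(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, together with
        # any unclosed elements nested inside it.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tagName == tag:
                del self.stack[i:]
                return
        # No matching open element: a stray end tag, silently ignored.

    def handle_data(self, data):
        # Text outside the root element cannot be stored in the document.
        if self.stack[-1] is not self.doc and data.strip():
            self.stack[-1].appendChild(self.doc.createTextNode(data))


def html_to_dom(text):
    """Parse possibly malformed HTML and return a minidom Document."""
    builder = DomBuilder()
    builder.feed(text)
    builder.close()  # flush any buffered text
    return builder.doc
```

Something like html_to_dom("<ul><li>a<li>b</ul>") then gives a Document you can query with getElementsByTagName("li"), even though neither <li> was ever closed. For real-world tag soup you would still want proper cleanup rules on top of this.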
Googling turns up tidy, which cleans up malformed HTML - so you may want to
feed your HTML through it first:
http://www.xml.com/pub/a/2004/09/08/pyxml.html
--
Regards,
Diez B. Roggisch