Parsing broken HTML via Mozilla
-$P-W$- at noctua.org.uk
Tue Aug 10 22:38:00 CEST 2004
In article <mailman.1413.1092080863.5135.python-list at python.org>, Walter
> I'm trying to parse broken HTML with several Python tools.
> Unfortunately none of them works 100% reliably. Problems include
> nested comments, bare "&" in URLs and "<" in text (e.g. "if foo <
> bar") etc.
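For reference, two of the problem cases listed above (a bare "&" in a URL and a stray "<" in text) can be reproduced with a small sketch. This assumes Python 3's standard-library html.parser, which is neither a Mozilla-based approach nor necessarily one of the tools the original poster tried; the class name TextAndLinks, the function parse_sloppy and the sample markup are made up for illustration. Nested comments are deliberately left out of the sketch.

```python
from html.parser import HTMLParser


class TextAndLinks(HTMLParser):
    """Hypothetical helper: collects text and href values from sloppy HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Record the href of every <a> tag, exactly as the parser reports it.
        for name, value in attrs:
            if tag == "a" and name == "href" and value is not None:
                self.links.append(value)

    def handle_data(self, data):
        # Text can arrive in several pieces (a stray "<" is reported
        # on its own), so collect the pieces and join them afterwards.
        self.text_parts.append(data)


def parse_sloppy(markup):
    """Feed markup to the parser and return (joined text, list of hrefs)."""
    parser = TextAndLinks()
    parser.feed(markup)
    parser.close()
    return "".join(parser.text_parts), parser.links


# Illustrative input only: a "<" inside text and a bare "&" inside a URL.
sample = '<p>if foo < bar</p><a href="page?a=1&b=2">next</a>'
text, links = parse_sloppy(sample)
print(text)
print(links)
```

The sketch joins the text pieces itself because the parser does not promise to deliver a text run in one call. Whether this level of tolerance is enough for real-world pages is something you would have to check against your own input.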
Not a Mozilla solution, but I hear good things about
Paul Wright | http://pobox.com/~pw201 | http://blog.noctua.org.uk/
Reply address is valid but discards mail with attachments: send plain text only