Parsing HTML, extracting text and changing attributes.

Jay Loden jloden at jayloden.com
Mon Jun 18 13:16:57 EDT 2007


Stefan Behnel wrote:
> Jay Loden wrote:
>> Someone else mentioned lxml but as I understand it lxml will only work if
>> it's valid XHTML that they're working with.
> 
> No, it was meant as the OP requested. It even has a very good parser for
> broken HTML.
> 
> http://codespeak.net/lxml/dev/parsing.html#parsing-html

I stand corrected; I missed that whole part of the lxml documentation :-)
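
For the archives, here is a rough sketch of what the OP asked for (parse
messy HTML, pull out the text, change some attributes) using lxml's HTML
parser. The sample markup, the example.com prefix and the "visited" class
name are all made up for illustration, and this is only one way to do it,
not necessarily what Stefan had in mind:

```python
# Sketch only: the markup, URL prefix and class name below are invented.
from lxml import html

# Deliberately broken: unclosed <b> and <p>, no closing </body> or </html>.
broken = '<html><body><p>Hello <b>world<p>See <a href="/docs">the docs</a>'

# lxml.html uses the forgiving HTML parser, not the strict XML one,
# so this does not need to be valid XHTML.
root = html.fromstring(broken)

# Extract the text, ignoring the tags.
text = root.text_content()

# Change attributes on every link in the tree.
for link in root.iter("a"):
    link.set("href", "http://example.com" + link.get("href"))
    link.set("class", "visited")

# Serialize the repaired tree back to a string.
fixed = html.tostring(root, encoding="unicode")
print(text)
print(fixed)
```

The exact shape of the repaired tree depends on how libxml2 recovers from
the broken markup, so it is worth printing the result for your own input.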


