trying to parse non valid html documents with HTMLParser

florent florent.newsgroups at
Wed Aug 3 12:12:01 CEST 2005

>  From
>     You didn't write that awful page. You're just trying to get
>     some data out of it. Right now, you don't really care what
>     HTML is supposed to look like.
>     Neither does this parser.

True, I just want to extract some data from HTML documents. But the
problem is the same: the parser loses its position in the string
when it encounters a bad tag.
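
For anyone reading this later on a current Python 3, here is a minimal sketch of the kind of extraction described above. It assumes the standard library's html.parser.HTMLParser, which as far as I know no longer raises an error on malformed markup the way the older HTMLParser module did. The class name LinkCollector and the sample string bad_html are made up for illustration; getpos() is the parser's own method for reporting the current line and offset.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect href values and where each was seen."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, (line, offset)) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                # getpos() reports the parser's current (line, offset)
                self.links.append((href, self.getpos()))


# Deliberately broken markup (an assumption, not taken from the thread):
# an unclosed <b>, a stray </i>, and an unquoted attribute value.
bad_html = '<p><b>first <a href=one.html>one</a></i>\n<a href="two.html">two</p>'

collector = LinkCollector()
collector.feed(bad_html)
collector.close()

for href, (line, offset) in collector.links:
    print(href, line, offset)
```

The point of the sketch is only that parsing continues past the bad tags and the position stays available through getpos(); it does not reproduce the behaviour of the 2005 parser.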

More information about the Python-list mailing list