trying to parse non valid html documents with HTMLParser
florent.newsgroups at kynesthesy.org
Wed Aug 3 12:12:01 CEST 2005
> From http://www.crummy.com/software/BeautifulSoup/:
> You didn't write that awful page. You're just trying to get
> some data out of it. Right now, you don't really care what
> HTML is supposed to look like.
> Neither does this parser.
True, I just want to extract some data from HTML documents. But the
problem is the same: the parser loses its position in the string
when it encounters a bad tag.
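
To illustrate what I am after, here is a minimal sketch. It assumes a
current Python 3, where the standard library's html.parser.HTMLParser
keeps going past badly nested or stray tags, and it uses getpos() to
record where each piece of text was found. The names TextCollector and
extract_text are made up for this example, and the sample markup is
invented; this is not a fix for the parser discussed in this thread.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: collects text chunks with their positions."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            # getpos() gives the (line, offset) the parser has reached,
            # so the position is still known after a bad tag.
            self.chunks.append((self.getpos(), text))


def extract_text(markup):
    """Return a list of ((line, offset), text) tuples found in markup."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    # Invented sample: <b> is never closed and </i> was never opened.
    bad = "<p>first<b>bold</p>\n<div>second</i></div>"
    for position, text in extract_text(bad):
        print(position, text)
```

The point of the sketch is only that the text after the broken tags is
still reported, together with the line it came from.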