[Expat-discuss] parsing error

Nikolai Koudelia nikoudel at gmail.com
Mon Oct 29 13:42:13 CET 2007


That makes sense :)
Thanks for a quick reply!

2007/10/29, Fred Drake <fdrake at acm.org>:
> On Oct 29, 2007, at 6:43 AM, Nikolai Koudelia wrote:
> > The problem is that the material may not be correct. It may look
> > like this:
> ...
> > When expat parser reaches </brokentag>, it throws an exception and
> > stops parsing. Is there a way to handle situation like that? Some
> > option telling expat to skip broken closing tags? Or should I repair
> > the material before parsing? Last one could be quite tricky, because
> > expat could not be used for that... Any ideas?
>
> XML parsers aren't forgiving the way HTML parsers should be, and
> that's a specific goal.  If you're interested in tolerating any ol'
> HTML, use an HTML parser.  There are a number of those available in
> Python as well (htmllib, BeautifulSoup, lxml.html).
>
>
>    -Fred
>
> --
> Fred Drake   <fdrake at acm.org>
>
>
>
>
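
To illustrate the difference Fred describes, here is a minimal sketch
in Python. It assumes Python 3, where the standard library's
html.parser stands in for the htmllib module mentioned above (htmllib
is not available in Python 3). The sample input and the names BROKEN,
parse_strict, parse_lenient and TagCollector are made up for this
example; the original broken material was not shown in the thread.

```python
import xml.parsers.expat
from html.parser import HTMLParser

# Hypothetical sample: <item> is closed by a tag that was never opened.
BROKEN = "<doc><item>text</brokentag></doc>"


def parse_strict(data):
    """Parse with expat; return the error description, or None on success."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError as err:
        return xml.parsers.expat.ErrorString(err.code)
    return None


class TagCollector(HTMLParser):
    """Record every start tag, end tag and text run the parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def parse_lenient(data):
    """Parse with html.parser, which reports stray end tags and carries on."""
    collector = TagCollector()
    collector.feed(data)
    collector.close()
    return collector.events


if __name__ == "__main__":
    print("expat:", parse_strict(BROKEN))
    print("html.parser:", parse_lenient(BROKEN))
```

With this input expat stops at the mismatched closing tag, while
html.parser keeps going and simply reports the stray end tag as an
event. Deciding what to do with such a tag (drop it, or close the open
element) is then up to the calling code.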

