[XML-SIG] How to build a DOM from an HTML file?
Mon, 26 Feb 2001 16:35:08 +0100 (CET)
I'm trying to parse HTML documents into DOMs, using the 4DOM version that
comes with 4Suite 0.10.2.
I first tried xml.dom.ext.reader.HtmlSax.HtmlDomGenerator with an
xml.dom.ext.reader.Sax.Reader, but it seems to be broken (see
bug #404072). Then I tried xml.dom.ext.reader.HtmlLib.FromHtmlUrl, which
uses the Sgmlop parser. However, this parser looks only partially
implemented: it chokes on doctype directives, for example, which means
that the pages most likely to contain valid HTML won't be parsed.
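
To make the goal concrete, here is a minimal sketch of the behaviour I am
after: a doctype directive is skipped and the rest of the page still ends up
in a DOM tree. It does not use 4DOM or Sgmlop at all; as an assumption for
illustration it relies on html.parser and xml.dom.minidom from the Python
standard library, and the names DomBuilder, html_to_dom and VOID_TAGS are
hypothetical helpers, not part of any existing package.

```python
from html.parser import HTMLParser
from xml.dom.minidom import Document

# Elements that never take a closing tag in HTML (illustrative subset).
VOID_TAGS = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta"}


class DomBuilder(HTMLParser):
    """Hypothetical helper: turns HTML parser events into a minidom tree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.document = Document()
        self.current = self.document  # node that receives new children

    def handle_decl(self, decl):
        # Doctype directives are ignored instead of aborting the parse.
        pass

    def handle_starttag(self, tag, attrs):
        element = self.document.createElement(tag)
        for name, value in attrs:
            element.setAttribute(name, value if value is not None else "")
        self.current.appendChild(element)
        if tag not in VOID_TAGS:
            self.current = element  # descend into the new element

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        node = self.current
        while node is not self.document and node.tagName != tag:
            node = node.parentNode
        if node is not self.document:
            self.current = node.parentNode

    def handle_data(self, data):
        # A Document node cannot hold text directly, so top-level text is dropped.
        if self.current is not self.document:
            self.current.appendChild(self.document.createTextNode(data))


def html_to_dom(text):
    """Parse an HTML string and return a minidom Document (sketch only)."""
    builder = DomBuilder()
    builder.feed(text)
    builder.close()
    return builder.document
```

This sketch assumes the page has a single root element and makes no attempt
to repair badly nested markup; it only shows that the doctype line need not
stop the parse.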
What is the currently preferred way to do this?
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).