[XML-SIG] sgmlop and html parsing
walter at livinglogic.de
Wed Jan 14 14:32:56 EST 2004
Martin v. Löwis wrote:
> Walter Dörwald wrote:
>> Wouldn't it make sense to implement an SGMLParser that supports
> No. In SGML, the SGML declaration defines the document encoding, e.g.
> So to understand a character reference, you have to know the SGML
> declaration. It is Unicode only if the declaration says
At least it would help for parsing HTML. Setting the encoding
attribute to None would make the parser return 8-bit strings,
leaving it to the application to decode them.
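
A minimal sketch of the behaviour proposed above, assuming a parser with
a settable encoding attribute. SketchParser, its encoding attribute and
the regex-based tag splitting are hypothetical names and simplifications
made up for illustration; this is not sgmlop's actual API. It is written
in Python 3 spelling, where the "8-bit strings" are bytes objects.

```python
import re


class SketchParser:
    """Hypothetical parser used only to illustrate the proposal."""

    def __init__(self, encoding=None):
        # encoding=None means: hand raw 8-bit strings to the application.
        self.encoding = encoding
        self.chunks = []

    def handle_data(self, data):
        # Collect character data; a real parser would call a user handler.
        self.chunks.append(data)

    def feed(self, raw):
        # Very rough tag splitting, enough to show the decoding decision.
        for piece in re.split(rb"<[^>]*>", raw):
            if not piece:
                continue
            if self.encoding is None:
                self.handle_data(piece)  # undecoded bytes
            else:
                self.handle_data(piece.decode(self.encoding))


# The parser decodes when it is told the encoding ...
decoding = SketchParser(encoding="iso-8859-1")
decoding.feed(b"<p>D\xf6rwald</p>")

# ... and otherwise leaves decoding to the application.
raw = SketchParser(encoding=None)
raw.feed(b"<p>D\xf6rwald</p>")
text = raw.chunks[0].decode("iso-8859-1")
```

With encoding set to None the application picks the codec itself, for
example after inspecting the HTTP headers or a meta tag.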