sgmllib.py not good at handling <br/>
whisper at oz.nospamnet
Tue May 15 05:32:54 CEST 2001
In article <9dovbc01e33 at news2.newsguy.com>, aleaxit at yahoo.com says...
> "Chris Withers" <chrisw at nipltd.com> wrote in message
> > So is SGML a subset of XML?
> No! The reverse. But sgmllib does NOT cover all of SGML
> (not even any _substantial_ fraction of it: SGML is really
> huge, which is why it was subsetted to produce XML!-), just
> what little of it is needed to parse typical HTML, as
> the library reference manual says.
Well, technically speaking, XML is an _aspect_ of SGML, while HTML is an
_application_ of SGML. (In order to make XML a proper subset of SGML,
the SGML standard itself was modified a bit - sorry, don't recall all the
details off-hand any more.)
XML is a fairly substantial subset of SGML that removed (imho somewhat
too much of) the features that made SGML very difficult to implement; for
example, tag minimization and omitted tags, which made for ambiguous
parsing among other ills and which also infected HTML. (A couple of
examples where I think they went too far are the removal of conditionals
and of some content types.)
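To make the omitted-tag point concrete, here is a rough sketch rather than anything authoritative. It assumes a current Python 3, where sgmllib no longer exists, so html.parser.HTMLParser stands in as the lenient HTML-style tokenizer and xml.etree.ElementTree as the strict XML parser. The names TagLogger, html_events and is_well_formed_xml are made up for this illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagLogger(HTMLParser):
    """Hypothetical helper: records start/end tag events exactly as seen."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def html_events(text):
    """Tokenize text leniently and return the list of tag events."""
    parser = TagLogger()
    parser.feed(text)
    parser.close()
    return parser.events


def is_well_formed_xml(text):
    """Return True if a strict XML parser accepts text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# SGML-style HTML lets the </li> end tags be omitted; XML does not.
omitted = "<ul><li>one<li>two</ul>"
explicit = "<ul><li>one</li><li>two</li></ul>"

# The lenient tokenizer just reports what it sees and never infers the
# missing </li>; working out where each <li> ends is left to the caller.
print(html_events(omitted))

# The strict XML parser rejects the minimized form outright.
print(is_well_formed_xml(omitted), is_well_formed_xml(explicit))

# The same split shows up with empty elements: <br/> is fine in XML,
# while a bare HTML-style <br> is a well-formedness error there.
print(is_well_formed_xml("<p>a<br/>b</p>"), is_well_formed_xml("<p>a<br>b</p>"))
```

The lenient side accepts the input without complaint but leaves the ambiguity for someone else to resolve, while the XML side refuses anything whose structure is not spelled out. That refusal is a large part of why an XML parser is so much cheaper to write.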
XML is turning out to be the "killer application" (actually subset) for
SGML since it's made markup languages vastly more popular and widespread
than SGML ever was, again due to XML's vastly easier implementation and
thus lower cost. (Some claim that there is not yet a 100% conforming
implementation of SGML, and DSSSL, the SGML equivalent of XSL-FO, has for
sure never been 100% implemented, due to its byzantine complexity and
(imho) reliance on Lisp.)
Oops, we now return you to your regularly scheduled Python...