sgmllib.py not good at handling <br/>

David LeBlanc whisper at oz.nospamnet
Tue May 15 05:32:54 CEST 2001


In article <9dovbc01e33 at news2.newsguy.com>, aleaxit at yahoo.com says...
> "Chris Withers" <chrisw at nipltd.com> wrote in message
<snip>
> > So is SGML a subset of XML?
> 
> No!  The reverse.  But sgmllib does NOT cover all of SGML
> (not even any _substantial_ fraction of it: SGML is really
> huge, which is why it was subsetted to produce XML!-), just
> what little of it is needed to parse typical HTML, as
> the library reference manual says.
> 
> 
> Alex

Well, technically speaking, XML is an _aspect_ of SGML, while HTML is an 
_application_ of SGML. (In order to make XML a proper subset of SGML, 
the SGML standard itself was modified a bit - sorry, don't recall all the 
details off-hand any more.)

XML is a fairly substantial subset of SGML which removed (imho somewhat 
too many of) the features of SGML that made it very difficult to 
implement; for example, tag minimization and omitted tags, which made for 
ambiguous parsing among other ills and which also infected HTML. (A 
couple of examples where I think they went too far are the removal of 
conditionals and of some content types.)
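
To make the omitted-tag point concrete, here is a minimal sketch. It 
assumes a current Python, where sgmllib no longer exists, so html.parser 
stands in for it; TagLogger, html_events and is_well_formed_xml are names 
made up for this illustration. The lenient HTML tokenizer accepts a list 
whose <li> end tags were left out and simply never reports them, while an 
XML parser rejects the same text as not well-formed.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Omitted </li> end tags: allowed by SGML-style HTML, not well-formed XML.
SNIPPET = "<ul><li>one<li>two</ul>"


class TagLogger(HTMLParser):
    """Hypothetical helper: record tags exactly as the tokenizer reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def html_events(text):
    """Tokenize text leniently and return the list of tag events seen."""
    parser = TagLogger()
    parser.feed(text)
    parser.close()
    return parser.events


def is_well_formed_xml(text):
    """Return True if an XML parser accepts text, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(html_events(SNIPPET))
    print(is_well_formed_xml(SNIPPET))
    print(is_well_formed_xml("<ul><li>one</li><li>two</li></ul>"))
```

Note that the tokenizer does not guess where the missing end tags belong; 
whoever consumes the events has to decide that, which is the ambiguity 
XML avoids by requiring every element to be closed explicitly.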

XML is turning out to be the "killer application" (actually subset) for 
SGML, since it has made markup languages vastly more popular and 
widespread than SGML ever was, again due to XML's vastly easier 
implementation and thus lower cost. (Some claim that there is not yet a 
100% conforming implementation of SGML, and DSSSL, the SGML equivalent of 
XSL-FO, for sure has never been 100% implemented due to its byzantine 
complexity and (imho) reliance on Lisp.)

Oops, we now return you to your regularly scheduled Python...

Dave LeBlanc


