[XML-SIG] Corrected list of packages handling XML 1.1
Uche Ogbuji
Uche.Ogbuji at fourthought.com
Thu Sep 1 19:59:09 CEST 2005
On Thu, 2005-09-01 at 12:50 +0200, Walter Dörwald wrote:
> Ken Beesley wrote:
>
> > My apologies to Fredrik Lundh of Pythonware for the omission of
> > ElementTree+sgmlop in my recent listing of Python-XML packages that
> > handle XML 1.1. The list (that I'm aware of) currently includes:
> >
> > 1. pxdom by Andrew Clover
> >    (http://www.doxdesk.com/software/py/pxdom.html,
> >    http://www.doxdesk.com/file/software/py/pxdom.py)
> > 2. pyLTXML from the Univ. of Edinburgh
> >    (http://www.ltg.ed.ac.uk/software/xml,
> >    http://www.ltg.ed.ac.uk/software/gpl_xml.html,
> >    http://www.ltg.ed.ac.uk/software/xml/xmldoc/xmldoc.html)
> > 3. elementtree library from Pythonware
> >    (http://effbot.org/zone/element.htm,
> >    http://effbot.org/zone/element-index.htm)
> >
> > If I've forgotten anyone, please help me complete the list.
> > [...]
>
> XIST (http://www.livinglogic.de/Python/xist) handles XML 1.1 charrefs
> when a parser is used that does it. (XIST uses sgmlop by default, so it
> works by default). When serializing XML those charrefs are always
> supported. See the following snippet:
>
> >>> from ll.xist import parsers, presenters
> >>> from ll.xist.ns import html
> >>> e = parsers.parseString("<body>this is a backspace: &#8;</body>")
> >>> print e.asrepr(presenters.CodePresenter())
> ll.xist.xsc.Frag(
> ll.xist.ns.html.body(
> 'this is a backspace: \x08'
> )
> )
> >>> print e.asBytes()
> <body>this is a backspace: &#8;</body>

This conversation is really becoming surreal. People, please, it's very
simple: supporting the range of character references defined in XML 1.1
is not, repeat *NOT*, the same thing as being an XML 1.1 parser.

If I have software that parses "<a>b</a>" that does not mean I have an
XML 1.0 parser. If that software also accepts "<a>b</c>", then it is
obviously not such.
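To make the distinction concrete, here is a minimal sketch using the expat-based parser in Python's standard library. That choice is an assumption of this illustration, not a parser discussed above, and the helper name is_well_formed is made up for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the stdlib (expat-based) parser accepts the text.

    Hypothetical helper for illustration only.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Accepting the first input proves little on its own; what matters is
# that the second one, with its mismatched end tag, is rejected.
print(is_well_formed("<a>b</a>"))
print(is_well_formed("<a>b</c>"))
```

Accepting good input is the easy half of conformance. The fatal errors are what separate a parser from something that merely reads angle brackets.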

Any software that accepts "<body>this is a backspace: &#8;</body>"
is neither a compliant XML 1.0 parser nor a compliant XML 1.1 parser.
All XML 1.1 documents *must have an XML declaration* according to the
strict stipulation of the spec. If an XML 1.1 parser encounters a
document without an XML declaration, it *must* assume that it is an XML
1.0 document, at which point it would *have to* stop with a fatal error
when it encounters &#8;. Period. There is no negotiation here.

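The rule can be sketched in a few lines of Python. This is a rough illustration and not a parser: it only inspects the XML declaration and the numeric character references, it naively scans the whole text (including comments and CDATA sections), and all function names are hypothetical. The ranges are taken from the Char productions of the XML 1.0 and XML 1.1 recommendations.

```python
import re

# Hypothetical helpers for illustration; not part of any package named here.
_DECL_VERSION = re.compile(r'^<\?xml\s+version\s*=\s*["\']([^"\']+)["\']')
_CHARREF = re.compile(r'&#(x[0-9A-Fa-f]+|[0-9]+);')


def effective_version(doc):
    """Version from the XML declaration; '1.0' if there is no declaration."""
    m = _DECL_VERSION.match(doc)
    return m.group(1) if m else "1.0"


def charref_allowed(codepoint, version):
    """Check a referenced code point against the Char production."""
    if codepoint > 0x10FFFF:
        return False
    # Surrogates and the two noncharacters are excluded in both versions.
    if 0xD800 <= codepoint <= 0xDFFF or codepoint in (0xFFFE, 0xFFFF):
        return False
    if version == "1.1":
        # XML 1.1 allows references to everything except #x0.
        return codepoint >= 0x1
    # XML 1.0 allows only tab, LF and CR below #x20.
    return codepoint in (0x9, 0xA, 0xD) or codepoint >= 0x20


def illegal_charrefs(doc):
    """List the character references that are fatal errors for this doc."""
    version = effective_version(doc)
    bad = []
    for m in _CHARREF.finditer(doc):
        body = m.group(1)
        cp = int(body[1:], 16) if body.startswith("x") else int(body)
        if not charref_allowed(cp, version):
            bad.append(m.group(0))
    return bad


snippet = "<body>this is a backspace: &#8;</body>"
print(illegal_charrefs(snippet))
print(illegal_charrefs('<?xml version="1.1"?>' + snippet))
```

The same element content is a fatal error in the first call and legal in the second. The only difference is the declaration, which is exactly why a tool that ignores it cannot claim to be a parser for either version.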
Therefore, as far as I can tell, neither the ET/sgmlop trick nor XIST
is an XML 1.1 parser. I cannot speak for LTXML or pxdom, but knowing
the authors, I would guess that they are indeed compliant XML 1.1
parsers.
--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Use CSS to display XML, part 2 - http://www-128.ibm.com/developerworks/edu/x-dw-x-xmlcss2-i.html
XML Output with 4Suite & Amara - http://www.xml.com/pub/a/2005/04/20/py-xml.html
Use XSLT to prepare XML for import into OpenOffice Calc - http://www.ibm.com/developerworks/xml/library/x-oocalc/
Schema standardization for top-down semantic transparency - http://www-128.ibm.com/developerworks/xml/library/x-think31.html