[XML-SIG] Developer's Day

Paul Prescod paul@prescod.net
Sun, 19 Dec 1999 03:20:08 -0600


"Andrew M. Kuchling" wrote:
> 
> Paul Prescod writes:
> >"You told us to use Python for this million dollar system but halfway
> >through its second day of operation someone fed us a well-formed XML
> >document that crashed it."
> 
> No FUD, please.  Are there valid XML documents that xmllib chokes on?

Sure. Lots.

<!DOCTYPE slideshow [
    <!ENTITY foo "abc">
]>
<slideshow>

    &foo;
</slideshow>

Support for this construct is not optional. And there are other, similar
constructs that xmllib does not support. Another (well-known, not
xmllib-specific) issue is Unicode.
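
For comparison, here is a minimal sketch of what a conforming parser is
expected to do with that document. It assumes the expat binding exposed
as xml.parsers.expat in current Python releases; the helper name
collect_text is hypothetical and is not part of xmllib or expat. With
only a character-data handler installed, expat should expand the
internal entity &foo; to its replacement text "abc" instead of
rejecting the reference.

```python
import xml.parsers.expat

# The document from above: an internal DTD subset declares the entity
# "foo", and the body refers to it.
DOC = """<!DOCTYPE slideshow [
    <!ENTITY foo "abc">
]>
<slideshow>

    &foo;
</slideshow>"""


def collect_text(document):
    """Return all character data in the document, concatenated.

    Hypothetical helper for illustration only.
    """
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # expat may deliver text in several pieces, so gather every chunk.
    parser.CharacterDataHandler = chunks.append
    # The second argument marks this as the final piece of input.
    parser.Parse(document, True)
    return "".join(chunks)


if __name__ == "__main__":
    # Whitespace around the entity reference is stripped for display.
    print(collect_text(DOC).strip())
```

The point of the sketch is only that the entity declaration lives in
the DOCTYPE, so a parser that skips the DOCTYPE has no way to resolve
the reference.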

> Have these bugs been reported?

I didn't see it as a bug. xmllib wasn't designed to be a full-fledged
XML parser. It ignores the whole DOCTYPE, which makes it impossible to
conform to the XML spec. Had I pointed out that obvious fact through a
series of bug reports it would have been interpreted (even by me) as the
disingenuous rantings of an XML elitist. I always saw xmllib as a
stop-gap until we finished real XML parsers like xmlproc and pyexpat.
xmllib's virtue was that it came out incredibly early in the XML
standard's development process. Python was probably the first language
to support XML in some form in its standard library. Now I want to push
on and keep improving that legacy. Let's get at least Unicode, a
validating XML parser, SAX and DOM in there.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
Three things see no end: A loop with exit code done wrong
A semaphore untested, and the change that comes along
http://www.geezjan.org/humor/computers/threes.html