[XML-SIG] State of the world

Andrew Kuchling akuchlin@cnri.reston.va.us
Wed, 29 Apr 1998 09:42:10 -0400 (EDT)


Lars M. is away from his regular account, but here's a response from
him that I'm forwarding to the list.

Andrew Kuchling wrote:
> 
>         * On the xml-dev mailing list, David Megginson's Java SAX
> implementation is now at 1.0beta, and the interface has been frozen
> except for bug fixes.  Once it goes final, the Python SAX interface
> can be modified to match the frozen interface, 

I'm already working on the Python version, so I expect it to be out
quite soon.

> (Well, there will probably be a level 2 SAX
> interface someday, but that's no great concern at the moment.)

Agreed. I'll probably make some experimental extensions (clearly 
marked as such), but they will probably be released a little later.
 
>         * In the String-SIG, Martin von Loewis posted another patch
> that adds Unicode to the Python core.  I've been meaning to take a
> look at it, but haven't got around to it yet, so work on Unicode is
> still progressing, though not very quickly.

I had a quick look at your proposal and had a question I never got
round to asking: will the chr function be able to create characters
like chr(2472)?

IMHO it should, both for convenience and because it's a very natural
thing to expect.
 
> My inclination is
> that individual authors such as Lars and Stefane will always
> distribute their code as single pieces, but there will also be an
> omnibus package that contains everything -- SAX, DOM, xmltok, JPython
> code, documentation, demo programs, and anything else we can think of.
> Most users will install this package.  I'm willing to do that
> packaging job.

I think this is the way to go. I've already installed the different
packages according to the dir structure we agreed on and it looks good.
 
>         * Also, we need a single factory function for instantiating
> XML parsers, that will use xmltok if it's available, the appropriate
> Java parser in JPython, and xmllib if there's nothing more specialized
> installed.

I'm planning to add my parser factory proposal as an extension to SAX.
As I plan it, the parser creation method will first try expat, then
xmlproc, and then xmllib, but the user will be able to change this order.
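
To make the fallback idea concrete, here is a minimal sketch, not the
actual proposal. The function name find_parser_module and the driver
module names in DEFAULT_ORDER are hypothetical placeholders; the real
names depend on how the packages end up being laid out.

```python
import importlib

# Hypothetical driver module names, listed in the order described above
# (expat first, then xmlproc, then xmllib).
DEFAULT_ORDER = ["expat_driver", "xmlproc_driver", "xmllib_driver"]


def find_parser_module(order=None):
    """Return (name, module) for the first importable driver in `order`.

    `order` lets the caller override the default preference order.
    """
    candidates = order or DEFAULT_ORDER
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            # This driver is not installed; fall through to the next one.
            continue
    raise ImportError("no XML parser driver found among: " + ", ".join(candidates))
```

A caller who prefers a different order would pass its own list, e.g.
find_parser_module(["xmlproc_driver", "expat_driver"]).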
 
>         * xmltok seems to have changed names, to expat.  Probably
> the Python extension should follow suit.

And be updated. :)

A SAX driver for this is planned, although I may not be able to actually
do it until I return to Norway in a couple of weeks.
 
>         * We need to come to some resolution about handling multiple
> XML documents coming from a single input source. (This is the problem
> I ran into with xml.marshal, which prevents the code from marshalling
> two Python objects to the same file and then reading them in again.)

Actually, this may be problematic. See

http://www.xml.com/axml/notes/TrailingMisc.html

for Tim Bray's comment on this.

I think the way around it would be to have every document start with
an XML declaration. That will make conforming parsers throw an error
when the new document starts, which can (maybe with a little
extension to the parsers) be caught and used to trigger some code that
makes the parser treat the rest of the stream as a new document.
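
As a rough illustration of the declaration-as-separator idea, the sketch
below uses a simpler technique than the one described above: instead of
catching the parser's error, it pre-splits the input at each "<?xml "
declaration and parses the pieces separately. The helper names
split_documents and parse_all are made up for this example, and the
split is naive: it assumes the string "<?xml " never occurs inside a
comment or CDATA section.

```python
import xml.etree.ElementTree as ET

XML_DECL = "<?xml "  # the trailing space avoids matching e.g. <?xml-stylesheet


def split_documents(stream_text):
    """Split text at each XML declaration.

    Assumes every document in the stream starts with a declaration.
    """
    parts = []
    start = stream_text.find(XML_DECL)
    while start != -1:
        nxt = stream_text.find(XML_DECL, start + len(XML_DECL))
        end = nxt if nxt != -1 else len(stream_text)
        parts.append(stream_text[start:end].strip())
        start = nxt
    return parts


def parse_all(stream_text):
    """Parse each document in the stream and return the root elements."""
    return [ET.fromstring(doc) for doc in split_documents(stream_text)]
```

This reads the whole stream into memory first, so it only shows the
framing convention, not a streaming solution.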
 
> Anything else we need to consider?

Not that I can think of. You've covered all my worries, at least. More
software would be nice, but I guess that will appear in due time.

As for the Perl XML effort, we are definitely ahead. What they have so
far is an equivalent of PyXMLTok and a non-standard grove builder (i.e.,
a DOM equivalent).

--Lars M.