[XML-SIG] SAX namespaces discussion status
Lars Marius Garshol
larsga@garshol.priv.no
04 Jul 2000 11:41:55 +0200
I feel a need to summarize where the discussion stands and what needs
to be done, hence this posting. Basically, we have a disagreement on
how namespace names should be represented in SAX 2.0. My feeling is
that since the organization of the API is changing anyway because of
the incorporation into Python 1.6/2.0 we should make sure we have at
least rough consensus now before moving on.
Paul listed four alternatives (the fifth seems to be identical with
#4). Here is my slightly modified version of that list. The question
of qname versus prefix we can leave for later, since it is really
orthogonal to the name representation issue.
#1. def startElement( self, (uri, name), qname, attrs ):
    When namespace processing is off, (uri, name) is just the raw
    name instead.
#2. def startElement( self, (uri, localname, qname), attrs ):
#3. def startElement( self, ((uri, localname), qname), attrs ):
#4. def startElement( self, name, attrs ):
    Depending on whether you have turned on namespace processing,
    "name" is either "string" or (uri, localname, qname).
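As a rough illustration of the four shapes, here is a minimal sketch of what a handler would have to do to get at the URI, local name and qname under each alternative. The unpacking is done in the method body rather than in the parameter list so that the code does not depend on tuple-parameter syntax. The class names, the sample URI and the choice of None for "no namespace" are assumptions made for the example; this is not the actual SAX driver or handler code.

```python
# Hypothetical handlers, one per alternative. Names and values are
# invented for illustration only.
XHTML = "http://www.w3.org/1999/xhtml"  # sample namespace URI


class Handler1:
    """#1: name is (uri, localname); qname is a separate argument."""

    def startElement(self, name, qname, attrs):
        if isinstance(name, tuple):
            uri, localname = name
        else:
            # Namespace processing off: name is just the raw string.
            uri, localname = None, name
        return uri, localname, qname


class Handler2:
    """#2: name is one flat triple (uri, localname, qname)."""

    def startElement(self, name, attrs):
        uri, localname, qname = name
        return uri, localname, qname


class Handler3:
    """#3: name is nested as ((uri, localname), qname)."""

    def startElement(self, name, attrs):
        (uri, localname), qname = name
        return uri, localname, qname


class Handler4:
    """#4: name is a string or a triple, depending on the parser mode."""

    def startElement(self, name, attrs):
        if isinstance(name, tuple):
            uri, localname, qname = name
        else:
            uri, localname, qname = None, name, name
        return uri, localname, qname


attrs = {}
r1 = Handler1().startElement((XHTML, "p"), "html:p", attrs)
r1_raw = Handler1().startElement("p", "p", attrs)
r2 = Handler2().startElement((XHTML, "p", "html:p"), attrs)
r3 = Handler3().startElement(((XHTML, "p"), "html:p"), attrs)
r4 = Handler4().startElement((XHTML, "p", "html:p"), attrs)
r4_raw = Handler4().startElement("p", attrs)
```

In every case the handler ends up with the same three pieces of information; the alternatives differ only in how they are grouped on the way in.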
Here #1 is the current SAX 2.0 interface and #2 is what Paul
implemented for Python 2.0. As near as I can tell, current positions
are:
- me: #1
- Paul: #2
- Greg: #1 or #3
- Uche: #1, pending further discussion
The reasons I prefer #1 are that
- it collects the logical name (in both the namespace view and the
XML 1.0 view) into a single value, which seems like The Right Thing
to me
- it is easier for novices to understand how to use this API
correctly
- it is easier for programmers who use the SAX 2.0 interface directly.
I do this all the time, and I believe others will do the same, so
for me this is an important consideration.
As near as I can tell, these are Paul's arguments against it:
- it breaks backwards compatibility
- SAX convenience is not important
- performance for higher layers
Below are my responses to his arguments:
I don't think the backwards compatibility argument carries much
weight. Names have changed anyway, and when rewriting the code,
adapting the startElement / endElement methods is very little work.
At least it was for me, and I've rewritten heaps of example code for
my book to do just this.
I think SAX convenience matters, but I agree that convenience
arguments carry less weight. However, to me this is also a matter of
rightness. In the namespace view, element names consist of two parts:
URI and local name. The #1 representation reflects that very clearly,
while #2 obscures it.
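A small sketch of what this can mean in practice, assuming names are plain tuples as in the list above. The namespace URI, the prefixes and the dispatch table are invented for the illustration, and the comparison only shows how ordinary tuples behave, not how any SAX driver works.

```python
NS = "http://example.org/ns"  # made-up namespace URI

# The same element type written with two different prefixes,
# a:item and b:item, both bound to NS.
name1_a = (NS, "item")            # #1: logical name only
name1_b = (NS, "item")
name2_a = (NS, "item", "a:item")  # #2: qname travels inside the tuple
name2_b = (NS, "item", "b:item")

# Under #1 the value passed to the handler is the logical name itself.
same_under_1 = name1_a == name1_b

# Under #2 the two triples differ, so the logical name has to be
# sliced out before comparing.
same_under_2 = name2_a == name2_b
same_under_2_sliced = name2_a[:2] == name2_b[:2]

# The same applies when the name is used as a dictionary key.
dispatch = {(NS, "item"): "handle_item"}  # hypothetical dispatch table
found_1 = dispatch.get(name1_a)
found_2 = dispatch.get(name2_a)
found_2_sliced = dispatch.get(name2_a[:2])
```

With #2 the information is all there, but the handler has to know that only the first two items make up the element name in the namespace view.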
Performance does of course matter, but I don't see how #2 improves it.
The necessary information is available in both #1 and #2, and access
to it is more or less identical. If the problem is that extracting
the information from the Attributes interface is too slow, then let us
look into what is needed and see how we can best provide that.
Hoping to settle this issue once and for all,
--Lars M.