[XML-SIG] SAX in Python 1.6

Lars Marius Garshol larsga@garshol.priv.no
30 Jun 2000 18:26:04 +0200


* Lars Marius Garshol
|
| IMHO we should use the namespace support that is built in to expat.
| Anything else is bound to slow us down.

* Paul Prescod
| 
| Unfortunately expat's namespace support is broken from the point of
| view of SAX and DOM.

I know, but it's much better to simply modify the output from expat
(preferably in the C source) than to implement namespaces in Python.
Remember: we have to map from the 'uri localname' string to a tuple for
every single tag in the entire XML document. That is going to cost an
appreciable amount of performance if it is done in Python, no matter
how well you implement it.

Anything you do once for every element has a performance impact. This
mapping is done twice, and it's rather complex.
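
To make the per-tag cost concrete, here is a rough sketch of the mapping
done in Python. It assumes the current standard-library
xml.parsers.expat module with namespace_separator set to a space, so
that names arrive as 'uri localname'. The names split_name and
collect_events are made up for illustration, and the guess that "twice"
means once per start tag and once per end tag is only an assumption.

```python
import xml.parsers.expat


def split_name(name, sep=" "):
    """Map expat's 'uri localname' string to a (uri, localname) tuple."""
    uri, found, local = name.partition(sep)
    if not found:
        # No separator: the name is not in any namespace.
        return (None, name)
    return (uri, local)


def collect_events(data):
    """Parse data and return (kind, (uri, localname)) for every tag."""
    events = []
    # namespace_separator turns on expat's own namespace processing.
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")

    def start(name, attrs):
        # One Python-level split per start tag ...
        events.append(("start", split_name(name)))

    def end(name):
        # ... and another one per end tag.
        events.append(("end", split_name(name)))

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(data, True)
    return events
```

Even this small split runs in the interpreter for every tag in the
document, which is the overhead that doing the work in C would avoid.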
 
* Lars Marius Garshol
|
| Whoops! parseFile() no longer exists! We now use the InputSource class
| instead. 
 
* Paul Prescod
|
| InputSource seemed like overkill to me. More of a Java-ish type
| safety thing. I'd appreciate your opinion.

This was what I thought initially as well, but it turns out that
InputSource is in fact extremely useful. The trouble is that getting a
stream is not enough in the general case. You need to know the base
URI. You may want to know the public id. You may need to know the
encoding. 

InputSource is very handy in that it bundles all that information in a
single object, making both parse(...) and resolveEntity(...) much more
elegant than they would otherwise be.
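
As an illustration of the bundling, here is a minimal sketch using the
InputSource class from today's standard-library xml.sax.xmlreader
module, which may differ in detail from the draft under discussion. The
helper make_source, the URL and the public id are invented for the
example.

```python
import io
from xml.sax.xmlreader import InputSource


def make_source(stream, system_id, public_id=None, encoding=None):
    """Bundle a byte stream with the metadata a parser may need."""
    source = InputSource(system_id)  # system id doubles as the base URI
    source.setByteStream(stream)
    if public_id is not None:
        source.setPublicId(public_id)
    if encoding is not None:
        source.setEncoding(encoding)
    return source


# Hypothetical values, purely for illustration.
source = make_source(
    io.BytesIO(b"<doc/>"),
    "http://example.org/docs/doc.xml",
    public_id="-//Example//DTD Doc//EN",
    encoding="utf-8",
)
```

A single object like this can be handed to parse(...) or returned from
resolveEntity(...), instead of passing four loosely related arguments
around.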
 
| In my opinion, parse() should accept a string or a stream. If a
| string, it should be treated as a URL or filename and opened.

Accepting a string is what it does right now. I think streams should
not be accepted directly, but a convenience function or method for
them is OK.
 
| We will also provide a convenience method parseString() that parses an
| XML string (probably by wrapping it in a cStringIO).

Sounds good, as did the rest of the mail.
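
The convenience functions could look roughly like the sketch below. It
is written against today's standard-library xml.sax, with io.BytesIO
standing in for cStringIO (which no longer exists in Python 3). The
names parse_stream, parse_string and NameCollector are hypothetical,
and the string is assumed to be UTF-8 compatible, with no conflicting
encoding declaration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class NameCollector(ContentHandler):
    """Tiny handler that records element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_stream(stream, handler, system_id=None):
    """Convenience wrapper: put a byte stream into an InputSource."""
    source = InputSource(system_id)
    source.setByteStream(stream)
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.parse(source)


def parse_string(text, handler):
    """Convenience wrapper: parse XML held in a string."""
    parse_stream(io.BytesIO(text.encode("utf-8")), handler)


handler = NameCollector()
parse_string("<a><b/><c/></a>", handler)
```

The parser itself only ever sees an InputSource here; the stream and
string variants are thin wrappers on top of it.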

--Lars M.