[XML-SIG] SAX 2.0, again

THOMAS PASSIN tpassin@idsonline.com
Tue, 22 Feb 2000 21:29:25 -0500


Lars Marius Garshol wrote, replying to my post:

<snip/>
> | The application shouldn't have to figure out the structure before it
> | can even extract the value.  So I don't think the xml name should be
> | a tuple if it has a declared namespace but a string if there is no
> | namespace.
>
> This is a valid point, unless we can work around the problem somehow.
>
> | With this in mind, how about
> |
> | ((prefix,localpart),uri)
>
> For performance and convenience it would be better to do this as
>
>   (prefix, localpart, uri)
>
> but I agree that this is better than
>
>   (uri, localpart, rawname)
>
> since you rarely want the rawname anyway, and when you want it you can
> get it from the prefix + localpart.
>
> The only problem I have with this is that it means that names with
> different prefixes do not compare as equal. This is why I would prefer
> to have the prefix reported somewhere else. (Any good ideas for where?)
>
OK, what about (prefix, (localpart, uri))?  Then we compare names with
names_compare = (name1[1] == name2[1]).  Since two names are the same by
definition if their localpart and namespace URI are identical, this should
work fine, and the prefix is still there, tagging along for the ride.  As for
performance, you know far more about Python performance than I do.  But maybe
some analysis would help: say we are processing 10,000 elements using SAX
with some typical element-processing methods.  What fraction of the total
processing time would be lost by using this structure and name test instead
of some optimized structure?  If the loss is, say, 5%, I say don't worry
about it one little bit.  If it's 25% of the ***overall*** processing time,
that is probably too much.
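
To make the comparison concrete, here is a minimal sketch, assuming names are
plain Python tuples.  The element names and URIs are only examples, and using
None for a missing prefix or namespace is my own assumption, not something
anyone has agreed on.

```python
# Example names in the proposed (prefix, (localpart, uri)) layout.
name1 = ("xsl", ("template", "http://www.w3.org/1999/XSL/Transform"))
name2 = ("x", ("template", "http://www.w3.org/1999/XSL/Transform"))
# Assumption: a name without a declared namespace keeps the same shape,
# with None in the empty slots.
name3 = (None, ("template", None))


def names_compare(a, b):
    """Names match when (localpart, uri) match; the prefix is ignored."""
    return a[1] == b[1]


same_ns = names_compare(name1, name2)  # different prefix, same namespace
diff_ns = names_compare(name1, name3)  # same localpart, different namespace
```

Comparing the whole tuples (name1 == name2) would still fail because the
prefixes differ, which is why the test looks only at index 1.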

Who can shed some reasonably definitive light on this?
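
One rough way to get a number is a micro-benchmark along these lines.  It is
only a sketch: the URI, the element names, and the flat (localpart, uri)
layout used as the "optimized" baseline are all made up for illustration, and
it times the name test alone, not a real SAX run.

```python
import timeit

URI = "http://example.org/ns"  # made-up namespace URI

# 10,000 hypothetical element names in each layout.
nested = [("p", ("elem%d" % (i % 50), URI)) for i in range(10000)]
flat = [("elem%d" % (i % 50), URI) for i in range(10000)]

target_nested = ("q", ("elem7", URI))  # different prefix on purpose
target_flat = ("elem7", URI)


def count_nested():
    # Proposed test: compare only the (localpart, uri) part.
    return sum(1 for n in nested if n[1] == target_nested[1])


def count_flat():
    # Baseline: the name is the (localpart, uri) tuple itself.
    return sum(1 for n in flat if n == target_flat)


nested_time = timeit.timeit(count_nested, number=20)
flat_time = timeit.timeit(count_flat, number=20)
print("nested: %.4fs  flat: %.4fs" % (nested_time, flat_time))
```

The results will vary with the interpreter and the machine.  To turn them
into a fraction of the ***overall*** time, the difference would have to be
set against the time for a complete parse of a comparable document.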

Regards,

Tom Passin