[XML-SIG] XPath's reliance on id()
Martin v. Loewis
martin@v.loewis.de
14 Mar 2002 18:43:12 +0100
Martijn Faassen <faassen@vet.uu.nl> writes:
> > ??? x,y ??? Python-Object: x = y ??? hash(x) = hash(y)
>
> This seems to have gotten somewhat mangled here; I get three question
> marks and this must've been some symbol?
This was an attempt to put FOR ALL, ELEMENT OF, and RIGHT ARROW into
an email message; it seems it failed. Spelled out: for all x, y in
Python-Object, x = y implies hash(x) = hash(y).
> Anyway, perhaps the notion of equality is what we need; in my mind two
> objects can stand in for the same DOM node but not be the same object;
> they're equal but not identical.
Strictly speaking, the DOM spec does not guarantee equality of nodes.
If anything, it guarantees that identity works.
> The notion for equality in DOM nodes is actually supported by the
> DOM level 3 working draft:
>
> """
> isSameNode (introduced in DOM Level 3)
It is the notion of "sameness" that is supported. The Python mapping
could mandate that == for nodes holds iff isSameNode holds, but it
currently doesn't.
Notice that they also have isEqualNode; this is *not* what we want.
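To illustrate what such a mandate would mean for proxy objects, here is
a minimal sketch. NodeProxy and its _target attribute are hypothetical
names, not part of any existing DOM implementation; the only assumption
is that each proxy holds a reference to the one underlying node object.

```python
class NodeProxy:
    """Hypothetical proxy; several proxies may wrap one underlying node."""

    def __init__(self, target):
        # The real node object this proxy stands in for (an assumption).
        self._target = target

    def isSameNode(self, other):
        # Sameness: both proxies refer to the very same underlying node.
        return isinstance(other, NodeProxy) and self._target is other._target

    def __eq__(self, other):
        # == holds iff isSameNode holds.
        if not isinstance(other, NodeProxy):
            return NotImplemented
        return self.isSameNode(other)

    def __hash__(self):
        # Must agree with __eq__: same underlying node, same hash.
        return id(self._target)


real_node = object()            # placeholder for an actual DOM node
a = NodeProxy(real_node)
b = NodeProxy(real_node)
c = NodeProxy(object())
unique = {a, b, c}              # a and b collapse into one set entry
```

With this, a and b are different Python objects that compare equal and
hash alike, so they behave as one key in sets and dictionaries.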
> Then again, I just found out they have a compareTreePosition()
> method added to the Node interface that we could use for sorting
> purposes, I think..
Indeed. Then it would be up to the DOM implementation to make that
happen. This sounds like the cleanest approach to me.
> But that is in fact what is needed in this case; I have many different
> proxy objects which may all map to the same actual DOM node, so they'd
> have the same __hash__. But perhaps the other implications of __hash__
> break that. What about supplying a 'key' attribute, anticipating DOM
> level 3 vague implications? :)
If we mandate DOM3 features, I think we should use the feature that
apparently was explicitly added for XSLT document order:
compareTreePosition.
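As a rough sketch of how sorting into document order could look once a
position comparison is available: compare_document_order below is a
hypothetical stand-in, not the DOM3 method itself. It is written against
xml.dom.minidom, derives the order from child indexes, and does not
handle attribute nodes.

```python
from functools import cmp_to_key
from xml.dom import minidom


def tree_path(node):
    """Return the child indexes leading from the document down to node."""
    path = []
    while node.parentNode is not None:
        siblings = node.parentNode.childNodes
        # Locate this exact node by identity rather than by ==.
        index = next(i for i, s in enumerate(siblings) if s is node)
        path.append(index)
        node = node.parentNode
    path.reverse()
    return path


def compare_document_order(a, b):
    """Hypothetical comparison: negative if a precedes b, positive if after."""
    pa, pb = tree_path(a), tree_path(b)
    return (pa > pb) - (pa < pb)


doc = minidom.parseString("<r><a><b/></a><c/></r>")
# Collect nodes deliberately out of document order.
nodes = (list(doc.getElementsByTagName("c"))
         + list(doc.getElementsByTagName("b"))
         + list(doc.getElementsByTagName("a")))
ordered = sorted(nodes, key=cmp_to_key(compare_document_order))
names = [n.nodeName for n in ordered]
```

A DOM implementation that knows its own storage layout could answer the
same question far more cheaply than walking up to the root each time,
which is the argument for leaving it to the implementation.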
> I don't think it's reasonable to give those inner nodes the same
> hash value at all. They're not the same node, and shouldn't hash the
> same way.
They are equal nodes (in the sense of isEqualNode), so I see no reason
why the hashes should be different. If I were to implement a hash of a
node, I'd use the formula

    def node_hash(node):
        # Named node_hash so that the built-in hash() is not shadowed.
        res = hash(node.nodeType) + hash(node.nodeName)
        for c in node.childNodes:
            res += node_hash(c)
        return res
> I don't see any reason to make two different nodes hash the same way just
> because they have the same name.
They have the same name, the same type, and the same content. They
really are equal.
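A small sketch of that claim, assuming xml.dom.minidom as the DOM and
using structural_hash as an illustrative name for the formula: two
distinct node objects with the same type, name, and children get the
same hash value.

```python
from xml.dom import minidom


def structural_hash(node):
    # Same formula as above: type and name, plus the hashes of all children.
    res = hash(node.nodeType) + hash(node.nodeName)
    for child in node.childNodes:
        res += structural_hash(child)
    return res


doc = minidom.parseString("<r><a><b/></a><a><b/></a></r>")
first, second = doc.documentElement.childNodes

same_hash = structural_hash(first) == structural_hash(second)
distinct_objects = first is not second
```

Equal hashes do not make the nodes identical; they only satisfy the rule
that equal objects must hash alike. Note that this formula looks at node
type, name, and children only, so attributes and text data do not
influence the result.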
Regards,
Martin