[XML-SIG] XPath's reliance on id()

Martin v. Loewis martin@v.loewis.de
13 Mar 2002 20:57:46 +0100


Martijn Faassen <faassen@vet.uu.nl> writes:

> > I think this approach (for determining document order) really does
> > need the notion of identity, so it's hard to believe that this could
> > actually work.
>
> Doesn't __hash__ imply there is a notion of identity?

No, it implies equality. Actually, it is the other way 'round:
equality implies equal hash values:

∀ x,y ∈ Python-Object: x = y ⇒ hash(x) = hash(y)
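
A quick way to convince yourself of that direction, using only built-in
types (a minimal sketch; the variable names are just for illustration):

```python
# Pairs of objects that compare equal, some of them of different types.
pairs = [
    (1, 1.0),            # int vs. float
    (0, False),          # int vs. bool
    ("ab", "a" + "b"),   # two equal strings
    ((1, 2), (1, 2)),    # two equal tuples
]

# For every pair: x == y, and therefore hash(x) == hash(y).
all_equal = all(x == y for x, y in pairs)
all_same_hash = all(hash(x) == hash(y) for x, y in pairs)
```

Nothing in this says anything about the reverse direction; equal hash
values by themselves do not make two objects equal, let alone identical.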

>   If a class does not define a __cmp__() method it should not define a
>   __hash__() operation either;
>
> I don't understand why this must be so.

That is surprising indeed; I had expected that it is the other
way 'round: If you implement __cmp__, you also need to implement
__hash__.

> The point is simply that id() is not customizable, and hash() is.

The point is that hash() is tied to cmp() through the dictionary
implementation, and that conceptually, hash() maps the large set of
objects to the much smaller set of numbers, in a many-to-one way
(i.e. different objects may have equal hash values). Typically, the
hash of an object is computed from its state.
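
To illustrate, here is a minimal sketch with a made-up class (Tag and
its attributes are hypothetical, and it uses __eq__ rather than
__cmp__). The hash is computed from part of the state only, so
collisions are easy to produce, and the dictionary falls back on
equality to tell the keys apart:

```python
class Tag:
    """Hypothetical value object: equality and hash both derive from state."""

    def __init__(self, name, weight):
        self.name = name
        self.weight = weight

    def __eq__(self, other):
        return (isinstance(other, Tag)
                and (self.name, self.weight) == (other.name, other.weight))

    def __hash__(self):
        # Deliberately coarse: only the name contributes, so objects that
        # differ in weight still share a hash value.
        return hash(self.name)


a = Tag("inner", 1)
b = Tag("inner", 2)   # same hash as a, but not equal to a
c = Tag("inner", 1)   # equal to a, yet a separate object

# The dictionary keeps a and b apart by comparing them, not by their hash.
table = {a: "first", b: "second"}
found_via_equal_key = table[c]   # c == a, so this finds a's entry
```

Note that a and c are equal and hash equally, but id(a) and id(c) still
differ: neither equality nor hash() gives you identity.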

> > >   * we need to alter the code in xpath/Util.py so as not to rely on
> > >     id() anymore but to use the node objects as hash keys directly.
> >
> > Then doing what? What happens if you have two nodes that have the same
> > hash value, but which are different?
>
> I reconsidered later; I simply want to use hash() instead of id. In that
> case you don't have a problem with two objects having the same hash value
> but are different. If they have the same hash value they're identical.

Not at all. Consider

<outer>
  <inner/>
  <inner/>
</outer>

The two inner elements may reasonably have the same hash value. They
must not be considered identical, since one is before the other in
document order.
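
A minimal sketch of the problem, using xml.dom.minidom from the standard
library as a stand-in (this is not the code in xpath/Util.py, and
state_hash and walk are hypothetical helpers written just for this
example):

```python
from xml.dom import minidom

doc = minidom.parseString("<outer><inner/><inner/></outer>")
first, second = doc.getElementsByTagName("inner")


def state_hash(node):
    # Hypothetical state-based hash: depends only on the serialized markup.
    return hash(node.toxml())


def walk(node):
    # Yield the node and all its descendants in document order.
    yield node
    for child in node.childNodes:
        yield from walk(child)


# Map each node to its position in document order, keyed two ways.
by_hash = {state_hash(n): pos for pos, n in enumerate(walk(doc))}
by_id = {id(n): pos for pos, n in enumerate(walk(doc))}

# Keyed by identity, the two elements keep distinct positions.
ordered_by_id = by_id[id(first)] < by_id[id(second)]

# Keyed by the state-based hash, they collapse into a single entry.
collapsed = by_hash[state_hash(first)] == by_hash[state_hash(second)]
```

With the identity-based table, the first inner element sorts before the
second. With the hash-based table, both look up the same position, so
their document order can no longer be determined.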

Regards,
Martin