Python not a Very High-Level Language?

Tim Peters tim_one at email.msn.com
Tue Jan 4 01:13:58 EST 2000


[Tres Seaver]
> What reasonable use can you propose for an associative
> container which allows mutable keys?  You might as well revert
> to a list of (key,value) pairs, since you're going to do a
> linear search on each lookup anyway if the keys are allowed to
> change.

[Neel Krishnaswami]
> You can compute the hash function using the object's identity,
> rather than its value. IOW, hash on the object's address. Then
> you can mutate the contents of the object as needed and still
> access the hash. In fact, I believe this is the default behavior
> for Common Lisp hash tables (though of course you can create hash
> tables with different hash functions).
> ...

[Gordon McMillan]
> Well, you could, but then you have to pass around the one-
> and-only key to access the value. Seems more sensible, as
> Tres said, to pass around (key, value) instead and just forgo
> the hash.

Na, there are lots of uses for this.

Simple one:  you have a list of lists (or of any other kind of mutable
object) and want to weed out duplicates.  If it were a list of (say) ints,

def weed(bag):
    temp = {}
    for x in bag:
        temp[x] = 1
    return temp.keys()

is obvious and fast.  The same algorithm (although not this Python spelling
of it) is safe & effective for any kind of objects with hash functions that
don't themselves mutate the objects.

Fancy one:  In state space searches where the operators are reversible
and/or the objects have exploitable symmetries, the natural way to detect
cycles and reduce redundant search is to make the next-state generator
return canonical representatives (kinda like interning a string in Python
today).  Although Python won't let you spell it this way directly, keeping a
dict of representatives (viewed as a set) is the "obvious and fast" way to
do that in the generator.

Outside the generator, in this case you are indeed passing around "the
one-and-only key", so info can be stored in the object; but you still want
to do stuff like e.g. associate info with *pairs* of objects via

    dict[obj1, obj2] = info

notation.  I have yet to find a really slick way to do that in Python.  For
purposes of the generator, I wrap stuff in classes with

    def __hash__(self):
        return something based on all the \
            relevant members of self

    def __cmp__(self, other):
        likewise

but *outside* the generator wrap in classes with

    def __hash__(self):
        return id(self)

    def __cmp__(self, other):
        return cmp(id(self), id(other))

The problem is that these two views of the objects aren't that sharply
separated in real life, and stupid mistakes are common.

The "proper" ways to hash and compare really depends on the use to which
each dict is being put, not on the objects therein.

just-because-you-*can*-mutate-something-doesn't-mean-you-
    will-as-all-your-worst-habits-can-testify<wink>-ly y'rs  - tim






More information about the Python-list mailing list