[Python-Dev] Weird use of hash() -- will this work?

Guido van Rossum guido@digicool.com
Thu, 18 Jan 2001 19:52:02 -0500


> So I'm writing a module that needs to generate unique cookies.  The
> module will run inside one of two environments: (1) a trivial test wrapper,
> not threaded, and (2) a long-running multithreaded server.
> 
> Because Python garbage-collects, hash() of a just-created object isn't
> good enough.  Because we may be threading, millisecond time isn't
> good enough.  Because we may *not* be threading, thread ID isn't good
> either.  
> 
> On the other hand, I'm on Linux getting millisecond time resolution.
> And it's not hard to notice that an object hash is a memory address.
> 
> So, how about `time.time()` + hex(hash([]))?
> 
> It looks to me like this will remain unique forever, because another thread
> would have to create an object at the same memory address during the same
> millisecond to collide.
> 
> Furthermore, it looks to me like this hack might be portable to any OS
> with a clock tick shorter than its timeslice.

Argh!  hash([]) should raise TypeError, since lists are not hashable
objects -- mutable objects can't be allowed as dictionary keys.  For a
brief period after I checked in the rich comparisons, hash([])
accidentally returned a value -- I've fixed that now.

But not to worry: instead of using hash([]), you can use hex(id([])).
Same thing.
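
To make the difference concrete, here is a minimal sketch.  The helper
names list_is_hashable and address_token are made up for illustration;
only hash(), id() and hex() are real built-ins.  It assumes CPython,
where id() of an object is derived from its memory address.

```python
def list_is_hashable():
    """Return True if hash([]) succeeds, False if it raises TypeError."""
    try:
        hash([])
    except TypeError:
        # Lists are mutable, so they refuse to be hashed.
        return False
    return True


def address_token():
    """Return hex(id([])): the identity of a throwaway list, as a string.

    Hypothetical helper.  The list is discarded as soon as id() returns,
    so nothing keeps this value reserved afterwards.
    """
    return hex(id([]))
```

Calling address_token() always gives back a hex string, but nothing here
promises that two calls give different strings.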

On the other hand, remember how much you can do in a millisecond!
(E.g. I can call tempfile.mktemp() 5 times in that time.)  And when
you create an object and immediately delete it, the next object
created is very likely to have the same address.
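
The address reuse is easy to observe for yourself.  The sketch below
(count_address_reuse is a hypothetical name) creates throwaway lists one
after another and counts how often a list gets the same id() as the one
just before it.  The result depends on the interpreter and its allocator,
so treat it as an experiment rather than a guarantee.

```python
def count_address_reuse(trials=1000):
    """Count how often consecutive throwaway lists share the same id().

    Each list is garbage as soon as id() returns, so its memory can be
    handed straight to the next list that gets created.
    """
    reused = 0
    previous = id([])
    for _ in range(trials):
        current = id([])
        if current == previous:
            reused += 1
        previous = current
    return reused
```

Any nonzero count means that id([]) alone cannot serve as a unique
cookie, even within a single thread.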

But what's wrong with this:

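    try:
        # Threaded build: use the current thread's identifier.
        from thread import get_ident as unique_id
    except ImportError:
        # No thread module available: fall back to an object address.
        def unique_id(): return id([])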

--Guido van Rossum (home page: http://www.python.org/~guido/)