multi-Singleton-like using __new__

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Sat Feb 9 01:36:48 CET 2008


On Sat, 09 Feb 2008 00:04:26 +0000, Matt Nordhoff wrote:

> At worst, in and has_key are "about the same".

Except that using has_key() means making an attribute lookup and then a 
method call, both of which take time.
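
A rough way to see that extra step is to compare the bytecode for the two spellings. This is only a sketch: it assumes a Python 3 interpreter, where has_key() no longer exists, so d.__contains__(k) stands in for it as "the same test spelled as a method call". The helper names opnames and has_attribute_lookup are made up for the illustration, and the exact instruction names vary between Python versions.

```python
import dis


def opnames(expr):
    """Return the bytecode instruction names for a single expression."""
    code = compile(expr, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


def has_attribute_lookup(ops):
    """True if the instruction list contains an attribute/method load."""
    return any(name.startswith(("LOAD_ATTR", "LOAD_METHOD")) for name in ops)


# "in" compiles to a dedicated containment/comparison instruction.
in_ops = opnames("k in d")

# The method spelling has to look the method up on d before calling it.
# (Assumption: d.__contains__ is a stand-in for the Python 2 d.has_key.)
method_ops = opnames("d.__contains__(k)")

print("k in d            ->", in_ops)
print("d.__contains__(k) ->", method_ops)
print("attribute lookup in 'in' form:    ", has_attribute_lookup(in_ops))
print("attribute lookup in method form:  ", has_attribute_lookup(method_ops))
```

The method form also ends in a call instruction, which the in form does not need.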

I'm kinda curious why you say they're "about the same" when your own 
timing results contradict that. Here they are again, exactly as you 
posted them:


$ python -m timeit -s "d = dict.fromkeys(xrange(5))" "4 in d"
1000000 loops, best of 3: 0.233 usec per loop
$ python -m timeit -s "d = dict.fromkeys(xrange(5))" "d.has_key(4)"
1000000 loops, best of 3: 0.321 usec per loop

For a small dict, a successful search using in is about 1.4 times faster 
than using has_key().


$ python -m timeit -s "d = dict.fromkeys(xrange(500000))" "499999 in d"
1000000 loops, best of 3: 0.253 usec per loop
$ python -m timeit -s "d = dict.fromkeys(xrange(500000))" "d.has_key(499999)"
1000000 loops, best of 3: 0.391 usec per loop

For a large dict, a successful search using in is about 1.5 times faster 
than using has_key().


$ python -m timeit -s "d = dict.fromkeys(xrange(500000))" "1000000 in d"
1000000 loops, best of 3: 0.208 usec per loop
$ python -m timeit -s "d = dict.fromkeys(xrange(500000))" "d.has_key(1000000)"
1000000 loops, best of 3: 0.324 usec per loop

For a large dict, an unsuccessful search using in is also about 1.5 times 
faster than using has_key().


Or, to put it another way, has_key() takes about 40-60% longer than in.
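
For anyone who wants to check the arithmetic, here is a small sketch that recomputes the ratios from the figures quoted above. The numbers are copied from the timeit output in this post; the labels and variable names are just for illustration.

```python
# (time for "in", time for has_key()) in microseconds per loop,
# copied from the timeit runs quoted in this post.
timings = {
    "small dict, hit": (0.233, 0.321),
    "large dict, hit": (0.253, 0.391),
    "large dict, miss": (0.208, 0.324),
}

# How many times longer has_key() took than "in" for each case.
ratios = {
    label: t_has_key / t_in
    for label, (t_in, t_has_key) in timings.items()
}

for label, ratio in ratios.items():
    extra = (ratio - 1) * 100
    print("%-17s has_key()/in = %.2f  (%.0f%% longer)" % (label, ratio, extra))
```

The three ratios fall roughly between 1.4 and 1.6, which is where the 40-60% figure comes from.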

Now, if you want to argue that the difference between 0.3 microseconds 
and 0.2 microseconds is insignificant, I'd agree with you -- for a single 
lookup. But if you have a loop where you're doing large numbers of 
lookups, using in will be a significant optimization.
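
To get a feel for the loop case, something like the following can be timed. It is a sketch under assumptions, not a benchmark: it targets Python 3, so d.__contains__(k) is used in place of has_key(), and the key set, function names and repeat counts are arbitrary choices for the example. Actual timings will depend on the machine and the interpreter version.

```python
import timeit

# Same size of dict as in the runs above; range replaces xrange on Python 3.
d = dict.fromkeys(range(500000))

# 10000 probe keys, half of them present in d and half missing.
keys = list(range(0, 1000000, 100))


def count_with_in():
    """Count the probe keys found in d using the 'in' operator."""
    return sum(1 for k in keys if k in d)


def count_with_method():
    """Same count, but through a method lookup and call on every test.

    Assumption: d.__contains__ stands in for the Python 2 d.has_key.
    """
    return sum(1 for k in keys if d.__contains__(k))


if __name__ == "__main__":
    for fn in (count_with_in, count_with_method):
        # Best of 3 runs, 20 passes over the probe keys per run.
        best = min(timeit.repeat(fn, number=20, repeat=3))
        print("%-18s %.4f s for 20 passes" % (fn.__name__, best))
```

Both functions return the same count; only the way the membership test is spelled differs, so any gap in the timings is down to the lookup and call overhead.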



-- 
Steven


