multi-Singleton-like using __new__
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Fri Feb 8 19:36:48 EST 2008
On Sat, 09 Feb 2008 00:04:26 +0000, Matt Nordhoff wrote:
> At worst, in and has_key are "about the same".
Except that using has_key() means making an attribute lookup, which takes
time.
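That attribute-lookup cost is easy to observe directly. A minimal sketch using the `timeit` module (Python 3 syntax, where `has_key()` no longer exists, so `d.__contains__` stands in for the method-call case; the exact numbers will vary by machine):

```python
import timeit

setup = "d = dict.fromkeys(range(5))"

# Membership test via the `in` operator: compiled to a single
# comparison bytecode, no attribute lookup in the timed statement.
t_in = timeit.timeit("4 in d", setup=setup)

# The equivalent method call has to look up __contains__ on the
# dict every time before calling it.
t_method = timeit.timeit("d.__contains__(4)", setup=setup)

# Pre-binding the method hoists the attribute lookup into setup,
# so only the call itself is timed.
t_bound = timeit.timeit("c(4)", setup=setup + "; c = d.__contains__")

print(t_in, t_method, t_bound)
```

The `in` and pre-bound variants should both come in under the per-call method lookup, which is the overhead being measured here.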
I'm kinda curious why you say they're "about the same" when your own
timing results contradict that. Here they are again, exactly as you
posted them:
$ python -m timeit -s "d = dict.fromkeys(xrange(5))" "4 in d"
1000000 loops, best of 3: 0.233 usec per loop
$ python -m timeit -s "d = dict.fromkeys(xrange(5))" "d.has_key(4)"
1000000 loops, best of 3: 0.321 usec per loop
For a small dict, a successful search using in is about 1.3 times faster
than using has_key().
$ python -m timeit -s "d = dict.fromkeys(xrange(500000))" "499999 in d"
1000000 loops, best of 3: 0.253 usec per loop
$ python -m timeit -s "d = dict.fromkeys(xrange(500000))" "d.has_key(499999)"
1000000 loops, best of 3: 0.391 usec per loop
For a large dict, a successful search using in is about 1.5 times faster
than using has_key().
$ python -m timeit -s "d = dict.fromkeys(xrange(500000))" "1000000 in d"
1000000 loops, best of 3: 0.208 usec per loop
$ python -m timeit -s "d = dict.fromkeys(xrange(500000))" "d.has_key(1000000)"
1000000 loops, best of 3: 0.324 usec per loop
For a large dict, an unsuccessful search using in is also about 1.5 times
faster than using has_key().
Or, to put it another way, has_key() takes about 40-60% longer than in.
Now, if you want to argue that the difference between 0.3 microseconds
and 0.2 microseconds is insignificant, I'd agree with you -- for a single
lookup. But if you have a loop where you're doing large numbers of
lookups, using in will be a significant optimization.
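To make the loop case concrete, here's an illustrative sketch (Python 3 syntax, made-up sizes) of the kind of hot loop where the per-lookup saving adds up:

```python
# A large dict whose keys are 0 .. 499999.
d = dict.fromkeys(range(500_000))

# One membership test per candidate key, a million times over.
# With `in`, each test is a single operation; a has_key()-style
# method call would pay an extra attribute lookup on every
# iteration of this loop.
hits = sum(1 for k in range(1_000_000) if k in d)
print(hits)  # -> 500000: only the first half of the candidates are keys
```

At a million iterations, even a tenth of a microsecond per lookup is a tenth of a second of pure overhead.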
--
Steven