preferring [] or () in list of error codes?
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Tue Jun 9 19:40:04 EDT 2009
On Tue, 09 Jun 2009 04:57:48 -0700, samwyse wrote:
> Time to test things! I'm going to compare three things using Python
> 3.0:
> X={...}\nS=lambda x: x in X
> S=lambda x: x in {...}
> S=lambda x: x in (...)
> where the ... is replaced by lists of integers of various lengths.
> Here's the test bed:
[snip]
Hmmm... I think your test-bed is unnecessarily complicated, making it
difficult to see what is going on. Here's my version, with lists included
for completeness. Each test prints the best of five trials of one million
repetitions of ten successful searches, then does the same thing again
for unsuccessful searches.
from timeit import Timer
def test(size):
global s, l, t, targets
print("Testing search with size %d" % size)
rng = range(size)
s, l, t = set(rng), list(rng), tuple(rng)
# Calculate a (more or less) evenly distributed set of ten
# targets to search for, including both end points.
targets = [i*size//9 for i in range(9)] + [size-1]
assert len(targets) == 10
setup = "from __main__ import targets, %s"
body = "for i in targets: i in %s"
# Run a series of successful searches.
for name in "s l t".split():
obj = globals()[name]
secs = min(Timer(body % name, setup % name).repeat(repeat=5))
print("Successful search in %s: %f s" % (type(obj), secs))
# Also run unsuccessful tests.
targets = [size+x for x in targets]
for name in "s l t".split():
obj = globals()[name]
secs = min(Timer(body % name, setup % name).repeat(repeat=5))
print("Unsuccessful search in %s: %f s" % (type(obj), secs))
Results are:
>>> test(1)
Testing search with size 1
Successful search in <class 'set'>: 1.949509 s
Successful search in <class 'list'>: 1.838387 s
Successful search in <class 'tuple'>: 1.876309 s
Unsuccessful search in <class 'set'>: 1.998207 s
Unsuccessful search in <class 'list'>: 2.148660 s
Unsuccessful search in <class 'tuple'>: 2.137041 s
>>>
>>>
>>> test(10)
Testing search with size 10
Successful search in <class 'set'>: 1.943664 s
Successful search in <class 'list'>: 3.659786 s
Successful search in <class 'tuple'>: 3.569164 s
Unsuccessful search in <class 'set'>: 1.935553 s
Unsuccessful search in <class 'list'>: 5.833665 s
Unsuccessful search in <class 'tuple'>: 5.573177 s
>>>
>>>
>>> test(100)
Testing search with size 100
Successful search in <class 'set'>: 1.907839 s
Successful search in <class 'list'>: 21.704032 s
Successful search in <class 'tuple'>: 21.391875 s
Unsuccessful search in <class 'set'>: 1.916241 s
Unsuccessful search in <class 'list'>: 41.178029 s
Unsuccessful search in <class 'tuple'>: 41.856226 s
>>>
>>>
>>> test(1000)
Testing search with size 1000
Successful search in <class 'set'>: 2.256150 s
Successful search in <class 'list'>: 189.991579 s
Successful search in <class 'tuple'>: 187.349630 s
Unsuccessful search in <class 'set'>: 1.869202 s
Unsuccessful search in <class 'list'>: 398.451284 s
Unsuccessful search in <class 'tuple'>: 388.544178 s
As expected, lists and tuples are equally as fast (or slow if you
prefer). Successful searches are about twice as fast as unsuccessful
ones, and performance suffers as the size of the list/tuple increases.
However, sets are nearly just as fast no matter the size of the set, or
whether the search is successfully or unsuccessful.
> You will note that testing against a list constant is just as fast as
> testing against a set. This was surprising for me; apparently the
> __contains__ operator turns a tuple into a set.
I doubt that very much.
--
Steven
More information about the Python-list
mailing list