preferring [] or () in list of error codes?

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Tue Jun 9 19:40:04 EDT 2009


On Tue, 09 Jun 2009 04:57:48 -0700, samwyse wrote:

> Time to test things!   I'm going to compare three things using Python
> 3.0:
>   X={...}\nS=lambda x: x in X
>   S=lambda x: x in {...}
>   S=lambda x: x in (...)
> where the ... is replaced by lists of integers of various lengths.
> Here's the test bed:

[snip]

Hmmm... I think your test-bed is unnecessarily complicated, making it 
difficult to see what is going on. Here's my version, with lists included 
for completeness. Each test prints the best of five trials of one million 
repetitions of ten successful searches, then does the same thing again 
for unsuccessful searches.


from timeit import Timer

def test(size):
    global s, l, t, targets
    print("Testing search with size %d" % size)
    rng = range(size)
    s, l, t = set(rng), list(rng), tuple(rng)
    # Calculate a (more or less) evenly distributed set of ten 
    # targets to search for, including both end points.
    targets = [i*size//9 for i in range(9)] + [size-1]
    assert len(targets) == 10
    setup = "from __main__ import targets, %s"
    body = "for i in targets: i in %s"
    # Run a series of successful searches.
    for name in "s l t".split():
        obj = globals()[name]
        secs = min(Timer(body % name, setup % name).repeat(repeat=5))
        print("Successful search in %s: %f s" % (type(obj), secs))
    # Also run unsuccessful tests.
    targets = [size+x for x in targets]
    for name in "s l t".split():
        obj = globals()[name]
        secs = min(Timer(body % name, setup % name).repeat(repeat=5))
        print("Unsuccessful search in %s: %f s" % (type(obj), secs))

Results are:

>>> test(1)
Testing search with size 1
Successful search in <class 'set'>: 1.949509 s
Successful search in <class 'list'>: 1.838387 s
Successful search in <class 'tuple'>: 1.876309 s
Unsuccessful search in <class 'set'>: 1.998207 s
Unsuccessful search in <class 'list'>: 2.148660 s
Unsuccessful search in <class 'tuple'>: 2.137041 s
>>>
>>>
>>> test(10)
Testing search with size 10
Successful search in <class 'set'>: 1.943664 s
Successful search in <class 'list'>: 3.659786 s
Successful search in <class 'tuple'>: 3.569164 s
Unsuccessful search in <class 'set'>: 1.935553 s
Unsuccessful search in <class 'list'>: 5.833665 s
Unsuccessful search in <class 'tuple'>: 5.573177 s
>>>
>>>
>>> test(100)
Testing search with size 100
Successful search in <class 'set'>: 1.907839 s
Successful search in <class 'list'>: 21.704032 s
Successful search in <class 'tuple'>: 21.391875 s
Unsuccessful search in <class 'set'>: 1.916241 s
Unsuccessful search in <class 'list'>: 41.178029 s
Unsuccessful search in <class 'tuple'>: 41.856226 s
>>>
>>>
>>> test(1000)
Testing search with size 1000
Successful search in <class 'set'>: 2.256150 s
Successful search in <class 'list'>: 189.991579 s
Successful search in <class 'tuple'>: 187.349630 s
Unsuccessful search in <class 'set'>: 1.869202 s
Unsuccessful search in <class 'list'>: 398.451284 s
Unsuccessful search in <class 'tuple'>: 388.544178 s




As expected, lists and tuples are equally as fast (or slow if you 
prefer). Successful searches are about twice as fast as unsuccessful 
ones, and performance suffers as the size of the list/tuple increases. 
However, sets are nearly just as fast no matter the size of the set, or 
whether the search is successfully or unsuccessful.



> You will note that testing against a list constant is just as fast as
> testing against a set.  This was surprising for me; apparently the
> __contains__ operator turns a tuple into a set.

I doubt that very much.


-- 
Steven



More information about the Python-list mailing list