frozenset question

Steven D'Aprano steve at REMOVETHIScyber.com.au
Wed Jul 6 14:48:12 CEST 2005


On Wed, 06 Jul 2005 11:30:14 +0100, Will McGugan wrote:
 
> I was wondering if frozenset was faster or more efficient in some way. 
> 
> Thinking back to the dark ages of C++, you could optimize things that
> you knew to be constant.

Why would you want to?

py> import sets
py> import time
py> bigset = sets.Set(range(500000))
py> bigimmutableset = sets.ImmutableSet(range(500000))
py> assert len(bigset) == len(bigimmutableset)
py> 
py> def tester(S, L):
...     """Test if items from L are in S, and time the process."""
...     t = time.time()
...     for i in range(100):
...         for item in L:
...             item in S
...     return time.time() - t  # time returned is for 100 loops
...
py>

Time some successful tests:

py> tester(bigset, range(100, 500))
0.11539506912231445
py> tester(bigimmutableset, range(100, 500))
0.12014198303222656

Practically no difference when doing 100*400 checks of whether an integer
is in the set. But let's try again, just in case:

py> tester(bigset, range(100, 500))
0.10998892784118652
py> tester(bigimmutableset, range(100, 500))
0.11114096641540527

The difference is insignificant. How about unsuccessful checks?

py> tester(bigset, range(-100, -500, -1))
0.12070298194885254
py> tester(bigset, range(-100, -500, -1))
0.11681413650512695
py> tester(bigimmutableset, range(-100, -500, -1))
0.11313891410827637
py> tester(bigimmutableset, range(-100, -500, -1))
0.11315703392028809

There is no significant speed difference between immutable and mutable
sets, at least for queries. Whether the lookup succeeds or fails, and
whether the set is mutable or immutable, each run takes about 0.11 second
for 100*400 tests, or roughly 0.000003 second per test of item in set
(loop overhead included). Why would you need to optimize that?
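
For anyone rerunning the comparison on a current interpreter, here is a
minimal sketch of the same test. It assumes Python 3, where the built-in
set and frozenset stand in for sets.Set and sets.ImmutableSet, and it uses
timeit rather than time.time(). The helper name time_membership and the
repeat counts are illustrative choices, not part of the session above, and
the figures it prints will differ from machine to machine.

```python
import timeit

N = 500_000
big_set = set(range(N))
big_frozenset = frozenset(range(N))


def time_membership(container, items, repeat=5, number=100):
    """Return the best observed time in seconds per `item in container` test."""
    def run():
        for item in items:
            item in container  # result discarded; only the lookup cost matters

    # timeit.repeat returns one total per repeat; the minimum is the least noisy.
    best = min(timeit.repeat(run, repeat=repeat, number=number))
    return best / (number * len(items))


hits = list(range(100, 500))          # values present in both sets
misses = list(range(-100, -500, -1))  # values absent from both sets

results = {}
for label, container in (("set", big_set), ("frozenset", big_frozenset)):
    for kind, items in (("hit", hits), ("miss", misses)):
        per_lookup = time_membership(container, items)
        results[(label, kind)] = per_lookup
        print(f"{label:9s} {kind:4s} {per_lookup:.3e} s per lookup")
```

Taking the minimum over several repeats filters out most of the noise
from other processes, which is why two single runs in a row can disagree
in the third decimal place.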

If you tell us what you are trying to do, and under what circumstances it
is too slow, we'll see if we can suggest some ways to optimize it.

But if you are just trying to optimize for the sake of optimization,
that's a terrible idea. Get your program working first. Then when it
works, measure how fast it runs. If, and ONLY if, it is too slow, 
identify the parts of the program that make it too slow. That means
profiling and timing. Then optimize those parts, and nothing else.

Otherwise, you will be like the car designer trying to speed up his sports
cars by making the seatbelts aerodynamic.
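
The "profile first" step described above can be sketched with the standard
library's cProfile and pstats modules. This assumes Python 3, and the
functions slow_part, fast_part and main are hypothetical stand-ins for a
real program, invented only to give the profiler something to measure.

```python
import cProfile
import io
import pstats


def slow_part(n):
    # Hypothetical hot spot: membership tests against a list are O(n) each.
    data = list(range(n))
    return sum(1 for i in range(n) if i in data)


def fast_part(n):
    # Hypothetical cheap step: the same tests against a set.
    data = set(range(n))
    return sum(1 for i in range(n) if i in data)


def main():
    return slow_part(2000) + fast_part(2000)


# Profile one run of the whole program.
profiler = cProfile.Profile()
profiler.enable()
total = main()
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

The report lists where the time actually goes, so only the functions near
the top are worth touching.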


-- 
Steven.




