clarification
samwyse
dejanews at email.com
Sun Aug 19 08:03:39 EDT 2007
Alex Martelli wrote:
> Of course, hoisting the unbound method out of the loops can afford the
> usual small optimization. But my point is that, in Python, these
> operations (like, say, the concatenation of a sequence of lists, etc)
> are best performed "in place" via loops calling mutator methods such as
> update and intersection_update (or a list's extend, etc), rather than
> "functionally" (building and tossing away many intermediate results).
> E.g., consider:
>
> brain:~ alex$ python -mtimeit -s'sos=[set(range(x,x+4)) for x in
> range(0, 100, 3)]' 'r=set()' 'for x in sos: r.update(x)'
> 100000 loops, best of 3: 18.8 usec per loop
>
> brain:~ alex$ python -mtimeit -s'sos=[set(range(x,x+4)) for x in
> range(0, 100, 3)]' 'r=reduce(set.union, sos, set())'
> 10000 loops, best of 3: 87.2 usec per loop
>
> Even in a case as tiny as this one, "reduce" is taking over 4 times
> longer than the loop with the in-place mutator -- and it only gets
> worse, as we're talking about O(N squared) vs O(N) performance! Indeed,
> this is part of what makes reduce an "attractive nuisance"...;-). [[And
> so is sum, if used OTHERWISE than for the documented purpose, computing
> "the sum of a sequence of numbers": a loop with r.extend is similarly
> faster, to concatenate a sequence of lists, when compared to sum(sol,
> [])...!!!]]
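For the record, Alex's extend-vs-sum claim for lists can be checked in
the same timeit style; here's a minimal sketch (the list-of-lists setup
'sol' is merely illustrative, mirroring his sets-of-sets):

from timeit import Timer

# Build a list of small lists to concatenate.
setup = "sol = [range(x, x+4) for x in range(0, 100, 3)]"

loop_stmt = """
r = []
for x in sol:
    r.extend(x)    # grow one result list in place
"""
sum_stmt = "r = sum(sol, [])"    # builds a new list at every step

for stmt in (loop_stmt, sum_stmt):
    usec = min(Timer(stmt, setup).repeat(3, 10000)) * 1e6 / 10000
    print repr(stmt), "%.3g usec per loop" % usec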
Curiously, when I attempt the actual problem at hand (set intersection)
rather than something altogether different (set union) on my Dell laptop
running Win2K Pro, I get the following results:
stmt = 'r=reduce(set.intersection, los)'
best of 3: 21.9 usec per loop
stmt = 'r=intersect_all(los)'
best of 3: 29.3 usec per loop
(intersect_all is the loop-based helper defined in the P.S. below.)
So, the "nuisance" is more attractive than you thought. Apparently, one
can't really depend on "in place" mutator methods being more efficient
than functional techniques. BTW, the above figures reflect 10,000
iterations; using larger counts makes the difference even larger. I
suspect that this is because the Python interpreter has fewer chances to
be interrupted by external processes when it's running 'reduce'. This in
turn suggests that both implementations do about the same amount of work,
and that your "O(N squared)" argument is irrelevant here.
P.S. Here's the source I used for the timings:
from timeit import Timer
setup = """
def intersect_all(los):
    # Seed the result with the first set, then shrink it in place.
    it = iter(los)
    result = set(it.next())
    for s in it:
        result.intersection_update(s)
    return result
not7=range(2,11)
not7.remove(7)
los=[set(range(0,360,step)) for step in not7]
"""
number = 10000
repeat = 3
precision = 3
for stmt in 'r=reduce(set.intersection, los)', 'r=intersect_all(los)':
    t = Timer(stmt, setup)
    r = t.repeat(repeat, number)
    best = min(r)
    usec = best * 1e6 / number
    print "stmt =", repr(stmt)
    print "best of %d: %.*g usec per loop" % (repeat, precision, usec)