[Python-Dev] Fwd: summing a bunch of numbers (or "whatevers")

Tim Peters tim.one@comcast.net
Tue, 22 Apr 2003 23:50:58 -0400


[Alex Martelli]
> Hmmm, I think I must be missing something here.  Surely in many
> application cases a loop exploiting short-circuiting behavior will have
> better expected performance than anything that's going all the way
> through the sequence no matter what?

No, you're only missing that I seem rarely to have apps where it actually
matters.

> Far greater variance, sure, and if the probability of true items gets
> extreme enough then the gain from short-circuiting will evaporate,

Or, more likely, become a pessimization (liability).

> but...:
>
> [alex@lancelot src]$ ./python Lib/timeit.py -s'seq=[i%2 for i in
> range(9999)]'
> -s'''
> > def any(x):
> >   for xx in x:
> >     if xx: return True
> >   return False
> > ''' 'any(seq)'
> 1000000 loops, best of 3: 1.42 usec per loop
>
> [alex@lancelot src]$ ./python Lib/timeit.py -s'seq=[i%2 for i in
> range(9999)]'
> -s'''
> def any(x):
>   return bool(filter(None,x))
> ''' 'any(seq)'
> 1000 loops, best of 3: 679 usec per loop
>
> ...i.e., despite filter's amazing performance, looping over 10k
> items still takes a bit more than shortcircuiting out at once;-).

It's only because Guido sped up loops for 2.3 <wink>.

> If Python ever gains such C-coded functions as any, all, etc (hopefully
> in some library module, not in builtins!) I do hope and imagine they'd
> short-circuit, of course.  BTW, I think any should return the first
> true item (or the last one if all false, or False for an empty sequence)
> and all should return the first false item (or the last one if all true,
> or True for an empty seq) by analogy with the behavior of operators
> and/or.

I agree that getting the first witness (for "any") or counterexample (for
"all") can be useful.  I'm not sure I care what it returns if all are false
for "any", or all true for "all".  If I don't care, they're easy to write
with itertools now:

"""
import itertools

def all(seq):
    for x in itertools.ifilterfalse(None, seq):
        return x # return first false value
    return True

def any(seq):
    for x in itertools.ifilter(None, seq):
        return x # return first true value
    return False

print all([1, 2, 3]) # True
print all([1, 2, 3, 0, 4, 5]) # 0, the first counterexample

print any([0, 0, 0, 0]) # False
print any([0, 42, 0, 0]) # 42, the first witness
"""

I liked ABC's quantified boolean expressions:

    SOME x IN collection HAS bool_expression_presumably_referencing_x
    EACH x IN collection HAS bool_expression_presumably_referencing_x
    NO x IN collection HAS bool_expression_presumably_referencing_x

The first left x bound to the first witness when true.  ABC didn't have
boolean data values -- these expressions could only be used in control-flow
statements (like IF).  x was then a block-local binding in the block
controlled by the truth of the expression, so there was no question about
what to do with x when the expression was false (you didn't enter the block
then, so couldn't reference the block-local x).

The second and third left x bound to the first counterexample when the
expression was false, and in those cases x was local to the ELSE clause.

I viewed that as finessing around a question that shouldn't be asked, via
the simple expedient of making the question unaskable <wink>.  The exact
rules were pretty complicated, though.