True inconsistency in Python

Alex Martelli aleax at aleax.it
Fri Nov 14 11:14:55 EST 2003


Gerrit Holl wrote:

> Alex Martelli wrote:
> (about x vs. bool(x))
>> [[ we TOLD Guido people would start on this absurd path, when
>>    he added bool, True, and False, but he wouldn't listen...
>> ]]
> 
> What do double square brackets mean?

"Inline footnote" or the like;-).  I do like to pep up my posts
despite the limitations of plain ascii...


> (btw, I _do_ use bool occasionally: I implemented an anytrue() function
>  using sum([bool(i) for i in L]), is this wrong?)

Not wrong, just potentially a bit slow for very large L's, but for
your applications you may not care.  Compare:


[alex at lancelot tmp]$ timeit.py -c -s'xs=[0]*999' -s'
> def anytrue_sc(L):
>     for i in L:
>         if i: return True
>     else:
>         return False
> def anytrue_ns(L):
>     return sum([ bool(i) for i in L ])
> ' 'anytrue_sc(xs)'
1000 loops, best of 3: 200 usec per loop

and the same with 'anytrue_ns(xs)' instead,

1000 loops, best of 3: 1.26e+03 usec per loop

while, also, with 'anytrue_sc([1]+xs)':

10000 loops, best of 3: 22 usec per loop

while with 'anytrue_ns([1]+xs)':

1000 loops, best of 3: 1.29e+03 usec per loop


So, on a 1000-long list, you may be slowing things down by 6 to
60 times, depending on where (if anywhere) the first true item
of L might be, by using the "clever" bool rather than the plain
good old loop-with-shortcircuit.
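
To see where that gap comes from, here is a rough sketch that counts truth tests instead of timing them. The Counting wrapper and the count_tests helper are hypothetical names made up for this illustration, and the code is written for Python 3 (with the Python 2 spelling of the truth hook kept as an alias):

```python
class Counting:
    """Hypothetical helper: wraps a value and tallies every truth test."""
    tests = 0

    def __init__(self, value):
        self.value = value

    def __bool__(self):
        Counting.tests += 1
        return bool(self.value)

    __nonzero__ = __bool__  # Python 2 name for the same hook


def anytrue_sc(L):
    # Short-circuit: stop at the first true item.
    for i in L:
        if i:
            return True
    return False


def anytrue_ns(L):
    # No short-circuit: bool() is applied to every item before summing.
    return sum([bool(i) for i in L])


def count_tests(func, items):
    """Hypothetical helper: how many truth tests one call performs."""
    Counting.tests = 0
    func(items)
    return Counting.tests


# One true item up front, followed by 999 false ones.
xs = [Counting(1)] + [Counting(0) for _ in range(999)]
sc_tests = count_tests(anytrue_sc, xs)
ns_tests = count_tests(anytrue_ns, xs)
print(sc_tests, ns_tests)
```

The loop version tests a single item here, while the sum version always tests all 1000, wherever the first true item sits.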

Now, for some people "one-liners" (which you could write the _ns
version as; the _sc version can't be less than 4 lines) are much
more important than speed or elementary-clarity.  For those
people, I suggest an approach that's only a BIT slower than the
simplest and most obvious one:

[alex at lancelot tmp]$ timeit.py -c -s'xs=[0]*999' -s'import itertools as it
> import operator
> def anytrue_it(L):
>   return list(it.islice(it.dropwhile(operator.not_, L),1))
> ' 'anytrue_it(xs)'
1000 loops, best of 3: 310 usec per loop

and with 'anytrue_it([1]+xs)', 28 usec per loop.  Now, you have
bounded your performance loss to roughly 30-55% when compared to the
simplest approach, and one can of course pull out all the stops...:

[alex at lancelot tmp]$ timeit.py -c -s'xs=[0]*999' -s'import itertools as it
> import operator
> def anytrue_it(L, slice=it.islice, drop=it.dropwhile, not_=operator.not_):
>   return list(slice(drop(not_, L),1))
> ' 'anytrue_it(xs)'
1000 loops, best of 3: 270 usec per loop

to reach a 35%-or-so max loss of performance in this case.
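
The trick in that last version is the default-argument binding: the defaults are evaluated once, when the def statement runs, so inside the function the three callables are plain local names rather than attribute lookups on a module. A minimal sketch, assuming Python 3 (where the stored defaults are exposed as `__defaults__`):

```python
import itertools as it
import operator


def anytrue_it(L, slice=it.islice, drop=it.dropwhile, not_=operator.not_):
    # slice, drop and not_ are locals here, bound when the def ran.
    return list(slice(drop(not_, L), 1))


# The callables captured at definition time.
bound = anytrue_it.__defaults__
print(bound)
```

The price is that the extra parameters become part of the signature, so a caller could override them by accident.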

Plus, it's much cooler for the purpose of making your friends' heads
spin with admiration for your wizardry, of course;-).
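
One thing to keep in mind: anytrue_it returns a list, not a bool. You get an empty list when nothing is true, or a one-item list holding the first true item, which is fine in an if but worth knowing about. A small sketch of that behaviour, assuming Python 3 (the names hit, miss, lazy, first and rest are just for this illustration):

```python
import itertools as it
import operator


def anytrue_it(L):
    return list(it.islice(it.dropwhile(operator.not_, L), 1))


hit = anytrue_it([0, '', 7, 0, 9])   # one-item list with the first true item
miss = anytrue_it([0, '', None])     # empty list, which is false in an if

# On an iterator, only the items up to the first true one are consumed.
lazy = iter([0, 3, 0, 5])
first = anytrue_it(lazy)
rest = list(lazy)

print(hit, miss, first, rest)
```

Wrap the call in bool() if a real True or False is needed.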


Alex




