[Tutor] Aschenputtel problem - Shorter is not better?
Pierre Barbier de Reuille
pierre.barbier at cirad.fr
Fri Sep 16 19:17:22 CEST 2005
Well, I have some comments ^_^
First, about the function 6, it is exactly equivalent (and a little bit
quicker) to use:
===8<======8<======8<======8<======8<======8<======8<======8<======8<===
def cinderella6(original_list, condition):
"""Using while and list popping."""
list1 = []
i = 0
try:
while True:
if not condition(original_list[i]):
list1.append(original_list.pop(i))
else:
i += 1
except IndexError:
pass
return original_list, list1
===8<======8<======8<======8<======8<======8<======8<======8<======8<===
That is, put the try block outside the loop ... however, if this method
is the quickest it is the only destructive one !
Then, you shouldn't time the creation of the list together with
algorithm. Thus the main loop should be:
===8<======8<======8<======8<======8<======8<======8<======8<======8<===
def test(list_length=100, num_func_calls=1000):
"""Profile different implementations of cinderella idiom."""
global olist, rlist
olist = range( list_length )
rlist = array( olist )
l = []
for i in range(1,10):
func = "cinderella%i" % i
if i == 8:
stmt = """%s(rlist, check)""" % (func,)
else:
stmt = """%s(olist, check)""" % (func,)
timer = timeit.Timer(stmt,"from __main__ import %s, check,
array, compress, olist, rlist" % func)
l.append(timer.repeat(3, int(num_func_calls)))
print "%s: (1) %f, (2) %f, (3) %f" % (func, l[-1][0], l[-1][1],
l[-1][2])
===8<======8<======8<======8<======8<======8<======8<======8<======8<===
Well, I added the creation of a Numeric array ... as the function 8 is
supposed to work with it. So here are the corrected versions of the
functions 7 et 8 :
===8<======8<======8<======8<======8<======8<======8<======8<======8<===
def cinderella7(original_list, condition):
"""Using Numeric and array compression."""
selection = array( [condition(i) for i in original_list] )
list1 = compress(selection, original_list)
list2 = compress(~selection, original_list)
return list1, list2
def cinderella8(original_list, condition):
"""Using Numeric and array compression."""
selection = condition( original_list )
list1 = compress(selection, original_list)
list2 = compress(~selection, original_list)
return list1, list2
===8<======8<======8<======8<======8<======8<======8<======8<======8<===
In the end, Numeric rocks :) Solution 8 is even faster than solution 6 !
List length: 100, Function calls: 3 x 1000
cinderella1: (1) 0.228345, (2) 0.356115, (3) 0.229700
cinderella2: (1) 0.306550, (2) 0.308528, (3) 0.312639
cinderella3: (1) 0.469183, (2) 0.471858, (3) 0.474165
cinderella4: (1) 0.311316, (2) 0.327631, (3) 0.449626
cinderella5: (1) 0.309293, (2) 0.305037, (3) 0.293698
cinderella6: (1) 0.096513, (2) 0.095803, (3) 0.096810
cinderella7: (1) 0.661629, (2) 0.786803, (3) 0.673522
cinderella8: (1) 0.197300, (2) 0.198183, (3) 0.198790
List length: 1000, Function calls: 3 x 1000
cinderella1: (1) 2.248727, (2) 2.267243, (3) 2.301675
cinderella2: (1) 3.023581, (2) 3.046691, (3) 3.076899
cinderella3: (1) 29.806438, (2) 28.742620, (3) 31.848374
cinderella4: (1) 3.037407, (2) 3.162915, (3) 3.227520
cinderella5: (1) 4.260166, (2) 4.831273, (3) 3.991498
cinderella6: (1) 0.989976, (2) 1.346485, (3) 0.938494
cinderella7: (1) 5.095143, (2) 6.546663, (3) 6.203648
cinderella8: (1) 0.686484, (2) 0.686852, (3) 0.688778
List length: 10000, Function calls: 3 x 1000
cinderella6: (1) 8.399953, (2) 8.712989, (3) 8.998076
cinderella8: (1) 6.499856, (2) 6.388803, (3) 6.108556
Pierre
Christopher Arndt a écrit :
> Terry Kemmerer schrieb:
>
>>
>>"Bearing in mind that shorter is not necessarily better..."
>>
>>Wouldn't shorter, as a rule of thumb, as in less language statements, mean fewer
>>executions, and therefore faster?
>
>
> Well, see for yourself...
>
> I wrote a little benchmarking script for different solutions to my problem. It
> turns out, that the original "traditional for-loop" approach is not only the
> most readable but also one of the fastest.
>
> Here are my timings (Python 2.4, Athlon64 1.8 Ghz, Ubuntu Hoary) (see the
> attached script for the functions names):
>
> List length: 100, Function calls: 3 x 1000
>
> cinderella1: (1) 0.188769, (2) 0.189790, (3) 0.188901
> cinderella2: (1) 0.263702, (2) 0.254624, (3) 0.254050
> cinderella3: (1) 0.430694, (2) 0.270807, (3) 0.265921
> cinderella4: (1) 0.154761, (2) 0.142940, (3) 0.139610
> cinderella5: (1) 0.145273, (2) 0.139491, (3) 0.133930
> cinderella6: (1) 0.144191, (2) 0.139056, (3) 0.141840
> cinderella7: (1) 0.737320, (2) 0.726961, (3) 0.732526
>
> List length: 1000, Function calls: 3 x 1000
>
> cinderella1: (1) 1.639245, (2) 0.975745, (3) 0.971775
> cinderella2: (1) 1.337324, (2) 1.330888, (3) 1.319431
> cinderella3: (1) 18.772992, (2) 18.764382, (3) 18.759283
> cinderella4: (1) 1.140375, (2) 1.120827, (3) 1.119281
> cinderella5: (1) 1.460923, (2) 1.444996, (3) 1.444842
> cinderella6: (1) 1.481483, (2) 1.469315, (3) 1.478990
> cinderella7: (1) 7.211108, (2) 7.208998, (3) 7.202029
>
> "cinderella1" ist the traditional approach, "cinderella3" builds up a list of
> positives and then checks that against the original list by using the "in"
> operator. This scales very badly for longer lists, as you can see in the second
> timing.
>
> "cinderella7" ist the Numeric approach proposed by Pierre Barbier de Reuille.
> It doesn't compare very well, probably because it has to iterate 3 times over
> the original list and once more over the list of positives.
>
> The solution using sets ("cinderella4") seems to do very well too, but has
> several limitations:
>
> - does not preserver order of elements
> - elements must be unique
> - only available from Python 2.3
>
> I leave it as an axcercise to the reader to try how the different soltutions
> behave when condition(item) is more expansive (i.e takes longer).
>
> Chris
>
>
--
Pierre Barbier de Reuille
INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP
Botanique et Bio-informatique de l'Architecture des Plantes
TA40/PSII, Boulevard de la Lironde
34398 MONTPELLIER CEDEX 5, France
tel : (33) 4 67 61 65 77 fax : (33) 4 67 61 56 68
More information about the Tutor
mailing list