Just wondering

Mon May 18 12:51:47 EDT 2009

Dave Angel wrote:
> norseman wrote:
>> <div class="moz-text-flowed" style="font-family: -moz-fixed">Marco 
>> Mariani wrote:
>>> Gediminas Kregzde wrote:
>>>
>>>
>>>> def doit(i):
>>>>    pass
>>>>
>>>> def main():
>>>>    a = [0] * 10000000
>>>>    t = time()
>>>>    map(doit, a)
>>>>    print "map time: " + str(time() - t)
>>>
>>> Here you are calling a function ten million times, build a list with 
>>> of ten million None results, then throw it away.
>>>
>>>
>>>> def main2():
>>>>    t = time()
>>>>    a = [0] * 10000000
>>>>    for i in a:
>>>>        pass
>>>>    print "loop time: " + str(time() - t)
>>>
>>> Here you do nothing but iterating 'i' over the 'a' list.
>>>
>>>
>>>> main()  # takes approximately 5x times longer than main2()
>>>> main2()
>>>>
>>>> I'm wondering were is catch?
>>>
>>> Function calls are not free in python. They cost a lot more than they 
>>> do in C, Delphi or other languages.
>>>
>>
>> Gediminas - You did a good job of testing the overhead.  All good 
>> programmers do that!
>>
>> Steve
>>
>>
> If the task is to test the overhead, then the two loops need to be 
> equivalent.

no they don't. see below

   The OP's conclusion, confirmed by many of you, is that
> map() is lots slower than for.  

is it? see below

But if you add in the call overhead in
> the for loop, they come out within 2 percent.  If you just want to 
> compare map() time to for time, use functions:
> 
> from time import time
> def doit(i):
>   pass
> def main():
>   a = [0] * 10000000
>   t = time()
>   map(doit, a)
>   print "map time: " + str(time() - t)
> 
> def main2():
>   a = [0] * 10000000
>   t = time()
>   for i in a:
>       doit(i)
>   print "loop time: " + str(time() - t)
> 
> 
> main()
> main2()
> 
> gives:
> map time: 4.875
> loop time: 4.79600000381
> 
> 
================================================
Why do tests have to be equal?  If what you are after is to test several 
things just to see what they carry for overhead just do so.  Whether or 
not they are "equal" is a matter of choice.  Doing the map() implies it 
will store something. Doing the loop tells what it takes to loop. THEN, 
by adding your code to each to balance things out and running for time 
will tell you things.  Maybe map() will win and maybe it will not for 
certain objectives. Maybe loop will win under certain conditions. 
Without a baseline and testing - any statement as to which is better is 
simply unrealistic nonsense.  The 'internals', algorithms, can often be 
optimized after inspection to obtain better results.  Whether or not 
these optimizations will interfere with a another function (in this case 
map() ) remains to be tested.  Just because the minimal overhead on A is 
less than B does not mean that A is the better choice for a given situation.

By the way, since any loop action actually needs some sort of counter, 
be it the here to end of file or a numeric countdown, I put my timer 
before def main? and check the whole test of minimal parts.  People who 
write compilers and/or interpreters often do strange things with 
optimization concepts.

Dave's' times point out that loop has a good chance to win if the rest 
of that sections code were present. By that I mean the actual function 
the loop or map() were to preform in a given setting.  It shows the 
function call is expensive.  Thus in-line code can be a better choice 
when speed is needed. At least in Python and given the results of both 
tests of loop above.

Also depicted is that map() requires the expensive function call and 
loop does not in order to process something.

the def doit(i):
   pass
should result in something like:

.
.
push data_address in R2
call address_of_doit
.
.

address_of_doit
pop data_address to R1              #see note 1
return                              #see note 2

note 1:  not all OS/architectures  implement push/pop
          whatever the store/call  retrieve/(store)/return  may be,
          it will be the same throughout for that OS/machine.

note 2:  a call to a dummy will show the efficiency of calling.
          normally - its overhead is quite minimal.  As another stated
          above - such things are indeed expensive in Python.

All this generated by someone posting a simple overhead test.  WOW.
I guess I had a better teacher than I thought. :)

IMHO - Dave's loop test should have been included in the original 
posting. Making three tests all together at that time.  But then we all 
had different teachers.  Which is why the question came up.

Steve