Threading/map question

Florian Schulze florian.proff.schulze at gmx.net
Thu Dec 5 15:45:19 EST 2002


On Thu, 5 Dec 2002 15:45:54 GMT Michael Hudson <mwh at python.net> wrote:

> anandpillai6 at yahoo.com (Anand) writes:
> 
> > A question about function performance.
> > 
> > I have a list of integers on which an operation is to be performed,
> > say an addition or subtraction. Since the list is huge (len >= 100000),
> > I can do it in two ways (I thought)
> 
> OK.
> 
> > I found that the thread solution takes lesser time than the map
> > solution. Can this be a general approach for all data structures in
> > python or does it apply only to lists.
> 
> Asking if threading or map gives better performance seems to me like
> asking whether mashed potatoes or tofu are better at keeping one dry.
> 
> However, 
> 
>     ldataout = [None]*len(ldatain)
>     for i in range(len(ldatain)):
>         ldataout[i] = 255 - ldatain[i]
> 
> is likely to be quicker than
> 
>     ldataout = map(lambda x: 255 - x, ldatain)
> 
> because you don't have to take a trip through the function call
> machinery for each element of the list in the first case.
> 
> Cheers,
> M.
> 
> -- 
>   The ultimate laziness is not using Perl.  That saves you so much
>   work you wouldn't believe it if you had never tried it.
>                                         -- Erik Naggum, comp.lang.lisp


import Numeric

def use_list(ldatain):
    ldataout = [None]*len(ldatain)
    for i in range(len(ldatain)):
        ldataout[i] = 255 - ldatain[i]
    return ldataout

def use_map(ldatain):
    ldataout = map(lambda x: 255 - x, ldatain)
    return ldataout

def use_numeric1(ldatain):
    ldataout = 255 - ldatain
    return ldataout

def use_numeric2(ldatain):
    ldataout = 255 - Numeric.asarray(ldatain)
    return ldataout

def t():
    datalen = 100000
    use_list(range(datalen))
    use_map(range(datalen))
    use_numeric1(Numeric.arange(datalen))
    use_numeric2(range(datalen))

if __name__ == '__main__':
    import profile
    profile.run("t()")

-+-+-+-+-+-+-

          100008 function calls in 7.741 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    7.740    7.740 <string>:1(?)
        1    0.103    0.103    0.103    0.103 Numeric.py:129(asarray)
   100000    3.146    0.000    3.146    0.000 map_vs_array.py:10(<lambda>)
        1    0.008    0.008    0.008    0.008 map_vs_array.py:13(use_numeric1)
        1    0.006    0.006    0.109    0.109 map_vs_array.py:17(use_numeric2)
        1    0.150    0.150    7.740    7.740 map_vs_array.py:21(t)
        1    0.292    0.292    0.292    0.292 map_vs_array.py:3(use_list)
        1    4.035    4.035    7.181    7.181 map_vs_array.py:9(use_map)
        0    0.000             0.000          profile:0(profiler)
        1    0.001    0.001    7.741    7.741 profile:0(t())

-+-+-+-+-+-+-
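
One caveat about these numbers: profile hooks every Python-level function
call, so the 100000 lambda calls in use_map carry instrumentation overhead
that the loop in use_list does not. A rough sketch of a fairer comparison
with timeit follows. It assumes Python 3 (hence the list() around map), and
best_time is a helper name made up for this example, not part of the
original script.

```python
import timeit

def use_list(ldatain):
    # Index-based loop, as in the script above.
    ldataout = [None] * len(ldatain)
    for i in range(len(ldatain)):
        ldataout[i] = 255 - ldatain[i]
    return ldataout

def use_map(ldatain):
    # On Python 3, map returns a lazy iterator, so list() forces the work.
    return list(map(lambda x: 255 - x, ldatain))

def best_time(func, data, repeat=3):
    # Hypothetical helper: best wall-clock time of `repeat` single runs,
    # measured without any profiler hooks.
    return min(timeit.repeat(lambda: func(data), number=1, repeat=repeat))

if __name__ == '__main__':
    data = list(range(100000))
    for func in (use_list, use_map):
        print("%-10s %.4f s" % (func.__name__, best_time(func, data)))
```

The absolute figures depend on the machine and the interpreter version, so
only the ratio between the two lines is worth comparing.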

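Numeric is the clear winner: use_numeric1 takes 0.008 s, against 0.292 s for
the plain loop and 7.181 s for map. Even if you convert the list to a Numeric
array first (use_numeric2, 0.109 s in total), it's still faster. And map
really is slower in this case. I wonder if map could be sped up with some
builtin operator instead of the lambda, if that's possible at all.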
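
A minimal sketch of the builtin-operator idea, assuming Python 3: operator.sub
is implemented in C, and itertools.repeat(255) supplies the constant left
operand, so map never calls a Python-level function. The names
use_map_lambda and use_map_operator are made up for this example. Whether it
actually beats the plain loop would have to be measured.

```python
import operator
from itertools import repeat

def use_map_lambda(ldatain):
    # Baseline: one Python-level function call per element.
    return list(map(lambda x: 255 - x, ldatain))

def use_map_operator(ldatain):
    # operator.sub(255, x) computes 255 - x without a Python-level call.
    # On Python 3, map stops when the shortest input (ldatain) runs out,
    # so the endless repeat(255) is safe here.
    return list(map(operator.sub, repeat(255), ldatain))

if __name__ == '__main__':
    data = list(range(10))
    print(use_map_operator(data) == use_map_lambda(data))
```

This would not work unchanged on Python 2, where map does not stop at the
shortest input and the endless repeat(255) would never be exhausted.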

Regards,
Florian
