map is useless!

Sun Jun 6 21:45:41 EDT 2010

On 6/6/2010 7:20 PM, Steven D'Aprano wrote:
> On Sun, 06 Jun 2010 08:16:02 -0700, rantingrick wrote:
>
>> Everyone knows i'm a Python fanboy so nobody can call me a troll for
>> this...
>
> The first rule of trolling is, always deny being a troll, no matter how
> obvious the trolling.

Such as the exaggerated-claim subject that ends with an exclamation!

> But on the chance I'm wrong, and for the benefit of
> others, your tests don't measure what you think they are measuring and
> consequently your results are invalid. Read on.

+1 on the rest. Thanks for posting it. I have nothing more to add.

Terry Jan Reedy

>> Python map is just completely useless. For one, it is so damn slow, why even
>> bother putting it in the language? And secondly, the total "girl-man"
>> weakness of lambda renders it completely moot!
>
> Four trolls in three sentences. Way to go "fanboy".
>
> (1) "Completely" useless? It can't do *anything*?
>
> (2) Slow compared to what?
>
> (3) Are you implying that map relies on lambda?
>
> (4) What's wrong with lambda anyway?
>
> By the way, nice sexist description there. "Girl-man weakness" indeed.
> Does your mum know that you are so contemptuous about females?
>
>
>
>> Ruby has a very nice map
>
> I'm thrilled for them. Personally I think the syntax is horrible.
>
>
>>>>> [1,2,3].map{|x| x.to_s}
>>
>> Have not done any benchmarking
>
> "... but by counting under my breath while the code runs, I'm POSITIVE
> Ruby is much faster than Python!"
>
> By complaining about Python being too slow while admitting that you
> haven't actually tested the speed of your preferred alternative, you have
> *negative* credibility.
>
>
>> but far more useful from the programmer's
>> POV. And that really stinks because map is such a useful tool it's a
>> shame to waste it. Here are some tests to back up the rant.
>>
>>
>>>>> import time
>>>>> def test1():
>> 	l = range(10000)
>> 	t1 = time.time()
>> 	map(lambda x:x+1, l)
>> 	t2= time.time()
>> 	print t2-t1
>
> That's a crappy test.
>
> (1) You include the cost of building a new function each time.
>
> (2) You make no attempt to protect against the inevitable variation in
> speed caused by external processes running on a modern multi-process
> operating system.
>
> (3) You are reinventing the wheel (badly) instead of using the timeit
> module.
>
>
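
The three points above can be sketched roughly as follows. This is only an illustration, assuming Python 3 (where map returns a lazy iterator, hence the list() call; on the Python 2.6 used in this thread, map already returns a list). The names setup, stmt, timings and best are chosen here for the example and do not come from the quoted post.

```python
import timeit

# The setup string runs once per repeat and is not timed, so building
# the lambda and the input list does not count toward the measurement.
setup = "f = lambda x: x + 1; L = list(range(10000))"

# list() forces the lazy Python 3 map to actually produce every item.
stmt = "list(map(f, L))"

# Repeat the whole measurement and keep the minimum, which is the run
# least disturbed by other processes on the machine.
timings = timeit.repeat(stmt, setup=setup, repeat=3, number=100)
best = min(timings)

print("best of 3: %.6f s for 100 calls" % best)
```

The repeat and number values are arbitrary and only keep the run short.
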
>>>>> def test2():
>> 	l = range(10000)
>> 	t1 = time.time()
>> 	for x in l:
>> 		x + 1
>> 	t2 = time.time()
>> 	print t2-t1
>
> The most obvious difference is that in test1, you build a 10,000 item
> list, while in test2, you don't. And sure enough, not building a list is
> faster than building a list:
>
>>>>> test1()
>> 0.00200009346008
>>>>> test2()
>> 0.000999927520752
>
>
>
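
To make the difference between test1 and test2 concrete, here is a small sketch, again assuming Python 3. The helper add_one and the variable names are made up for the example. It shows that a loop only does the same work as map once it also builds the result list.

```python
def add_one(x):
    return x + 1

L = list(range(10000))

# What test1 does: produce a new 10,000-item list.
# (list() is only needed on Python 3, where map is lazy.)
mapped = list(map(add_one, L))

# What test2 does: compute each value and throw it away.
for x in L:
    x + 1

# A loop that does the same work as map must also build the list.
accum = []
for x in L:
    accum.append(add_one(x))

print(accum == mapped)
```

Only the last loop is a fair candidate to time against map.
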
>>>>> def test3():
>> 	l = range(10000)
>> 	t1 = time.time()
>> 	map(str, l)
>> 	t2= time.time()
>> 	print t2-t1
>>
>>
>>>>> def test4():
>> 	l = range(10000)
>> 	t1 = time.time()
>> 	for x in l:
>> 		str(x)
>> 	t2= time.time()
>> 	print t2-t1
>>
>>
>>>>> test3()
>> 0.00300002098083
>>>>> test4()
>> 0.00399994850159
>
>
> Look ma, not building a list is still faster than building a list!
>
>
>> So can anyone explain this poor excuse for a map function? Maybe GVR
>> should have taken it out in 3.0?  *scratches head*
>
>
> So, let's do some proper tests. Using Python 2.6 on a fairly low-end
> desktop, and making sure all the alternatives do the same thing:
>
>>>> from timeit import Timer
>>>> t1 = Timer('map(f, L)', 'f = lambda x: x+1; L = range(10000)')
>>>> t2 = Timer('''accum = []
> ... for item in L:
> ...     accum.append(f(item))
> ...
> ... ''', 'f = lambda x: x+1; L = range(10000)')
>>>>
>>>> min(t1.repeat(number=1000))
> 3.5182700157165527
>>>> min(t2.repeat(number=1000))
> 6.702117919921875
>
> For the benefit of those who aren't used to timeit, the timings at the
> end are the best-of-three of repeating the test code 1000 times. The time
> per call to map is 3.5 milliseconds compared to 6.7 ms for unrolling it
> into a loop and building the list by hand. map is *much* faster.
>
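
For anyone who wants to check the per-call arithmetic, this short sketch redoes it from the two totals quoted above. The variable names are invented here; the numbers are copied from the quoted output.

```python
# Totals reported above, in seconds, for number=1000 executions each.
total_map = 3.5182700157165527
total_loop = 6.702117919921875
number = 1000

# Seconds per call, converted to milliseconds.
per_call_map_ms = total_map / number * 1000
per_call_loop_ms = total_loop / number * 1000

# How many times longer the hand-written loop takes.
ratio = total_loop / total_map

print("map:  %.1f ms per call" % per_call_map_ms)
print("loop: %.1f ms per call" % per_call_loop_ms)
print("loop takes %.1fx as long as map" % ratio)
```
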
> How does it compare to a list comprehension? The list comp can avoid a
> function call and do the addition inline, so it will probably be
> significantly faster:
>
>>>> t3 = Timer('[x+1 for x in  L]', "L = range(10000)")
>>>> min(t3.repeat(number=1000))
> 2.0786428451538086
>
> And sure enough it is. But when you can't avoid the function call, the
> advantage shifts back to map:
>
>>>> t4 = Timer('map(str, L)', "L = range(10000)")
>>>> t5 = Timer('[str(x) for x in  L]', "L = range(10000)")
>>>> min(t4.repeat(number=1000))
> 3.8360331058502197
>>>> min(t5.repeat(number=1000))
> 6.6693520545959473
>
>
>
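
One way to make sure "all the alternatives do the same thing" before timing them is to compare their results first. The following is a sketch under assumptions, not the method used in the quoted post: it targets Python 3, and with_map and with_listcomp are hypothetical wrapper names. Passing a callable to Timer adds one function call of overhead to each alternative equally. No particular outcome is claimed; results depend on the Python version and the machine.

```python
import timeit

L = list(range(10000))

# Hypothetical wrappers for the two alternatives compared above.
def with_map(seq):
    # list() is only needed on Python 3, where map is lazy.
    return list(map(str, seq))

def with_listcomp(seq):
    return [str(x) for x in seq]

# First confirm both produce the same list, then time them.
assert with_map(L) == with_listcomp(L)

results = {}
for func in (with_map, with_listcomp):
    timer = timeit.Timer(lambda: func(L))
    # Best of three runs of 100 calls each.
    results[func.__name__] = min(timer.repeat(repeat=3, number=100))

for name, seconds in sorted(results.items()):
    print("%-14s %.4f s per 100 calls" % (name, seconds))
```
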
> Lessons are:
>
> (1) If you're going to deny being a troll, avoid making inflammatory
> statements unless you can back them up.
>
> (2) Understand what you are timing, and don't compare apples to snooker
> balls just because they're both red.
>
> (3) Timing tests are hard to get right. Use timeit.
>
> (4) map is plenty fast.
>
>
> Have a nice day.
>
>