Filter versus comprehension (was Re: something about split()???)
Ramchandra Apte
maniandram01 at gmail.com
Fri Aug 24 10:44:27 EDT 2012
On Wednesday, 22 August 2012 22:13:04 UTC+5:30, Terry Reedy wrote:
> On 8/22/2012 3:30 AM, Mark Lawrence wrote:
> > On 22/08/2012 06:46, Terry Reedy wrote:
> >> On 8/21/2012 11:43 PM, mingqiang hu wrote:
> >>> why filter is bad when use lambda ?
> >>
> >> Inefficient, not 'bad'. Because the equivalent comprehension or
> >> generator expression does not require a function call.
>
> for each item in the iterable.
>
> > A case of premature optimisation? :)
>
> No, as regards my post. I simply made a factual statement without
> advocating a particular action.
>
> filter(lambda x: <expr>, iterable)
> (x for x in iterable if <expr>)
>
> both create iterators that produce the items in iterable such that
> bool(<expr>) is true. The following, with output rounded, shows
> something of the effect of the extra function call.
>
> >>> import timeit
> >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(0)")
> 0.91
> >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(20)")
> 1.28
> >>> timeit.timeit("list(filter(lambda i: False, ranger))", "ranger=range(0)")
> 0.83
> >>> timeit.timeit("list(filter(lambda i: False, ranger))", "ranger=range(20)")
> 2.60
>
> Simply keeping true items is faster with filter -- at least on my
> particular machine with 3.3.0b2.
>
> >>> timeit.timeit("list(filter(None, ranger))", "ranger=range(20)")
> 1.03
>
> Filter is also faster if the expression is a function call.
>
> >>> timeit.timeit("list(filter(f, ranger))", "ranger=range(20); f=lambda i: False")
> 2.5033614114454394
> >>> timeit.timeit("list(i for i in ranger if f(i))", "ranger=range(20); f=lambda i: False")
> 3.2394095327040304
>
> ---
> Perhaps or even yes as regards the so-called rule 'always use
> comprehension'. If one prefers filter as more readable, if one only
> wants to keep true items, if the expression is a function call, if
> evaluating the expression takes much more time than the extra function
> call so the latter does not matter, if the number of items is few enough
> that the extra time does not matter, then the rule is not needed or even
> wrong.
>
> So I think PyLint should be changed to stop its filter FUD.
>
> --
> Terry Jan Reedy
When filtering for true values, filter(None, xxx) can be used.
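A quick sketch of what that means in practice (the sample list is made up purely for illustration): passing None as the function makes filter keep the items whose truth value is true, which should match a comprehension with a bare `if x` test.

```python
# Hypothetical sample data, chosen only to mix truthy and falsy items.
values = [0, 1, "", "a", None, [], [0], 2.5]

# filter(None, ...) keeps every item whose truth value is True.
kept_by_filter = list(filter(None, values))

# The comprehension spelling of the same test.
kept_by_comprehension = [x for x in values if x]

print(kept_by_filter)  # [1, 'a', [0], 2.5]
```

Note that [0] survives because a non-empty list is truthy, whatever it contains.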
Your examples with lambda i: False are unrealistic: you are comparing a constant `if False` test against a call to <lambda function>(xx), that is, a bare boolean check versus a function call.
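Something closer to a fair comparison would give both forms the same non-trivial test. The sketch below is only an assumption about what such a benchmark could look like: the predicate `i % 3 == 0`, the `candidates` dict and the `measure` helper are my own illustrative choices, not anything from the earlier posts, and the timings will differ between machines and Python versions.

```python
import timeit

# Same setup as in the quoted timings: a 20-item range.
setup = "ranger = range(20)"

# Both statements evaluate the same (hypothetical) predicate, i % 3 == 0.
# The only difference is whether it runs inline or inside a lambda call.
candidates = {
    "generator expression": "list(i for i in ranger if i % 3 == 0)",
    "filter with lambda": "list(filter(lambda i: i % 3 == 0, ranger))",
}


def measure(number=100_000):
    """Return the total seconds taken by `number` runs of each statement."""
    return {
        label: timeit.timeit(stmt, setup, number=number)
        for label, stmt in candidates.items()
    }


for label, seconds in measure().items():
    print(f"{label}: {seconds:.3f} s")
```

With a real expression on both sides, any remaining gap between the two numbers is closer to the cost of the extra function call itself.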